Phase D part 2: bump Nx/EXLA/Bumblebee, accept :emlx; v0.2.0
This commit picks up the Nx-side fix that closes the Apple Silicon
SVD-OOM blocker upstream and bumps the surrounding dep stack
accordingly. Validated end-to-end on CUDA; validated end-to-end on
Apple by polvalente (Nx core) prior to this commit.
Deps:
- nx pinned to GitHub elixir-nx/nx@6424c89 (post-v0.12.0 main). It
carries elixir-nx/nx#1753: better memory footprint for thin SVD.
Both EMLX and EXLA benefit; the Apple OOM on the 151,936 x 1024
embedder is gone.
- exla pinned to the same Nx repo at the same commit (sparse: "exla",
v0.12.0).
- bumblebee bumped from
github.com/elixir-nx/bumblebee@0fd8114 (pre-v0.7.0) to
github.com/elixir-nx/bumblebee@d0774e8 (post-v0.7.0 main; required
for Nx 0.12 compat).
- xla 0.10.x is the resolved version (was 0.9.x). cuda13 is newly
accepted by the XLA preflight; cuda12 remains recommended default.
- EMLX is deliberately NOT in our deps. optional: true does not
prevent Mix from starting it on Linux/CUDA hosts where its Metal/MLX
NIF cannot load. Apple users add {:emlx, "~> 0.3"} to their parent
app; the :emlx runtime profile resolves the backend at runtime via
Code.ensure_loaded?/1.
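
A minimal sketch of the runtime resolution described in the EMLX bullet, assuming a hypothetical module and function name (RuntimeProfileSketch.resolve_backend/1) and hypothetical error tuples; the project's actual resolver is not shown in this commit. The only library call used is Code.ensure_loaded?/1, which returns false when the module is not available, so a host without EMLX gets an error tuple instead of a failed NIF load at boot.

```elixir
defmodule RuntimeProfileSketch do
  @moduledoc false

  # Hypothetical helper: map a runtime profile to an Nx backend module,
  # checking at runtime that the optional dependency is actually present.
  @spec resolve_backend(atom()) :: {:ok, module()} | {:error, term()}
  def resolve_backend(:emlx), do: ensure_backend(EMLX.Backend, :emlx)
  def resolve_backend(:exla), do: ensure_backend(EXLA.Backend, :exla)
  def resolve_backend(other), do: {:error, {:unknown_profile, other}}

  # Referencing EMLX.Backend here is only an atom; nothing is loaded
  # until Code.ensure_loaded?/1 asks the code server for the module.
  defp ensure_backend(module, dep) do
    if Code.ensure_loaded?(module) do
      {:ok, module}
    else
      {:error, {:missing_dependency, dep}}
    end
  end
end
```

On an Apple host whose parent app lists {:emlx, "~> 0.3"}, resolve_backend(:emlx) would return {:ok, EMLX.Backend}; on a Linux/CUDA host without it, the same call returns {:error, {:missing_dependency, :emlx}}.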
Documentation:
- README, guides/onboarding.md, docs/production_qwen_slm_profile.md
updated with the new resolved dep versions.
- guides/troubleshooting.md 'XLA_TARGET=cuda13' section rewritten:
cuda13 is now accepted; the rejection example uses cuda14.
- guides/troubleshooting.md 'EMLX OOM on the embedder SVD' section
rewritten to credit both fixes (EMLX 0.3.0 thin-SVD fall-through +
Nx PR #1753 default-impl refactor).
- guides/runtime_profiles.md EMLX caveats section updated to credit
PR #1753 and note that polvalente confirmed end-to-end Apple
validation (37/37 prompt eval pass) without EMLXAxon rewrites.
- docs/bumblebee_unpin_playbook.md updated to reflect the new ref.
Tests:
- test/build_support/xla_target_validator_test.exs updated for the
cuda13-now-accepted reality and the new bundled xla 0.10.x message.
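
The preflight behaviour these tests cover can be sketched as follows. The module name, the accepted-target list, and the error wording are all assumptions made for illustration; the real validator and its message about the bundled xla 0.10.x are not reproduced in this commit. The sketch only shows the shape of the change: cuda13 passes, and an unknown target such as cuda14 is rejected.

```elixir
defmodule XlaTargetSketch do
  @moduledoc false

  # Assumed list of accepted targets; the project's real list may differ.
  @accepted ~w(cpu cuda12 cuda13)

  # Hypothetical preflight check for the XLA_TARGET value.
  @spec validate(String.t()) :: :ok | {:error, String.t()}
  def validate(target) when target in @accepted, do: :ok

  def validate(target) when is_binary(target) do
    {:error,
     "XLA_TARGET=#{target} is not supported; expected one of: " <>
       Enum.join(@accepted, ", ")}
  end
end
```

With this sketch, XlaTargetSketch.validate("cuda13") returns :ok, while XlaTargetSketch.validate("cuda14") returns an error tuple naming the rejected target.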
Version:
- mix.exs @version 0.1.0 -> 0.2.0.
- CHANGELOG: new 0.2.0 entry summarising Phase B, C, D.
Gates (all green on CUDA):
- mix format
- mix compile --warnings-as-errors
- mix test: 262 tests, 0 failures (was 261; +1 'accepts cuda13' case)
- mix credo --strict: 0 issues
- mix dialyzer: 0 errors
- mix docs --warnings-as-errors clean
- XLA_TARGET=cuda12 mix run examples/qwen_router_prompt_eval.exs
--snapshot examples/fixtures/qwen_router_prompt_eval_logits.json
--determinism-runs 2 -> 37/37 PASS