
Commit 017643e

Phase D part 2: bump Nx/EXLA/Bumblebee, accept :emlx; v0.2.0
This commit picks up the Nx-side fix that closes the Apple Silicon SVD-OOM blocker upstream and bumps the surrounding dep stack accordingly. Validated end-to-end on CUDA; validated end-to-end on Apple by polvalente (Nx core) prior to this commit.

Deps:
- nx pinned to GitHub elixir-nx/nx@6424c89 (post-v0.12.0 main, carries elixir-nx/nx#1753: better memory footprint for thin SVD; both EMLX and EXLA benefit; the Apple OOM on the 151,936 x 1024 embedder is gone).
- exla pinned to the same Nx repo at the same commit (sparse: "exla", v0.12.0).
- bumblebee bumped from github.com/elixir-nx/bumblebee@0fd8114 (pre-v0.7.0) to github.com/elixir-nx/bumblebee@d0774e8 (post-v0.7.0 main; required for Nx 0.12 compat).
- xla 0.10.x is the resolved version (was 0.9.x). cuda13 is newly accepted by the XLA preflight; cuda12 remains the recommended default.
- EMLX is deliberately NOT in our deps. optional: true does not prevent Mix from starting it on Linux/CUDA hosts where its Metal/MLX NIF cannot load. Apple users add {:emlx, "~> 0.3"} to their parent app; the :emlx runtime profile resolves the backend at runtime via Code.ensure_loaded?/1.

Documentation:
- README, guides/onboarding.md, docs/production_qwen_slm_profile.md updated with the new resolved dep versions.
- guides/troubleshooting.md 'XLA_TARGET=cuda13' section rewritten: cuda13 is now accepted; the rejection example uses cuda14.
- guides/troubleshooting.md 'EMLX OOM on the embedder SVD' section rewritten to credit both fixes (EMLX 0.3.0 thin-SVD fall-through + Nx PR #1753 default-impl refactor).
- guides/runtime_profiles.md EMLX caveats section updated to credit PR #1753 and note that polvalente confirmed end-to-end Apple validation (37/37 prompt eval pass) without EMLXAxon rewrites.
- docs/bumblebee_unpin_playbook.md updated to reflect the new ref.

Tests:
- test/build_support/xla_target_validator_test.exs updated for the cuda13-now-accepted reality and the new bundled xla 0.10.x message.

Version:
- mix.exs @version 0.1.0 -> 0.2.0.
- CHANGELOG: new 0.2.0 entry summarising Phase B, C, D.

Gates (all green on CUDA):
- mix format
- mix compile --warnings-as-errors
- mix test: 262 tests, 0 failures (was 261; +1 'accepts cuda13' case)
- mix credo --strict: 0 issues
- mix dialyzer: 0 errors
- mix docs --warnings-as-errors clean
- XLA_TARGET=cuda12 mix run examples/qwen_router_prompt_eval.exs --snapshot examples/fixtures/qwen_router_prompt_eval_logits.json --determinism-runs 2 -> 37/37 PASS
1 parent ec53de8 commit 017643e

11 files changed

Lines changed: 237 additions & 92 deletions

CHANGELOG.md

Lines changed: 79 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,84 @@
11
# Changelog
22

3+
## 0.2.0 — 2026-05-21
4+
5+
### Highlights
6+
7+
This release ships profile-driven backend selection (the foundation for
8+
Apple Silicon / EMLX support), the canonical `mix trinity.artifact.fetch`
9+
onboarding command, and a dep-stack bump that picks up
10+
[Nx PR #1753](https://github.com/elixir-nx/nx/pull/1753) (better memory
11+
footprint for thin SVD; the Apple-side blocker is now fixed upstream).
12+
13+
### Added
14+
15+
- `mix trinity.artifact.fetch` task and `TrinityCoordinator.ArtifactFetch`
16+
module. Downloads the adapted-Qwen3 bundle from a HuggingFace dataset
17+
repo with per-file SHA-256 verification. The pin file at
18+
`priv/sakana_trinity/artifact_pin.json` is committed to the repo so a
19+
fresh clone knows what to fetch before touching the network.
20+
- `TrinityCoordinator.RuntimeProfile.put_default_backend!/1` — single
21+
bottleneck through which all backend bootstrapping flows. Profiles
22+
whose backend module is not loaded raise a single informative error
23+
naming the missing dep.
24+
- `TrinityCoordinator.RuntimeProfile.accepts_backend_label?/2` — used
25+
by the exporter's per-tensor validation. Generalises the previous
26+
CUDA-only check.
27+
- Real `:emlx` runtime profile. `nx_backend == {EMLX.Backend, device: :gpu}`.
28+
Resolves to a working profile struct; users on Apple Silicon add
29+
`{:emlx, "~> 0.3"}` to their parent app and pass
30+
`--runtime-profile emlx` to the relevant Mix tasks / examples.
31+
- `--runtime-profile NAME` flag on `mix trinity.sakana.export_adapted`,
32+
`mix trinity.sakana.router_trace`, and
33+
`examples/qwen_router_prompt_eval.exs`. Default `cuda_exla` for
34+
back-compat.
35+
- New guides: `guides/runtime_profiles.md`, `guides/artifact_distribution.md`.
36+
- Three new troubleshooting sections in `guides/troubleshooting.md`:
37+
artifact-fetch failure modes, EMLX dep missing, EMLX OOM on the
38+
embedder SVD (with the two upstream fixes).
39+
- `hf_hub ~> 0.2` dep (already on Hex).
40+
41+
### Changed
42+
43+
- **Dep stack bump.** `nx` pinned to GitHub
44+
`elixir-nx/nx@6424c8902380380cd7a8c282b0557d653aead018` (post-v0.12.0
45+
main, carries PR #1753 thin-SVD memory fix). `exla` pinned to the
46+
same commit (sparse: "exla"). `bumblebee` pinned to
47+
`elixir-nx/bumblebee@d0774e8ab8c4d5ac60ade95ec8dc9e1f0efd7306`
48+
(post-v0.7.0 main).
49+
- **`xla` 0.10.x** is now the bundled version (was 0.9.x). `cuda13` is
50+
newly accepted by the XLA preflight; `cuda12` remains the recommended
51+
default.
52+
- `SLMProfile.qwen_coordinator/0` and `SLMProfile.qwen_sakana_adapted/0`
53+
`load_options` no longer bake in `backend: {EXLA.Backend, client: :cuda}`.
54+
They carry only `type: :bf16`; `Coordinator.load/1` injects the
55+
runtime profile's backend at load time.
56+
- `Sakana.Exporter.ensure_cuda_backend/2` →
57+
`ensure_export_backend/3`. Threads runtime profile through
58+
`export_tensor/5` so per-tensor backend validation matches the profile
59+
under which the export ran.
60+
- `Sakana.Exporter.load_profile/1` → `load_profile/2`. Uses
61+
`RuntimeProfile.put_default_backend!/1` instead of CUDA-hard-coded
62+
`Runtime.put_cuda_backend!/0`.
63+
- README `Model And Artifact Setup` rewritten to lead with
64+
`mix trinity.artifact.fetch` instead of "use a blessed artifact bundle".
65+
66+
### Fixed
67+
68+
- Apple Silicon export OOM on the Qwen3-0.6B embedder
69+
(151,936 × 1024 → ~92 GB U). Upstream fix in Nx PR
70+
#1753 (now in our pin). Validated end-to-end on Apple by Paulo
71+
Valente (polvalente, Nx core team) on 2026-05-21: full export +
72+
37/37 prompt eval pass.
73+
74+
### Notes
75+
76+
- `{:emlx, "~> 0.3"}` is deliberately NOT in `mix.exs`. Marking it
77+
`optional: true` would still fetch and start it on Linux/CUDA hosts
78+
where its Metal/MLX NIF cannot load. Apple users add the dep to
79+
their parent app; the `:emlx` runtime profile resolves the backend
80+
at runtime via `Code.ensure_loaded?/1`.
81+
382
## 2026-05-21
483

584
### Added
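
The per-file SHA-256 verification that the 0.2.0 entry describes for `mix trinity.artifact.fetch` can be sketched roughly as below. This is a minimal illustration only: the module name `PinCheckSketch`, its function names, and the idea of passing the expected digest as a hex string are assumptions. The real `TrinityCoordinator.ArtifactFetch` API and the layout of `artifact_pin.json` are not shown in this commit.

```elixir
defmodule PinCheckSketch do
  @moduledoc "Hypothetical sketch: compare a file's SHA-256 with a pinned digest."

  # Lowercase hex-encoded SHA-256 of a binary.
  def sha256_hex(binary) when is_binary(binary) do
    :crypto.hash(:sha256, binary) |> Base.encode16(case: :lower)
  end

  # Returns :ok when the file at `path` hashes to `expected_hex`,
  # {:error, {:sha256_mismatch, actual_hex}} when it does not, and
  # passes through {:error, reason} from File.read/1 if the read fails.
  def verify_file(path, expected_hex) do
    with {:ok, contents} <- File.read(path) do
      actual = sha256_hex(contents)

      if actual == String.downcase(expected_hex) do
        :ok
      else
        {:error, {:sha256_mismatch, actual}}
      end
    end
  end
end
```

Reading the whole file into memory keeps the sketch short; a real fetcher handling multi-gigabyte safetensors files would more likely hash in chunks.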

README.md

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -269,11 +269,13 @@ workspaces, but they are not required for fresh-clone onboarding.
269269

270270
Resolved core dependency lane:
271271

272-
- `nx 0.10.0`
273-
- `exla 0.10.0`
272+
- `nx 0.12.0` (pinned to GitHub main commit
273+
`6424c8902380380cd7a8c282b0557d653aead018` for
274+
[PR #1753 thin SVD memory fix](https://github.com/elixir-nx/nx/pull/1753))
275+
- `exla 0.12.0` (pinned to the same commit)
274276
- `axon 0.7.0`
275-
- `bumblebee` pinned to `elixir-nx/bumblebee`
276-
`0fd8114cf5429af9236f100f3350986e9d823c02`
277+
- `bumblebee` pinned to `elixir-nx/bumblebee` commit
278+
`d0774e8ab8c4d5ac60ade95ec8dc9e1f0efd7306` (post-v0.7.0 main)
277279

278280
## Quick Verification
279281

build_support/xla_target_validator.exs

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -16,13 +16,13 @@ defmodule XlaTargetValidator do
1616
`mix trinity.env.check`).
1717
1818
The recognised target list is intentionally kept in lock-step with the
19-
bundled `xla` version. As of `xla 0.9.x`, the supported set is
20-
`cpu`, `cuda`, `cuda12`, `rocm`, `tpu`. The newer `xla 0.10.x` adds
21-
`cuda13`; that bump is tracked separately (see
22-
`docs/bumblebee_unpin_playbook.md`).
19+
bundled `xla` version. As of `xla 0.10.x` (used by EXLA 0.12+), the
20+
supported set is `cpu`, `cuda`, `cuda12`, `cuda13`, `rocm`, `tpu`.
21+
`cuda12` remains the recommended default for CUDA hosts; `cuda13` is
22+
newly accepted (the previous bundled `xla 0.9.x` rejected it).
2323
"""
2424

25-
@supported_xla_targets ["cpu", "cuda", "cuda12", "rocm", "tpu"]
25+
@supported_xla_targets ["cpu", "cuda", "cuda12", "cuda13", "rocm", "tpu"]
2626
@recommended "cuda12"
2727

2828
@doc "Validates `XLA_TARGET`. Returns `:ok` or raises a `Mix.Error`."
@@ -77,7 +77,7 @@ defmodule XlaTargetValidator do
7777
accepted = Enum.map_join(@supported_xla_targets, ", ", &inspect/1)
7878

7979
Mix.raise(
80-
"XLA_TARGET=#{inspect(value)} is not accepted by the bundled xla 0.9.x. " <>
80+
"XLA_TARGET=#{inspect(value)} is not accepted by the bundled xla 0.10.x. " <>
8181
"Accepted values: #{accepted}. " <>
8282
"Recommended for CUDA hosts: export XLA_TARGET=#{@recommended}. " <>
8383
"Recommended for CPU hosts: unset XLA_TARGET (or use cpu). " <>
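
The membership check behind this preflight can be sketched without Mix as follows. The accepted list and the message wording are copied from the hunks above; the module name `XlaTargetSketch`, the `check/1` function, and the choice to return a tuple instead of calling `Mix.raise/1` are assumptions made so the example runs as a plain script.

```elixir
defmodule XlaTargetSketch do
  @moduledoc "Hypothetical, Mix-free sketch of the XLA_TARGET preflight check."

  # Accepted targets for the bundled xla 0.10.x, as listed in the diff.
  @supported ["cpu", "cuda", "cuda12", "cuda13", "rocm", "tpu"]
  @recommended "cuda12"

  # An unset variable passes: the validator's own message recommends
  # leaving XLA_TARGET unset on CPU hosts.
  def check(nil), do: :ok

  def check(value) when is_binary(value) do
    if value in @supported do
      :ok
    else
      accepted = Enum.map_join(@supported, ", ", &inspect/1)

      {:error,
       "XLA_TARGET=#{inspect(value)} is not accepted by the bundled xla 0.10.x. " <>
         "Accepted values: #{accepted}. " <>
         "Recommended for CUDA hosts: export XLA_TARGET=#{@recommended}."}
    end
  end
end
```

A caller would pass `System.get_env("XLA_TARGET")`; with the 0.10.x list, `"cuda13"` now returns `:ok` while `"cuda14"` yields the error tuple.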

docs/bumblebee_unpin_playbook.md

Lines changed: 6 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,13 @@
11
# Bumblebee Unpin Playbook
22

33
This playbook is the 15-minute job to take when a Bumblebee Hex release
4-
lands that includes Qwen3 support at or after commit
5-
`0fd8114cf5429af9236f100f3350986e9d823c02`.
4+
lands that includes Qwen3 support at or after Bumblebee v0.7.0.
65

7-
Until then, `mix.exs` pins Bumblebee to that commit on
8-
`elixir-nx/bumblebee`, and `mix trinity.gates --include-hex-build` treats
9-
the `hex_build_advisory` step as non-blocking by design.
6+
As of 2026-05-21, `mix.exs` pins Bumblebee to commit
7+
`d0774e8ab8c4d5ac60ade95ec8dc9e1f0efd7306` on `elixir-nx/bumblebee`
8+
(post-v0.7.0 main; carries Qwen3 + Nx 0.12 compat needed for EMLX
9+
support). `mix trinity.gates --include-hex-build` treats the
10+
`hex_build_advisory` step as non-blocking by design.
1011

1112
This playbook re-promotes that gate to blocking once Bumblebee is
1213
unpinned.

docs/production_qwen_slm_profile.md

Lines changed: 17 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -86,33 +86,38 @@ model = TrinityCoordinator.CoordinationHead.build_model(hidden_size, num_agents,
8686

8787
The checked-in dependency lane is:
8888

89-
- `bumblebee` pinned to upstream `elixir-nx/bumblebee`
90-
`0fd8114cf5429af9236f100f3350986e9d823c02`
89+
- `nx` pinned to GitHub `elixir-nx/nx`
90+
`6424c8902380380cd7a8c282b0557d653aead018` (post-v0.12.0 main,
91+
carries [PR #1753](https://github.com/elixir-nx/nx/pull/1753) thin-SVD
92+
memory-footprint fix). When Nx 0.13 lands on Hex, the pin moves to
93+
`{:nx, "~> 0.13"}`.
94+
- `exla` pinned to the same Nx repo at the same commit (sparse: "exla").
9195
- `axon ~> 0.7`
92-
- `nx ~> 0.9`
93-
- `exla ~> 0.9`
96+
- `bumblebee` pinned to upstream `elixir-nx/bumblebee`
97+
`d0774e8ab8c4d5ac60ade95ec8dc9e1f0efd7306` (post-v0.7.0 main).
9498

95-
On this host, that lane is verified with `XLA_TARGET=cuda12`. Hex
96-
`bumblebee 0.6.3` does not ship Qwen3, so this repo pins the upstream Bumblebee
97-
commit that includes `Bumblebee.Text.Qwen3` and its Hugging Face parameter
98-
mapping.
99+
On this host, that lane is verified with `XLA_TARGET=cuda12`. Bumblebee
100+
v0.7.0 ships Qwen3 via Hex; the post-main pin picks up minor fixes
101+
landed after the release.
99102

100103
### `qwen_cuda_ready` outcome
101104

102105
Current resolved versions used for this outcome:
103106

104-
- `bumblebee` git ref `0fd8114cf5429af9236f100f3350986e9d823c02`
107+
- `nx 0.12.0` (GitHub commit above)
108+
- `exla 0.12.0` (GitHub commit above)
105109
- `axon 0.7.0`
106-
- `nx 0.10.0`
107-
- `exla 0.10.0`
110+
- `bumblebee` git ref `d0774e8ab8c4d5ac60ade95ec8dc9e1f0efd7306`
108111

109112
Outcome: `qwen_cuda_ready` is active for base Qwen hidden-state extraction.
110113
`SLMProfile.qwen_coordinator/0` uses:
111114

112115
- repo: `{:hf, "Qwen/Qwen3-0.6B"}`
113116
- module: `Bumblebee.Text.Qwen3`
114117
- architecture: `:for_causal_language_modeling`
115-
- load options: `backend: {EXLA.Backend, client: :cuda}`, `type: :bf16`
118+
- load options: `type: :bf16` (the backend is injected at load time
119+
by `Coordinator.load/1` based on the active `RuntimeProfile`;
120+
see `guides/runtime_profiles.md`).
116121
- expected hidden size: `1024`
117122

118123
Hidden states are enabled at prediction time with Axon's global layer option

guides/onboarding.md

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -65,11 +65,13 @@ The current development lane assumes:
6565

6666
The resolved Elixir dependency lane currently uses:
6767

68-
- `nx 0.10.0`
69-
- `exla 0.10.0`
68+
- `nx 0.12.0` (pinned to GitHub main commit
69+
`6424c8902380380cd7a8c282b0557d653aead018` for
70+
[PR #1753 thin SVD memory fix](https://github.com/elixir-nx/nx/pull/1753))
71+
- `exla 0.12.0` (pinned to the same commit)
7072
- `axon 0.7.0`
71-
- `bumblebee` pinned to `elixir-nx/bumblebee` ref
72-
`0fd8114cf5429af9236f100f3350986e9d823c02`
73+
- `bumblebee` pinned to `elixir-nx/bumblebee` commit
74+
`d0774e8ab8c4d5ac60ade95ec8dc9e1f0efd7306` (post-v0.7.0 main)
7375

7476
## First Commands
7577

guides/runtime_profiles.md

Lines changed: 16 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -57,21 +57,27 @@ mix run examples/qwen_router_prompt_eval.exs --runtime-profile emlx \
5757

5858
#### EMLX-specific Caveats
5959

60-
- **Thin SVD.** EMLX v0.3.0 routes `Nx.LinAlg.svd/2` with
61-
`full_matrices?: false` through Nx's default implementation, which
62-
avoids materialising the full `m × m` U on the Qwen3-0.6B embedder
63-
(where `m = 151_936`, i.e. ~92 GB of U). The default path uses
64-
`eigh`; for the small-σ tail to stay precise, pass
65-
`--svd-compute-type f32` to `mix trinity.sakana.export_adapted` on
66-
Apple.
60+
- **Thin SVD memory footprint.** Nx main as of commit `6424c89` (Paulo
61+
Valente, [PR #1753](https://github.com/elixir-nx/nx/pull/1753))
62+
refactored `Nx.LinAlg.svd/2` with `full_matrices?: false` so it does
63+
not materialise the full `m × m` U on the Qwen3-0.6B embedder
64+
(where `m = 151_936`, i.e. ~92 GB of U under the old path).
65+
This fix is in the Nx version that `trinity_coordinator` pins to.
66+
Both EMLX and EXLA benefit from this change.
67+
- **`--svd-compute-type f32`.** Recommended on Apple. The thin-SVD
68+
path uses an `eigh` decomposition under the hood; doing that work
69+
in f32 keeps the small-σ tail precise.
6770
- **Backend label.** When the exporter validates per-tensor backend
6871
during the SVD reconstruction step, it accepts the
6972
`"EMLX.Backend"` label as well as `"EXLA.Backend<cuda:"`. No code
7073
changes needed for the user.
7174
- **Bumblebee Qwen3 support.** Bumblebee is git-pinned to a Qwen3-
72-
supporting commit (`mix.exs`). EMLXAxon
73-
(`https://github.com/elixir-nx/emlx`) has independently validated
74-
Qwen3-0.6B loading through the EMLX backend.
75+
supporting commit (post-v0.7.0 main). EMLXAxon
76+
([github.com/elixir-nx/emlx](https://github.com/elixir-nx/emlx)) has
77+
independently validated Qwen3-0.6B loading through the EMLX backend.
78+
Paulo Valente confirmed on 2026-05-21 that running with the bare
79+
EMLX backend (no `EMLXAxon.rewrite/1`) successfully exports and
80+
passes 37/37 on the prompt eval.
7581
- **bf16 round-trip.** The bundle is bf16 safetensors. EMLX accepts
7682
bf16 natively (`{:bf, 16}` ↔ MLX `bfloat16`). No quantisation or
7783
type cast required.
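
The backend-label caveat above suggests a prefix match on the tensor's backend label. A minimal sketch, assuming the two label strings quoted in the guide are matched as prefixes and keyed by profile name: the module `BackendLabelSketch` and the `:cuda_exla` / `:emlx` keys are illustrative, and the real `RuntimeProfile.accepts_backend_label?/2` may be structured differently.

```elixir
defmodule BackendLabelSketch do
  @moduledoc "Hypothetical sketch of profile-aware backend label matching."

  # Label prefixes per profile. The two strings come from the guide text;
  # keying them by profile atom is an assumption.
  @prefixes %{
    cuda_exla: ["EXLA.Backend<cuda:"],
    emlx: ["EMLX.Backend"]
  }

  # True when `label` starts with one of the prefixes registered for
  # `profile`; unknown profiles accept nothing.
  def accepts_backend_label?(profile, label)
      when is_atom(profile) and is_binary(label) do
    @prefixes
    |> Map.get(profile, [])
    |> Enum.any?(&String.starts_with?(label, &1))
  end
end
```

Keeping the accepted prefixes in one table means a new profile adds a map entry, and it avoids the per-backend conditionals that the earlier CUDA-only check implied.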

guides/troubleshooting.md

Lines changed: 38 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -166,29 +166,37 @@ If CUDA is missing, verify:
166166
- EXLA dependency target;
167167
- environment isolation, especially shells launched without CUDA env vars.
168168

169-
## XLA_TARGET=cuda13 Is Rejected At Compile Time
169+
## XLA_TARGET Rejected At Compile Time
170170

171-
`xla 0.9.1` does not accept `cuda13`. If `mix deps.compile xla` reports
172-
an unsupported target, set:
171+
`xla 0.10.x` (which EXLA 0.12+ uses) accepts:
172+
173+
```text
174+
cpu, cuda, cuda12, cuda13, rocm, tpu
175+
```
176+
177+
Anything else is rejected at compile time. The most common error mode
178+
is a stale shell export like `XLA_TARGET=cuda14`. Set:
173179

174180
```bash
175181
export XLA_TARGET=cuda12
176182
```
177183

178-
This applies even on hosts whose installed CUDA toolkit is 13.x. The
179-
`XLA_TARGET` controls which prebuilt XLA artifact is fetched; mismatched
180-
host CUDA installations are tolerated by EXLA via dynamic loading.
184+
`cuda12` is the canonical recommended default for CUDA hosts even when
185+
the host installed toolkit is 13.x — the `XLA_TARGET` controls which
186+
prebuilt XLA artifact is fetched; mismatched host CUDA installations
187+
are tolerated by EXLA via dynamic loading. Use `cuda13` when you
188+
specifically want the cuda13 prebuilt.
181189

182190
### Automatic preflight
183191

184-
As of 2026-05-21, the project surfaces this automatically via a Mix
192+
The project surfaces unsupported targets automatically via a Mix
185193
preflight that runs from `mix.exs` before any compilation step. An
186-
operator whose shell exports `XLA_TARGET=cuda13` will see a single
187-
readable line instead of an EXLA stacktrace:
194+
operator whose shell exports an unsupported `XLA_TARGET` will see a
195+
single readable line instead of an EXLA stacktrace:
188196

189197
```text
190-
** (Mix.Error) XLA_TARGET="cuda13" is not accepted by the bundled xla 0.9.x.
191-
Accepted values: "cpu", "cuda", "cuda12", "rocm", "tpu".
198+
** (Mix.Error) XLA_TARGET="cuda14" is not accepted by the bundled xla 0.10.x.
199+
Accepted values: "cpu", "cuda", "cuda12", "cuda13", "rocm", "tpu".
192200
Recommended for CUDA hosts: export XLA_TARGET=cuda12.
193201
Recommended for CPU hosts: unset XLA_TARGET (or use cpu).
194202
The bundled xla rejects unrecognised targets at compile time, so EXLA
@@ -295,9 +303,22 @@ precise.
295303

296304
### EMLX OOM on the embedder SVD
297305

298-
If your EMLX version is older than v0.3.0, the native SVD path
299-
materialises the full `m × m` U matrix on the Qwen3-0.6B embedder
300-
(`m = 151_936`, ~92 GB). Upgrade to `{:emlx, "~> 0.3"}` — Paulo
301-
Valente's commit `3482b79` ("fix: use nx-defined implementation for
302-
non-full svd computation") routes `full_matrices?: false` through
303-
Nx's default path and keeps the work at `min(m, n)² = 1024²`.
306+
The Qwen3-0.6B embedder is `151_936 × 1024`. Before two fixes
307+
landed, the SVD of this matrix tried to materialise a full `m × m`
308+
U matrix — about 92 GB. The fix landed in two places:
309+
310+
1. **EMLX v0.3.0** routed `Nx.LinAlg.svd/2` with `full_matrices?: false`
311+
through Nx's default implementation instead of MLX's native SVD
312+
(which always allocates the full U). Commit `3482b79`, Paulo
313+
Valente, "fix: use nx-defined implementation for non-full svd
314+
computation".
315+
2. **Nx main commit `6424c89`**
316+
([PR #1753](https://github.com/elixir-nx/nx/pull/1753)) refactored
317+
the default thin-SVD path itself to keep the working set bounded
318+
by `min(m, n)²`. Both EMLX and EXLA benefit from this fix.
319+
320+
`trinity_coordinator` pins the post-#1753 Nx (see `mix.exs`), so a
321+
user who runs `mix trinity.sakana.export_adapted --runtime-profile emlx`
322+
on Apple Silicon with `{:emlx, "~> 0.3"}` in their parent app gets
323+
the bounded-memory path automatically. If you see an OOM, confirm
324+
your Nx version: it should be 0.12.x or later.
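
The sizes quoted in this section can be checked with a little arithmetic. The sketch below assumes 4 bytes per element (f32). The guide does not state the dtype behind the 92 GB figure, but f32 reproduces it. The module name `SvdFootprintSketch` is illustrative.

```elixir
defmodule SvdFootprintSketch do
  @moduledoc "Back-of-envelope byte counts for the embedder SVD (assumes f32)."

  @bytes_f32 4

  # Full U is m x m.
  def full_u_bytes(m), do: m * m * @bytes_f32

  # Thin U is m x min(m, n).
  def thin_u_bytes(m, n), do: m * min(m, n) * @bytes_f32

  # The min(m, n)^2 square that the section gives as the working-set bound.
  def square_bytes(m, n), do: min(m, n) * min(m, n) * @bytes_f32
end
```

For `m = 151_936` and `n = 1024`, `full_u_bytes/1` gives 92,338,192,384 bytes (about 92.3 GB), `thin_u_bytes/2` gives 622,329,856 bytes (about 0.62 GB), and `square_bytes/2` gives 4,194,304 bytes (4 MiB).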

mix.exs

Lines changed: 32 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -31,7 +31,7 @@ XlaTargetValidator.validate_root_project!(__DIR__)
3131
defmodule TrinityCoordinator.MixProject do
3232
use Mix.Project
3333

34-
@version "0.1.0"
34+
@version "0.2.0"
3535

3636
def project do
3737
[
@@ -79,16 +79,41 @@ defmodule TrinityCoordinator.MixProject do
7979
[
8080
# {:dep_from_hexpm, "~> 0.3.0"},
8181
# {:dep_from_git, git: "https://github.com/elixir-lang/my_dep.git", tag: "0.1.0"}
82-
{:nx, "~> 0.9"},
82+
# Nx is pinned to GitHub main to pick up
83+
# https://github.com/elixir-nx/nx/pull/1753 (refactor: better memory
84+
# footprint for thin svd, polvalente, 2026-05). The thin-SVD path
85+
# avoids materialising the full m×m U matrix on the Qwen3-0.6B
86+
# embedder (m = 151,936) regardless of backend, which is what
87+
# makes the Apple/EMLX export viable without OOM. CUDA also
88+
# benefits (smaller working set during the embedder factorisation).
89+
# Pin moves to {:nx, "~> 0.13"} once Nx 0.13 is on Hex.
90+
{:nx,
91+
github: "elixir-nx/nx",
92+
sparse: "nx",
93+
ref: "6424c8902380380cd7a8c282b0557d653aead018",
94+
override: true},
95+
# EXLA pulled from the same Nx repo so the in-tree :nx version
96+
# matches what EXLA expects (both at 0.12 + thin-SVD PR).
97+
{:exla,
98+
github: "elixir-nx/nx",
99+
sparse: "exla",
100+
ref: "6424c8902380380cd7a8c282b0557d653aead018",
101+
override: true},
83102
{:axon, "~> 0.7"},
84-
# Pinned to a Qwen3-supporting commit until a Bumblebee Hex release
85-
# lands that includes it. To unpin, follow
86-
# docs/bumblebee_unpin_playbook.md.
103+
# Bumblebee main (post-v0.7.0). Qwen3 is on Hex via v0.7.0 but
104+
# main has additional fixes; once Hex 0.8 lands, switch to
105+
# {:bumblebee, "~> 0.8"} per docs/bumblebee_unpin_playbook.md.
87106
{:bumblebee,
88107
github: "elixir-nx/bumblebee",
89-
ref: "0fd8114cf5429af9236f100f3350986e9d823c02",
108+
ref: "d0774e8ab8c4d5ac60ade95ec8dc9e1f0efd7306",
90109
override: true},
91-
{:exla, "~> 0.9"},
110+
# NOTE: EMLX is deliberately NOT listed here. Marking it
111+
# optional: true would still cause Mix to fetch and start EMLX on
112+
# any host (incl. Linux/CUDA), whose Metal/MLX NIF cannot load.
113+
# Apple Silicon users add {:emlx, "~> 0.3"} to their own
114+
# application's deps; the :emlx runtime profile then resolves to
115+
# the EMLX.Backend at runtime via Code.ensure_loaded?/1. See
116+
# guides/runtime_profiles.md.
92117
DependencySources.dep(:inference, __DIR__),
93118
DependencySources.dep(:agent_session_manager, __DIR__),
94119
DependencySources.dep(:gemini_cli_sdk, __DIR__),
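
The NOTE in this hunk says the `:emlx` profile resolves its backend at runtime via `Code.ensure_loaded?/1`. A minimal sketch of that pattern follows. The module `BackendResolveSketch`, its `resolve/3` signature, and the error wording are assumptions for illustration, not the project's `RuntimeProfile` implementation.

```elixir
defmodule BackendResolveSketch do
  @moduledoc "Hypothetical sketch of resolving an optional backend module at runtime."

  # Returns {:ok, {module, opts}} when `module` can be loaded, otherwise
  # {:error, message} naming the dependency the caller should add.
  def resolve(module, opts, dep_hint) when is_atom(module) and is_list(opts) do
    if Code.ensure_loaded?(module) do
      {:ok, {module, opts}}
    else
      {:error,
       "backend module #{inspect(module)} is not available; " <>
         "add #{dep_hint} to your application's deps"}
    end
  end
end
```

On a host without EMLX installed, `BackendResolveSketch.resolve(EMLX.Backend, [device: :gpu], ~s({:emlx, "~> 0.3"}))` returns the error tuple. Because the module is referenced only as an atom, the library compiles on hosts where EMLX is absent, which is what lets the dep stay out of `mix.exs`.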
