feat(server): expose qwen pre-norm hidden for MTP handoff
Promote a default-off slice from the conflicted Luce-Org#153/Luce-Org#154 native MTP stack. The Qwen35 graph can now optionally mark and return the final hidden state before output norm for future MTP handoff work while leaving default runtime behavior unchanged.

Refresh docs/auto-integration.md with the latest PR containment, conflict probes, Codex delegation outcome, and validation notes.
Current integration source tip before this refresh: `35f29582`
This branch is maintained as a reproducible patch stack over `origin/main`. This unattended run started from a clean primary checkout on `auto-integration`, verified GitHub/Claude/Codex auth using the real user credential home, fetched `origin` and `easel` separately, fetched current non-draft PR heads, and checked exact PR-head containment against the stack tip.
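The exact-head containment check described above can be sketched as follows. This is a minimal illustration, not the script the run actually used: it assumes containment means "the PR head commit is an ancestor of the stack tip" and tests that with `git merge-base --is-ancestor`, which exits 0 for an ancestor and 1 otherwise. The function names and the injectable `run` parameter are hypothetical.

```python
import subprocess
from typing import Callable, Dict, List, Tuple


def is_ancestor(commit: str, tip: str, run: Callable = subprocess.run) -> bool:
    """Return True if `commit` is an ancestor of `tip` in the current repo."""
    result = run(
        ["git", "merge-base", "--is-ancestor", commit, tip],
        capture_output=True,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Any other exit code means git itself failed (for example, unknown ref).
    raise RuntimeError(f"git merge-base failed for {commit!r} vs {tip!r}")


def partition_pr_heads(
    heads: Dict[int, str], tip: str, run: Callable = subprocess.run
) -> Tuple[List[int], List[int]]:
    """Split PR numbers into (contained, selective-port candidates)."""
    contained: List[int] = []
    candidates: List[int] = []
    for pr_number, head_sha in sorted(heads.items()):
        if is_ancestor(head_sha, tip, run=run):
            contained.append(pr_number)
        else:
            candidates.append(pr_number)
    return contained, candidates
```

Passing `run` explicitly keeps the sketch testable without a repository; in real use the default `subprocess.run` would be called from inside the checkout after the PR refs have been fetched.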
-
The current stack contains 29 exact current open non-draft PR heads plus draft #329, which was already integrated before it became draft. No open non-draft PR head advanced since the prior pushed refresh. Six current non-draft PRs remain non-ancestor/selective-port candidates: #305, #237, #221, #154, #153, and #135. Fresh direct-merge probes reconfirmed conflicts for all six remaining candidates. This run ran a tmux-driven Codex read-only pass for #237; it reconfirmed that the only tiny safe PR237 slice is `server/src/common/gguf_metadata.h`, which is already present in the current stack, while the Qwen-specific native MTP runtime remains coupled to current backend/loader/target-graph reconciliation and needs populated-dependency build plus CUDA runtime validation. Existing selective salvage still covers #305's `DFLASH_EXPERT_BUDGET_PCT`, Qwen35MoE gallocr/full-chunk FFN work, and PR305 persistent prefill `StepGraph` reuse slice; #237's common MTP helper scaffold; and #135's diagnostic/control-plane multi-request scheduler scaffolds plus cache-reset seed fix and committed-boundary bookkeeping. The remaining live runtime paths are blocked on broad current-layout reconciliation and runtime validation.
+
The current stack contains 29 exact current open non-draft PR heads plus draft #329, which was already integrated before it became draft. No open non-draft PR head advanced since the prior pushed refresh. Six current non-draft PRs remain non-ancestor/selective-port candidates: #305, #237, #221, #154, #153, and #135. Fresh direct-merge probes reconfirmed conflicts for all six remaining candidates. This run ran a tmux-driven Codex pass for the #153/#154 native MTP pair and promoted one default-off current-layout slice: Qwen35 graph inputs/outputs can now expose the final hidden state before output norm (`expose_pre_norm_hidden` / `pre_norm_hidden`) for future MTP handoff work, without enabling native MTP runtime behavior. Codex rejected the broader #153/#154 native MTP loader/graph/tests as old-layout and still coupled to current MoE/backend/CUDA validation. Existing selective salvage still covers #305's `DFLASH_EXPERT_BUDGET_PCT`, Qwen35MoE gallocr/full-chunk FFN work, and PR305 persistent prefill `StepGraph` reuse slice; #237's common MTP helper scaffold; #153/#154's pre-norm hidden exposure; and #135's diagnostic/control-plane multi-request scheduler scaffolds plus cache-reset seed fix and committed-boundary bookkeeping. The remaining live runtime paths are blocked on broad current-layout reconciliation and runtime validation.
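The default-off behavior of the pre-norm hidden exposure can be illustrated with a toy model. This is only a sketch in Python, not the project's graph code: the class and function names are hypothetical stand-ins that borrow the `expose_pre_norm_hidden` / `pre_norm_hidden` field names, and an RMS-style output norm without learned weights is assumed purely for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GraphInputs:
    hidden: List[float]                   # final hidden state before output norm
    expose_pre_norm_hidden: bool = False  # default off: behavior unchanged


@dataclass
class GraphOutputs:
    normed: List[float]
    pre_norm_hidden: Optional[List[float]] = None  # set only when requested


def rms_norm(x: List[float], eps: float = 1e-6) -> List[float]:
    """Assumed output norm for this toy example (no learned weight)."""
    scale = 1.0 / math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v * scale for v in x]


def build_outputs(inputs: GraphInputs) -> GraphOutputs:
    outputs = GraphOutputs(normed=rms_norm(inputs.hidden))
    if inputs.expose_pre_norm_hidden:
        # Return a copy of the state as it was before the output norm.
        outputs.pre_norm_hidden = list(inputs.hidden)
    return outputs
```

The point of the design is that the normalized output is identical whether or not the flag is set; the flag only adds an extra returned value for a later consumer.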
## Included in the current stack
@@ -54,6 +54,12 @@ Closed, upstreamed, or no-longer-open PRs still represented by the stack/base in
This run performed (latest first):
+
- `date -Is` -> `2026-06-01T13:45:21-04:00` / `2026-06-01T13:54:22-04:00` during this refresh; primary checkout was clean on `auto-integration`, auth/tooling checks succeeded using the real user credential home (`gh auth status`, `claude auth status --text`, and `codex --version`), and `origin` / `easel` were fetched separately. Current refs were `origin/main` `8305b6c2`, `easel/auto-integration` `35f29582`, and source tip `35f29582`; `origin/main` was already represented.
+
- Open PR enumeration reported 35 non-draft PRs and 5 draft/excluded PRs (#329 remains draft after earlier integration). Exact-head containment after explicit PR ref fetch showed 29 current open non-draft PR heads included; remaining non-ancestor/selective-port candidates remain #305, #237, #221, #154, #153, and #135.
+
- Fresh worktree direct-merge probes were run under `/tmp/luce-auto-cron-20260601-134521/`. Conflict counts remain #305 (61 status / 38 unmerged), #237 (33 / 27), #221 (88 / 25), #154 (13 / 12), #153 (10 / 10), and #135 (3 / 3).
+
- Tmux-driven Codex session `luce1345-pr153154-codex` in `/tmp/luce-auto-cron-20260601-134521/probe-pr-154` completed with report `/tmp/luce-codex-pr153154-20260601-134521.txt` and `VERDICT: SAFE_SLICE` for a default-off #153/#154 pre-norm hidden handoff scaffold. The promoted current-layout slice adds `QwenGraphInputs::expose_pre_norm_hidden`, `QwenGraphOutputs::pre_norm_hidden`, and marks/returns `inpL` before `out_norm` when explicitly requested. Codex rejected the broader native MTP loader/graph/test port because it is old-layout (`dflash/`/`dflash27b`), collides with current MoE/backend fields and CMake wiring, and still needs populated-dependency CUDA runtime validation.
+
- Validation for this source/manifest refresh: `git diff --check` passed and targeted conflict-marker search in changed files found none. Full CMake validation was not rerun because this checkout still lacks populated `server/deps/llama.cpp` plus the known local CUDA compiler-id `sm_52` `ptxas` failure before project compilation.
+
- `date -Is` -> `2026-06-01T13:25:36-04:00` / `2026-06-01T13:30:51-04:00` during this refresh; primary checkout was clean on `auto-integration`, auth/tooling checks succeeded using the real user credential home (`gh auth status`, `claude auth status --text`, and `codex --version`), and `origin` / `easel` were fetched separately. Current refs were `origin/main` `8305b6c2`, `easel/auto-integration` `e221024b`, and source tip `e221024b`; `origin/main` was already represented.
- Open PR enumeration reported 35 non-draft PRs and 5 draft/excluded PRs (#329 remains draft after earlier integration). Exact-head containment after explicit PR ref fetch showed 29 current open non-draft PR heads included; remaining non-ancestor/selective-port candidates remain #305, #237, #221, #154, #153, and #135.
- Fresh worktree direct-merge probes were run under `/tmp/luce-auto-cron-20260601-132536/`. Conflict counts remain #305 (61 status / 38 unmerged), #237 (33 / 27), #221 (88 / 25), #154 (13 / 12), #153 (10 / 10), and #135 (3 / 3).
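The "(status / unmerged)" pairs reported for the merge probes can be derived from `git status --porcelain` output. The sketch below assumes, without confirmation from the log, that "status" is the total number of porcelain entries and "unmerged" is the number of entries with an unmerged XY code (`DD`, `AU`, `UD`, `UA`, `DU`, `AA`, `UU` in porcelain v1). The function name is hypothetical.

```python
from typing import Tuple

# Two-letter XY codes that porcelain v1 uses for unmerged (conflicted) paths.
UNMERGED_CODES = {"DD", "AU", "UD", "UA", "DU", "AA", "UU"}


def count_conflicts(porcelain: str) -> Tuple[int, int]:
    """Return (status_entries, unmerged_entries) for porcelain v1 text."""
    entries = [line for line in porcelain.splitlines() if line.strip()]
    unmerged = sum(1 for line in entries if line[:2] in UNMERGED_CODES)
    return len(entries), unmerged


# Hypothetical usage after a probe merge has stopped on conflicts:
#   text = subprocess.run(["git", "status", "--porcelain"],
#                         capture_output=True, text=True).stdout
#   status_count, unmerged_count = count_conflicts(text)
```

Keeping the parser separate from the `git` invocation lets the same function score saved probe output from any worktree.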