Pull requests: NVIDIA-NeMo/Automodel

Pull requests list

fix(models): audit fp32 protected tensors [r0.5.0: auto-cherrypick to release branch; apply before merge, cherrypick happens after merge]
#2598 opened Jun 16, 2026 by yuhezhang-ai Contributor Draft
feat(retrieval): vl retrieval resolved dataset
#2596 opened Jun 16, 2026 by yuhezhang-ai Contributor Draft
fix(qwen3_5_moe): convert MTP experts as grouped tensors (AM-442)
#2595 opened Jun 16, 2026 by HuiyingLi Contributor
fix(transformers): keep gemma3n KV sharing working under FSDP2 (AM-454) [r0.5.0]
#2594 opened Jun 16, 2026 by HuiyingLi Contributor
feat(gemma4): context parallelism for dense 31B
#2592 opened Jun 16, 2026 by HuiyingLi Contributor
fix(moe): weight GroupedExpertsTE down-projection bias by routing probability [r0.5.0]
#2591 opened Jun 16, 2026 by akoumpa Contributor
feat(deepseek-v4): support context parallel training
#2590 opened Jun 16, 2026 by HuiyingLi Contributor
fix: use TE attention for gpt_oss packed-sequence recipe (AM-438) [r0.5.0]
#2587 opened Jun 16, 2026 by akoumpa Contributor
feat(dflash): add dpace loss [community-request]
#2572 opened Jun 15, 2026 by kashif Contributor
chore: CI proxy for #2564
#2565 opened Jun 15, 2026 by HuiyingLi Contributor
feat(engine): Engine training API
#2556 opened Jun 14, 2026 by HuiyingLi Contributor Draft
ci: Update transformers to latest version 5.12.0
#2555 opened Jun 14, 2026 by svcnvidia-nemo-ci Contributor
feat: CP support for MiniMax M3
#2551 opened Jun 13, 2026 by athitten Contributor Draft
3 tasks
feat(moe): mxfp4-resident MoE experts for DeepSeek-V4-Flash LoRA [community-request] [waiting-on-customer: waiting on the original author to respond]
#2548 opened Jun 12, 2026 by excepshenal
3 tasks done
fix(retrieval): gate dummy vision forward
#2545 opened Jun 12, 2026 by yuhezhang-ai Contributor
1 of 3 tasks
fix(wandb): log different val datasets separately in wandb [community-request]
#2526 opened Jun 11, 2026 by grgkovac Contributor
3 tasks done
ci: Update transformers to latest version 5.11.0
#2518 opened Jun 11, 2026 by svcnvidia-nemo-ci Contributor
docs: MSC cloud checkpointing + expose multi-storage-client under [s3] [community-request] [docs-only] [waiting-on-customer]
#2517 opened Jun 11, 2026 by edjson Contributor
3 tasks done
feat(mimo_v25): support MiMo-V2.5-Pro [community-request]
#2514 opened Jun 10, 2026 by Simar-malhotra09
1 of 3 tasks
fix(moe): preserve fp32 A_log in Qwen3.5-MoE and Qwen3-Next GatedDeltaNet [r0.5.0]
#2484 opened Jun 10, 2026 by yuhezhang-ai Contributor
3 tasks done
fix: resolve nightly CI failures (FP8, ckpt, gemma3n, benchmark) [r0.5.0]
#2482 opened Jun 9, 2026 by akoumpa Contributor