Pull requests: NVIDIA-NeMo/Automodel
cp: Trigger Testing CICD
fix(gemma4): FSDP2-safe kv-sharing + skip frozen audio tower on grad-accum (2566) into r0.5.0
cherry-pick
Run CICD
#2599
opened Jun 16, 2026 by
svcnvidia-nemo-ci
Contributor
fix(models): audit fp32 protected tensors
r0.5.0
Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.
#2598
opened Jun 16, 2026 by
yuhezhang-ai
Contributor
Draft
cp: Trigger Testing CICD
fix(peft): LoRA MLP QLoRA/PP/gemma3n fixes (AM-435, AM-447, AM-453) (2584) into r0.5.0
cherry-pick
Run CICD
#2597
opened Jun 16, 2026 by
svcnvidia-nemo-ci
Contributor
feat(retrieval): vl retrieval resolved dataset
#2596
opened Jun 16, 2026 by
yuhezhang-ai
Contributor
Draft
fix(qwen3_5_moe): convert MTP experts as grouped tensors (AM-442)
#2595
opened Jun 16, 2026 by
HuiyingLi
Contributor
fix(transformers): keep gemma3n KV sharing working under FSDP2 (AM-454)
r0.5.0
#2594
opened Jun 16, 2026 by
HuiyingLi
Contributor
feat(gemma4): context parallelism for dense 31B
#2592
opened Jun 16, 2026 by
HuiyingLi
Contributor
fix(moe): weight GroupedExpertsTE down-projection bias by routing probability
r0.5.0
#2591
opened Jun 16, 2026 by
akoumpa
Contributor
feat(deepseek-v4): support context parallel training
#2590
opened Jun 16, 2026 by
HuiyingLi
Contributor
fix: use TE attention for gpt_oss packed-sequence recipe (AM-438)
r0.5.0
#2587
opened Jun 16, 2026 by
akoumpa
Contributor
feat(dflash): add dpace loss
community-request
#2572
opened Jun 15, 2026 by
kashif
Contributor
feat(datasets): add cp=1 shared-prefix prefix-tree attention for rollouts
community-request
#2564
opened Jun 15, 2026 by
khazic
Contributor
2 of 3 tasks
ci: Update transformers to latest version 5.12.0
#2555
opened Jun 14, 2026 by
svcnvidia-nemo-ci
Contributor
feat(moe): mxfp4-resident MoE experts for DeepSeek-V4-Flash LoRA
community-request
waiting-on-customer
Waiting on the original author to respond
#2548
opened Jun 12, 2026 by
excepshenal
3 tasks done
fix(retrieval): gate dummy vision forward
#2545
opened Jun 12, 2026 by
yuhezhang-ai
Contributor
1 of 3 tasks
fix(wandb): log different val datasets separately in wandb
community-request
#2526
opened Jun 11, 2026 by
grgkovac
Contributor
3 tasks done
ci: Update transformers to latest version 5.11.0
#2518
opened Jun 11, 2026 by
svcnvidia-nemo-ci
Contributor
docs: MSC cloud checkpointing + expose multi-storage-client under [s3]
community-request
docs-only
With great power comes great responsibility.
waiting-on-customer
#2517
opened Jun 11, 2026 by
edjson
Contributor
3 tasks done
feat(mimo_v25): support MiMo-V2.5-Pro
community-request
#2514
opened Jun 10, 2026 by
Simar-malhotra09
1 of 3 tasks
fix(moe): preserve fp32 A_log in Qwen3.5-MoE and Qwen3-Next GatedDeltaNet
r0.5.0
#2484
opened Jun 10, 2026 by
yuhezhang-ai
Contributor
3 tasks done
fix: resolve nightly CI failures (FP8, ckpt, gemma3n, benchmark)
r0.5.0
#2482
opened Jun 9, 2026 by
akoumpa
Contributor