Fix WeightAveraging swapping in the un-updated average model during validation by ATOM00blue · Pull Request #21732 · Lightning-AI/pytorch-lightning

ATOM00blue · 2026-05-22T02:47:51Z

What does this PR do?

WeightAveraging (and its EMAWeightAveraging subclass) creates the AveragedModel in setup() as a copy of the model's initial weights, with n_averaged == 0. The validation hooks on_validation_epoch_start / on_validation_epoch_end swapped this average model in unconditionally whenever it existed.

When using a delayed start (e.g. EMAWeightAveraging(update_starting_at_step=1000)) and validating during the warmup period, the average model has never been updated, so validation ran against the frozen initial (untrained) weights instead of the current trained ones. This is what the issue describes as metrics being near zero before update_starting_at_step.

This PR only swaps the models for validation once the average model has actually been updated at least once (n_averaged > 0). The swap remains balanced across the start/end hooks because n_averaged does not change during validation.
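The guard described above can be sketched with a toy, pure-Python stand-in for the callback. This is a minimal illustration under stated assumptions, not the code from the PR: `ToyWeightAveraging`, `train_step`, and `_swap` are hypothetical names, weights are plain lists, and the running mean only mimics what an averaged model does.

```python
class ToyWeightAveraging:
    """Toy stand-in for the callback; all names here are hypothetical."""

    def __init__(self, weights):
        self.current = list(weights)   # weights being trained
        self.average = list(weights)   # snapshot of the initial weights, as made in setup()
        self.n_averaged = 0            # number of averaging updates so far

    def train_step(self, delta, update_average):
        # Pretend one optimizer step shifts every weight by `delta`.
        self.current = [w + delta for w in self.current]
        if update_average:
            n = self.n_averaged
            # Running mean; with n == 0 this simply copies the current weights.
            self.average = [(a * n + w) / (n + 1) for a, w in zip(self.average, self.current)]
            self.n_averaged += 1

    def _swap(self):
        self.current, self.average = self.average, self.current

    def on_validation_epoch_start(self):
        # Only swap once the average has been updated at least once.
        if self.n_averaged > 0:
            self._swap()

    def on_validation_epoch_end(self):
        # Same condition as at the start, so the swap stays balanced.
        if self.n_averaged > 0:
            self._swap()


cb = ToyWeightAveraging([0.0, 0.0])
cb.train_step(1.0, update_average=False)   # warmup: the average is still the initial snapshot
cb.on_validation_epoch_start()
seen_during_warmup = list(cb.current)      # trained weights, not the initial snapshot
cb.on_validation_epoch_end()
```

Because nothing modifies `n_averaged` between the two validation hooks, both hooks take the same branch, which is why the swap cannot end up unbalanced.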

Tests

Added test_weight_averaging_no_swap_before_first_update, which verifies that when the delayed start is never reached, the parameters seen at validation are the current trained weights, not the frozen initial snapshot. It fails before this change and passes after.
Updated SWATestCallback swap-count expectations: with a delayed update schedule, validation now only swaps once the average model has been updated.
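To make the changed swap-count expectation concrete, here is a toy calculation. It is not the logic of SWATestCallback. The function name and the step-based bookkeeping are assumptions for illustration, including the assumption that a validation run at the same step as the first update happens after that update.

```python
def count_validation_swaps(validation_steps, update_steps):
    """Count swap calls under the guarded behaviour (toy model, hypothetical helper).

    validation_steps: global steps at which a validation epoch runs.
    update_steps: global steps at which the average model is updated.
    Each validation epoch that swaps contributes two calls (start and end).
    """
    update_steps = list(update_steps)
    first_update = min(update_steps) if update_steps else None
    swaps = 0
    for step in validation_steps:
        # Assumption: validation at step == first_update sees the updated average.
        if first_update is not None and step >= first_update:
            swaps += 2
    return swaps


# Validation every 10 steps, averaging delayed until step 25.
delayed = count_validation_swaps([10, 20, 30, 40], range(25, 41))
# Averaging from the first step: every validation epoch swaps.
immediate = count_validation_swaps([10, 20, 30, 40], range(1, 41))
```

In this model the delayed schedule swaps only for the validation epochs at steps 30 and 40, whereas an unconditional swap would have counted all four epochs.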

Before submitting

Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you write any new necessary tests?
Did you verify new and existing tests pass locally with your changes?
Did you update the CHANGELOG?

PR review

Anyone in the community is welcome to review the PR.

…ation

Before its first update, the AveragedModel only holds the copy of the initial weights made in setup(). The validation hooks swapped it in unconditionally, so during a delayed-start warmup (e.g. EMAWeightAveraging with update_starting_at_step) validation evaluated the untrained snapshot instead of the current trained weights.

Only swap the models when the average model has been updated at least once (n_averaged > 0). The swap stays balanced across validation start/end since n_averaged does not change during validation.

for more information, see https://pre-commit.ci

Copilot

Pull request overview

Fixes a bug in the WeightAveraging / EMAWeightAveraging callbacks where validation could swap in the averaged model even before it had received its first update (n_averaged == 0), causing validation to run on an untrained initial-weight snapshot during delayed-start warmup.

Changes:

Guarded validation-time model swapping so it only happens after the averaged model has been updated at least once (n_averaged > 0).
Updated SWA test expectations to reflect that validation swapping now starts only after the first averaging update occurs.
Added a regression test ensuring validation observes current trained weights (not the frozen initial snapshot) when the averaging update threshold is never reached.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `src/lightning/pytorch/callbacks/weight_averaging.py` | Prevents swapping in the averaged model for validation until `n_averaged > 0`, avoiding evaluation on an untrained snapshot. |
| `tests/tests_pytorch/callbacks/test_weight_averaging.py` | Adjusts swap-count expectations and adds a regression test for “no swap before first update” behavior. |
| `src/lightning/pytorch/CHANGELOG.md` | Documents the fix under the unreleased “Fixed” section. |

Martovark · 2026-05-30T08:07:29Z

I think it's better to use self._latest_update_step > 0 from WeightAveraging instead of self._average_model.n_averaged > 0, because self._average_model.n_averaged most likely lives on CUDA, so checking it requires a device-to-host sync.
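The concern can be illustrated without a GPU. The sketch below is a pure-Python simulation, not PyTorch code: `FakeDeviceScalar` and `ToyCallback` are hypothetical, and the way `_latest_update_step` is maintained here is an assumption. The only point it makes is that branching on a device-resident value forces a transfer to the host, while branching on a plain Python integer does not.

```python
class FakeDeviceScalar:
    """Pure-Python stand-in for a scalar tensor living on an accelerator (hypothetical)."""

    syncs = 0  # class-wide count of simulated device-to-host transfers

    def __init__(self, value):
        self._value = value

    def __gt__(self, other):
        # The comparison runs "on device" and yields another device scalar.
        return FakeDeviceScalar(self._value > other)

    def __bool__(self):
        # Using the result in an `if` needs the value on the host: one sync.
        FakeDeviceScalar.syncs += 1
        return bool(self._value)


class ToyCallback:
    def __init__(self):
        self.n_averaged = FakeDeviceScalar(0)  # mirrors a device-side counter
        self._latest_update_step = 0           # plain int kept on the host (assumed semantics)

    def update(self, step):
        self.n_averaged = FakeDeviceScalar(self.n_averaged._value + 1)
        self._latest_update_step = step

    def should_swap_device(self):
        return bool(self.n_averaged > 0)       # triggers a simulated sync

    def should_swap_host(self):
        return self._latest_update_step > 0    # no device access at all


cb = ToyCallback()
host_before = cb.should_swap_host()
syncs_after_host_check = FakeDeviceScalar.syncs
device_before = cb.should_swap_device()
syncs_after_device_check = FakeDeviceScalar.syncs
```

Both checks give the same answer in this toy model; the host-side check just avoids touching the device twice per validation epoch.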

codecov-commenter · 2026-06-30T08:48:43Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79%. Comparing base (fe6b1cc) to head (1529b9b).
✅ All tests successful. No failed tests found.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

❗ There is a different number of reports uploaded between BASE (fe6b1cc) and HEAD (1529b9b).

HEAD has 7977 uploads less than BASE

| Flag | BASE (fe6b1cc) | HEAD (1529b9b) |
| --- | --- | --- |
| cpu | 1819 | 42 |
| python | 132 | 3 |
| lightning_fabric | 580 | 0 |
| pytest | 908 | 0 |
| python3.12 | 523 | 12 |
| python3.13 | 380 | 9 |
| lightning | 652 | 15 |
| python3.11 | 260 | 6 |
| python3.12.7 | 393 | 9 |
| python3.10 | 131 | 3 |
| pytorch2.1 | 131 | 6 |
| pytest-full | 911 | 42 |
| pytorch2.8 | 130 | 6 |
| pytorch_lightning | 587 | 27 |
| pytorch2.6 | 66 | 3 |
| pytorch2.7 | 65 | 3 |
| pytorch2.2.2 | 63 | 3 |
| pytorch2.4.1 | 65 | 3 |
| pytorch2.10 | 131 | 6 |
| pytorch2.5.1 | 66 | 3 |
| pytorch2.9 | 129 | 6 |
| pytorch2.3 | 65 | 3 |

Additional details and impacted files

@@            Coverage Diff            @@
##           master   #21732     +/-   ##
=========================================
- Coverage      87%      79%     -8%     
=========================================
  Files         270      267      -3     
  Lines       24005    23949     -56     
=========================================
- Hits        20776    18824   -1952     
- Misses       3229     5125   +1896

deependujha · 2026-06-30T08:56:17Z

Hi @Martovark, great catch. Verified it from:

https://github.com/pytorch/pytorch/blob/0d62256a2b23365f8e1604297eb23a6545102aa8/torch/optim/swa_utils.py#L283-L285

Fixed standalone tests; should be good to land. Thanks @ATOM00blue for the work.

ATOM00blue requested a review from tchaton as a code owner May 22, 2026 02:47

Copilot AI review requested due to automatic review settings May 22, 2026 02:47

ATOM00blue requested review from ethanwharris and justusschock as code owners May 22, 2026 02:47

[pre-commit.ci] auto fixes from pre-commit.com hooks

5ae8aaa


Copilot started reviewing on behalf of ATOM00blue May 22, 2026 02:48

Copilot AI reviewed May 22, 2026


deependujha added 4 commits June 30, 2026 13:37

update

ea5848d

update

ac5fdf4

update

3b71465

Merge branch 'master' into fix/weight-averaging-validation-swap-before-update

47f01c0

fix standalone tests

e701321

deependujha approved these changes Jun 30, 2026


update

1529b9b

deependujha mentioned this pull request Jun 30, 2026

Fix WeightAveraging swapping un-averaged model during validation before first update #21754

Open

ATOM00blue wants to merge 8 commits into Lightning-AI:master from ATOM00blue:fix/weight-averaging-validation-swap-before-update

ATOM00blue commented May 22, 2026 • edited by deependujha

5 participants
