Fix multi model ddp unused params #21768

Closed
Zish19 wants to merge 29 commits into Lightning-AI:master from Zish19:fix-multi-model-ddp-unused-params

Zish19 commented Jun 19, 2026

I investigated issue #21548 and implemented a fix for unused parameter detection in MultiModelDDPStrategy.

Root cause:
MultiModelDDPStrategy wraps each child module in DistributedDataParallel but keeps the root LightningModule as self.model. The inherited DDPStrategy.pre_backward() therefore exits early, because self.model is not a DistributedDataParallel instance. As a result, prepare_for_backward() is never called and unused parameter detection is skipped.
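
The early exit can be reproduced without PyTorch. The sketch below is only an illustration: `FakeDDP`, `RootModule`, and `base_pre_backward` are hypothetical stand-ins, not the real torch or Lightning types, and the function mirrors only the guard described above.

```python
class FakeDDP:
    """Hypothetical stand-in for torch.nn.parallel.DistributedDataParallel."""

    def __init__(self, name):
        self.name = name
        self.prepared = False  # set to True once the reducer would be armed


class RootModule:
    """Hypothetical stand-in for the root LightningModule with wrapped children."""

    def __init__(self):
        self.generator = FakeDDP("generator")
        self.discriminator = FakeDDP("discriminator")


def base_pre_backward(model):
    """Mirror of the inherited guard: act only if `model` itself is wrapped."""
    if not isinstance(model, FakeDDP):
        return False  # early exit, no reducer is prepared
    model.prepared = True
    return True


root = RootModule()
reached = base_pre_backward(root)  # the root is not a FakeDDP, so this exits early
```

Because the guard inspects only the top-level object, the wrapped children are never visited, which matches the skipped detection described in the root cause.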

Fix:
Override MultiModelDDPStrategy.pre_backward() and iterate over the DDP-wrapped child modules. Arm the reducer only for modules with active trainable parameters (requires_grad=True). This avoids false-positive reducer errors during multi-optimizer manual optimization, for example GAN training with toggle_optimizer().
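
A minimal, torch-free sketch of that selection logic follows. It is not the code from this PR: `FakeParam`, `FakeDDP`, `Root`, and `pre_backward_multi_model` are hypothetical names, and the `arm` callback stands in for whatever call prepares the real reducer. The sketch also assumes that "active trainable parameters" means at least one parameter with `requires_grad=True`.

```python
class FakeParam:
    """Hypothetical stand-in for a parameter with a requires_grad flag."""

    def __init__(self, requires_grad):
        self.requires_grad = requires_grad


class FakeDDP:
    """Hypothetical stand-in for a DistributedDataParallel-wrapped child."""

    def __init__(self, name, params):
        self.name = name
        self._params = params

    def parameters(self):
        return iter(self._params)


class Root:
    """Hypothetical stand-in for the root module that owns the children."""

    def __init__(self, children):
        self._children = children

    def children(self):
        return iter(self._children)


def pre_backward_multi_model(root, arm):
    """Arm only wrapped children that have at least one trainable parameter."""
    armed = []
    for child in root.children():
        if not isinstance(child, FakeDDP):
            continue  # plain submodules have no reducer to prepare
        if not any(p.requires_grad for p in child.parameters()):
            continue  # fully frozen, e.g. its parameters were toggled off
        arm(child)
        armed.append(child.name)
    return armed


# Simulate a generator step in a GAN: the discriminator's parameters are frozen.
generator = FakeDDP("generator", [FakeParam(True), FakeParam(True)])
discriminator = FakeDDP("discriminator", [FakeParam(False)])
calls = []
armed = pre_backward_multi_model(Root([generator, discriminator, object()]), calls.append)
```

In this simulated step only the generator is armed. The frozen discriminator is skipped, which is the case the description says would otherwise trigger a false-positive reducer error.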

I also enabled a regression test for unused parameter detection in:
tests/tests_pytorch/strategies/test_multi_model_ddp.py

Branch:
Zish19:fix-multi-model-ddp-unused-params

Validation:

  • test_multi_model_ddp.py
  • test_ddp_integration.py

Both passed locally.

samsara-ku and others added 29 commits June 25, 2025 11:26
…-multimodelddp-implementation

test: cover MultiModelDDPStrategy
Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com>
…evious MultiModelDDP model example w/ new pl version
1. change all testcases snippets in right way; tested with one-by-one
@Zish19 Zish19 requested a review from tchaton as a code owner June 19, 2026 19:59
4 participants