
Conversation


@smahdavi4 smahdavi4 commented Dec 9, 2025

What does this PR do?

Megatron currently only accepts a float/int for the max grad norm, so disabling clipping there requires passing zero, while the DTensor path requires None. This PR makes the DTensor workers also treat zero as "clipping disabled", allowing grad norm clipping to be configured consistently across both backends.
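
A minimal sketch of the intended behavior, assuming a setting named max_grad_norm and the standard torch.nn.utils.clip_grad_norm_ utility (the actual DTensor workers use their own distributed-aware clipping path, so this is illustrative rather than the real implementation):

    import torch

    def maybe_clip_gradients(model: torch.nn.Module, max_grad_norm: float | None):
        # Treat both None and non-positive values as "clipping disabled",
        # matching Megatron's convention of passing 0 to turn clipping off.
        if max_grad_norm is not None and max_grad_norm > 0:
            return torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
        return None

With this guard, a single config value such as max_grad_norm = 0 disables clipping for both the Megatron and DTensor policy workers.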

Summary by CodeRabbit

  • Bug Fixes
    • Improved gradient clipping validation to correctly handle edge cases when the maximum gradient norm is configured to zero or negative values, preventing unintended clipping behavior.


Signed-off-by: Sadegh Mahdavi <smahdavi@nvidia.com>
@smahdavi4 smahdavi4 requested review from a team as code owners December 9, 2025 20:37

github-actions bot commented Dec 9, 2025

ℹ️ File Consistency Check

Check based on commit: 8c1ab0a (PR #1618 from allow-zero-grad-norm)

✅ DTensor Policy Worker Synchronization Check

Both DTensor policy worker files were modified in this PR:

  • nemo_rl/models/policy/workers/dtensor_policy_worker.py
  • nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py

Please ensure that the changes are consistent between both files where applicable.


This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.


coderabbitai bot commented Dec 9, 2025

📝 Walkthrough

Walkthrough

Two policy worker files are updated to add an extra guard condition to the gradient clipping logic in the train() method, requiring max_grad_norm to be positive (greater than 0) in addition to being non-None.

Changes

Cohort: Gradient clipping guard condition tightened
  • Files: nemo_rl/models/policy/workers/dtensor_policy_worker.py, nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
  • Summary: the gradient clipping condition is changed from "if max_grad_norm is not None" to "if max_grad_norm is not None and max_grad_norm > 0", so that clipping is skipped when max_grad_norm is zero or negative.
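
A sketch of the guard before and after the change (the surrounding train() code and the clipping call are paraphrased here, not copied from the workers):

    # Before: any non-None value, including 0, triggered clipping.
    if max_grad_norm is not None:
        grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)

    # After: zero or negative values skip clipping entirely.
    if max_grad_norm is not None and max_grad_norm > 0:
        grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)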

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Test Results For Major Changes (⚠️ Warning): the PR modifies gradient clipping behavior but lacks test results or validation demonstrating no convergence regression. Resolution: add test results or convergence comparison data to the PR description showing the change does not cause training regressions.
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring Coverage (✅ Passed): docstring coverage is 100.00%, which meets the required threshold of 80.00%.
  • Title check (✅ Passed): the title clearly and specifically describes the main change (allowing zero grad norm in dtensor policies for consistency with Megatron) and directly matches the core objective of the pull request.

@terrykong terrykong left a comment

@joyang-nv to review

@terrykong terrykong requested a review from joyang-nv December 9, 2025 20:41
@terrykong terrykong added the CI:L1 Run doctests, unit tests, and functional tests label Dec 9, 2025
@terrykong terrykong changed the title Allow zero grad norm for consistency with Megatron fix: allow zero grad norm for consistency with Megatron Dec 9, 2025
@terrykong terrykong changed the title fix: allow zero grad norm for consistency with Megatron fix: allow zero grad norm in dtensor policies for consistency with Megatron Dec 9, 2025
