
Conversation


@smahdavi4 smahdavi4 commented Dec 9, 2025

What does this PR do?

Megatron currently only accepts a float/int for the max grad norm, so disabling clipping there requires passing zero, while the DTensor path requires None. This PR makes the DTensor workers also treat zero as "clipping disabled", allowing grad norm clipping to be configured consistently across both backends.
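
A minimal sketch of the intended behavior, assuming a setting named max_grad_norm and the standard torch.nn.utils.clip_grad_norm_ utility (the actual DTensor workers use their own distributed-aware clipping path, so this is illustrative rather than the real implementation):

    import torch

    def maybe_clip_gradients(model: torch.nn.Module, max_grad_norm: float | None):
        # Treat both None and non-positive values as "clipping disabled",
        # matching Megatron's convention of passing 0 to turn clipping off.
        if max_grad_norm is not None and max_grad_norm > 0:
            return torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
        return None

With this guard, a single config value such as max_grad_norm = 0 disables clipping for both the Megatron and DTensor policy workers.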

Summary by CodeRabbit

  • Bug Fixes
    • Improved gradient clipping validation to correctly handle edge cases when the maximum gradient norm is configured to zero or negative values, preventing unintended clipping behavior.


Signed-off-by: Sadegh Mahdavi <smahdavi@nvidia.com>
@smahdavi4 smahdavi4 requested review from a team as code owners December 9, 2025 20:37

github-actions bot commented Dec 9, 2025

ℹ️ File Consistency Check

Check based on commit: 8c1ab0a (PR #1618 from allow-zero-grad-norm)

✅ DTensor Policy Worker Synchronization Check

Both DTensor policy worker files were modified in this PR:

  • nemo_rl/models/policy/workers/dtensor_policy_worker.py
  • nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py

Please ensure that the changes are consistent between both files where applicable.


This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.


coderabbitai bot commented Dec 9, 2025

📝 Walkthrough

Walkthrough

Two policy worker files are updated to add an extra guard condition to the gradient clipping logic in the train() method, requiring max_grad_norm to be positive (greater than 0) in addition to being non-None.

Changes

Cohort: Gradient clipping guard condition tightened
  • Files: nemo_rl/models/policy/workers/dtensor_policy_worker.py, nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
  • Summary: the gradient clipping condition is changed from "if max_grad_norm is not None" to "if max_grad_norm is not None and max_grad_norm > 0", so that clipping is skipped when max_grad_norm is zero or negative.
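
A sketch of the guard before and after the change (the surrounding train() code and the clipping call are paraphrased here, not copied from the workers):

    # Before: any non-None value, including 0, triggered clipping.
    if max_grad_norm is not None:
        grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)

    # After: zero or negative values skip clipping entirely.
    if max_grad_norm is not None and max_grad_norm > 0:
        grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)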

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Test Results For Major Changes (⚠️ Warning): the PR modifies gradient clipping behavior but lacks test results or validation demonstrating no convergence regression. Resolution: add test results or convergence comparison data to the PR description showing the change does not cause training regressions.
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring Coverage (✅ Passed): docstring coverage is 100.00%, which meets the required threshold of 80.00%.
  • Title check (✅ Passed): the title clearly and specifically describes the main change (allowing zero grad norm in dtensor policies for consistency with Megatron) and directly matches the core objective of the pull request.

@terrykong terrykong left a comment

@joyang-nv to review

@terrykong terrykong requested a review from joyang-nv December 9, 2025 20:41
@terrykong terrykong added the CI:L1 Run doctests, unit tests, and functional tests label Dec 9, 2025
@terrykong terrykong changed the title Allow zero grad norm for consistency with Megatron fix: allow zero grad norm for consistency with Megatron Dec 9, 2025
@terrykong terrykong changed the title fix: allow zero grad norm for consistency with Megatron fix: allow zero grad norm in dtensor policies for consistency with Megatron Dec 9, 2025
