perf: Add DeepEP support to Megatron Policy #1645
base: main
Conversation
Signed-off-by: Parth Mannan <pmannan@nvidia.com>
📝 Walkthrough
This PR adds support for three Megatron Mixture-of-Experts (MoE) configuration fields: moe_enable_deepep, moe_token_dispatcher_type, and moe_shared_expert_overlap.
Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (2 warnings), ✅ Passed checks (2 passed)
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- examples/configs/grpo_math_1B_megatron.yaml (1 hunks)
- examples/configs/sft_openmathinstruct2_megatron.yaml (1 hunks)
- nemo_rl/models/policy/workers/megatron_policy_worker.py (1 hunks)
- pyproject.toml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
!(**/tests/**|**/test_*.py|**/test_*.sh)
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
Add the NVIDIA copyright header to all Python files and shell scripts (excluding tests). The header should include the current year
Files:
- examples/configs/grpo_math_1B_megatron.yaml
- nemo_rl/models/policy/workers/megatron_policy_worker.py
- examples/configs/sft_openmathinstruct2_megatron.yaml
- pyproject.toml
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Conform code to Python 3.12+
Indent code with 4 spaces. Do not use tabs
Use snake_case for file names
Use PascalCase for class names
Use snake_case for function and method names
Use snake_case for local variables
Prefix variable names that start with a number with 'k' (e.g., k_99th_percentile)
Use upper snake_case with 'G' prefix for global variables (e.g., G_MY_GLOBAL)
Use upper snake_case for constants
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
Prefer docstrings over comments for interfaces that may be used outside a file
Reserve comments for code within a function or interfaces that are local to a file
If a piece of code is commented out, include a comment describing its usage and why it's commented out. Remove debug comments before merging
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx
Avoid using reflection when functionality can be easily achieved without reflection
When using try-except blocks, limit the except clause to the smallest set of specific errors possible
When using try-except blocks for duck-typing, keep the body of the try as small as possible and use the else block for logic
YAML is the single source of truth for configuration defaults. Do not set non-None defaults in code for configuration values
For required configuration attributes, access config directly and expect presence (e.g., policy_cfg['precision']) without hidden defaults
Use typing.NotRequired to mark optional attributes in TypedDict for configuration
When adding a new config key to a TypedDict subclass, document the key's purpose, valid values/types, and recommended default, and reflect the default in exemplar YAMLs under examples/configs/*.yaml
Follow the Google Python Style Guide for Python code
Files:
nemo_rl/models/policy/workers/megatron_policy_worker.py
nemo_rl/**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
For any source file under nemo_rl/*.py that defines a class or function decorated with @ray.remote, add a coverage pragma (# pragma: no cover) because these run in separate Ray processes
Files:
nemo_rl/models/policy/workers/megatron_policy_worker.py
**/*.{py,sh}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
The NVIDIA copyright header should appear at the top of all Python files and shell scripts (excluding tests)
Files:
nemo_rl/models/policy/workers/megatron_policy_worker.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: adil-a
Repo: NVIDIA-NeMo/RL PR: 1440
File: examples/configs/sft_automodel.yaml:48-58
Timestamp: 2025-10-30T20:50:44.126Z
Learning: In DTensor configurations for MoE (Mixture of Experts) models, expert_parallel_size and data_parallel_size can be applied together without multiplying the GPU requirements. Expert Parallelism (EP) only applies to MoE layers, while Data Parallelism/FSDP applies to non-MoE layers. Therefore, configurations like expert_parallel_size: 8 and data_parallel_size: 8 are valid on an 8-GPU cluster for MoE models.
📚 Learning: 2025-10-30T20:50:44.126Z
Learnt from: adil-a
Repo: NVIDIA-NeMo/RL PR: 1440
File: examples/configs/sft_automodel.yaml:48-58
Timestamp: 2025-10-30T20:50:44.126Z
Learning: In DTensor configurations for MoE (Mixture of Experts) models, expert_parallel_size and data_parallel_size can be applied together without multiplying the GPU requirements. Expert Parallelism (EP) only applies to MoE layers, while Data Parallelism/FSDP applies to non-MoE layers. Therefore, configurations like expert_parallel_size: 8 and data_parallel_size: 8 are valid on an 8-GPU cluster for MoE models.
Applied to files:
- examples/configs/grpo_math_1B_megatron.yaml
- examples/configs/sft_openmathinstruct2_megatron.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Lint check
- GitHub Check: Post submodule check comment / Comment on PR
- GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (2)
examples/configs/grpo_math_1B_megatron.yaml (1)
97-99: Consistent MoE configuration across example configs. The MoE configuration keys are consistently defined with the same conservative defaults as in sft_openmathinstruct2_megatron.yaml. This ensures backward compatibility and predictable behavior across different algorithm configurations.
pyproject.toml (1)
101-101: DeepEP dependency configuration is valid. The specified commit (e3908bf5bd0cc6265bcb225d15cd8c996d4759ef) exists in the DeepEP repository and is accessible as a stable reference point for production use.
moe_enable_deepep: false
moe_token_dispatcher_type: "allgather"
moe_shared_expert_overlap: false
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for TypedDict definitions and MoE-related config keys
echo "=== Searching for MegatronConfig TypedDict definition ==="
find . -name "*.py" -type f ! -path "./tests/*" ! -path "./.git/*" | xargs grep -l "class MegatronConfig" | head -5
echo -e "\n=== Searching for MoE config keys in Python files ==="
rg -n "(moe_enable_deepep|moe_token_dispatcher_type|moe_shared_expert_overlap)" --type=py | head -20
echo -e "\n=== Searching for MoE-related TypedDict or config definitions ==="
rg -n -B2 -A10 "moe_enable_deepep|moe_token_dispatcher_type" --type=py | head -40
Repository: NVIDIA-NeMo/RL
Length of output: 2695
🏁 Script executed:
#!/bin/bash
# Examine the PolicyConfig TypedDict definition in the policy module
echo "=== Reading PolicyConfig TypedDict definition ==="
cat nemo_rl/models/policy/__init__.py | head -200
echo -e "\n=== Searching for MegatronConfig definition ==="
rg -n -A30 "class MegatronConfig.*TypedDict" --type=py
Repository: NVIDIA-NeMo/RL
Length of output: 9554
🏁 Script executed:
#!/bin/bash
# Check the full MegatronConfig TypedDict to see if we missed the MoE keys
echo "=== Reading full MegatronConfig definition ==="
rg -n "class MegatronConfig" -A 40 --type=py nemo_rl/models/policy/__init__.pyRepository: NVIDIA-NeMo/RL
Length of output: 2137
🏁 Script executed:
#!/bin/bash
# Final verification - check if these keys are defined anywhere in TypedDict context
echo "=== Checking if MoE keys appear in any TypedDict definitions ==="
rg -n "moe_enable_deepep|moe_token_dispatcher_type|moe_shared_expert_overlap" --type=py -B3 -A3
echo -e "\n=== Check the actual YAML file to see the context ==="
head -100 examples/configs/sft_openmathinstruct2_megatron.yaml | tail -20
Repository: NVIDIA-NeMo/RL
Length of output: 2417
Add MoE configuration keys to MegatronConfig TypedDict definition.
The keys moe_enable_deepep, moe_token_dispatcher_type, and moe_shared_expert_overlap are used in the YAML exemplar and referenced in megatron_policy_worker.py (lines 594-601), but they are missing from the MegatronConfig TypedDict definition in nemo_rl/models/policy/__init__.py.
Add these keys to the TypedDict with:
- Type annotations
- A NotRequired marker (if optional)
- Inline documentation explaining their purpose and valid values
For reference: moe_token_dispatcher_type accepts values like "allgather", "alltoall", or "flex"; moe_enable_deepep is a boolean; moe_shared_expert_overlap is a boolean.
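To make this concrete, here is a minimal sketch of what the TypedDict additions could look like, assuming the keys are marked NotRequired; the surrounding keys, placement, and docstring style in nemo_rl/models/policy/__init__.py are not reproduced here, and the inline comments are illustrative only.

from typing import NotRequired, TypedDict

class MegatronConfig(TypedDict):
    # ... existing Megatron keys elided ...

    # Enable DeepEP-backed token dispatch for MoE layers (example YAML default: false)
    moe_enable_deepep: NotRequired[bool]
    # Token dispatcher type: "allgather", "alltoall", or "flex" (example YAML default: "allgather")
    moe_token_dispatcher_type: NotRequired[str]
    # Overlap shared-expert computation with token dispatch (example YAML default: false)
    moe_shared_expert_overlap: NotRequired[bool]

Keeping the keys NotRequired preserves older configs; making them required instead would match the direct self.cfg["megatron_cfg"][...] access in the worker diff further below.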
🤖 Prompt for AI Agents
In nemo_rl/models/policy/__init__.py around the MegatronConfig TypedDict
definition (add near existing config keys), the TypedDict is missing the MoE
keys referenced in examples and megatron_policy_worker.py; add the following
entries to MegatronConfig: moe_enable_deepep: NotRequired[bool] with inline
docstring "Enable DeepEP optimization for MoE (true/false)";
moe_token_dispatcher_type: NotRequired[str] with inline docstring "Token
dispatcher type for MoE; valid values: 'allgather', 'alltoall', or 'flex'";
moe_shared_expert_overlap: NotRequired[bool] with inline docstring "Allow shared
expert overlap across MoE experts (true/false)". Ensure you import NotRequired
from typing_extensions (or typing if available) and keep styling consistent with
neighboring keys.
model_cfg.moe_enable_deepep = self.cfg["megatron_cfg"][
    "moe_enable_deepep"
]
model_cfg.moe_token_dispatcher_type = self.cfg["megatron_cfg"][
    "moe_token_dispatcher_type"
]
model_cfg.moe_shared_expert_overlap = self.cfg["megatron_cfg"][
    "moe_shared_expert_overlap"
]
🧩 Analysis chain
🏁 Script executed:
# Find the megatron_policy_worker.py file and check the context around lines 594-602
find . -name "megatron_policy_worker.py" -type f
# Also search for config-related files and TypedDict definitions
find . -path "*/nemo_rl/*" -name "*config*" -type f | head -20
# Search for example YAML configs
find . -path "*/examples/configs/*" -name "*.yaml" -o -name "*.yml" | head -20Repository: NVIDIA-NeMo/RL
Length of output: 979
🏁 Script executed:
# Search for references to moe_enable_deepep, moe_token_dispatcher_type, moe_shared_expert_overlap
grep -r "moe_enable_deepep" . --include="*.py" --include="*.yaml" --include="*.yml" 2>/dev/null
grep -r "moe_token_dispatcher_type" . --include="*.py" --include="*.yaml" --include="*.yml" 2>/dev/null
grep -r "moe_shared_expert_overlap" . --include="*.py" --include="*.yaml" --include="*.yml" 2>/dev/nullRepository: NVIDIA-NeMo/RL
Length of output: 1246
🏁 Script executed:
# Find MegatronConfig TypedDict definition
find . -path "*/nemo_rl/*" -name "*.py" -type f | xargs grep -l "class MegatronConfig\|MegatronConfig.*TypedDict" | head -5
# Also check the specific file to see surrounding code
head -700 ./nemo_rl/models/policy/workers/megatron_policy_worker.py | tail -150
Repository: NVIDIA-NeMo/RL
Length of output: 7620
🏁 Script executed:
# Find the MegatronConfig TypedDict definition
grep -r "class MegatronConfig" ./nemo_rl --include="*.py" -A 50 | head -100
# Also check the policy __init__.py which might import/define it
grep -r "moe_enable_deepep\|moe_token_dispatcher_type\|moe_shared_expert_overlap" ./nemo_rl/models/policy/__init__.py -B 5 -A 5Repository: NVIDIA-NeMo/RL
Length of output: 4121
🏁 Script executed:
# Search for the moe-related keys in the TypedDict definition
grep -r "moe_enable_deepep\|moe_token_dispatcher_type\|moe_shared_expert_overlap\|moe_router_dtype\|moe_permute_fusion" ./nemo_rl/models/policy/__init__.py -B 2 -A 2
# Get more context from the MegatronConfig class definition
sed -n '/^class MegatronConfig/,/^class [A-Z]/p' ./nemo_rl/models/policy/__init__.py | head -150
Repository: NVIDIA-NeMo/RL
Length of output: 161
🏁 Script executed:
# Get the full MegatronConfig TypedDict definition - reading the file more carefully
cat ./nemo_rl/models/policy/__init__.py | grep -A 200 "^class MegatronConfig(TypedDict):" | grep -B 5 "^class " | head -200
Repository: NVIDIA-NeMo/RL
Length of output: 1041
🏁 Script executed:
# Read the file directly and extract MegatronConfig definition
python3 << 'EOF'
with open('./nemo_rl/models/policy/__init__.py', 'r') as f:
    content = f.read()
# Find MegatronConfig class
import re
match = re.search(r'class MegatronConfig\(TypedDict\):(.*?)(?=\nclass \w+)', content, re.DOTALL)
if match:
    moe_config = match.group(0)
    # Print lines containing moe
    for line in moe_config.split('\n'):
        if 'moe' in line.lower() or line.strip().startswith(('moe_', 'freeze_', 'optimizer:', 'scheduler:', 'distributed')):
            print(line)
EOF
Repository: NVIDIA-NeMo/RL
Length of output: 251
🏁 Script executed:
# Read the entire MegatronConfig class definition with line numbers
python3 << 'EOF'
import re
with open('./nemo_rl/models/policy/__init__.py', 'r') as f:
    lines = f.readlines()
in_megatron_config = False
class_indent = 0
for i, line in enumerate(lines, 1):
    if 'class MegatronConfig(TypedDict):' in line:
        in_megatron_config = True
        class_indent = len(line) - len(line.lstrip())
        print(f"{i}: {line}", end='')
    elif in_megatron_config:
        current_indent = len(line) - len(line.lstrip())
        # Stop when we hit a new class at the same or lower indent level
        if line.strip().startswith('class ') and current_indent <= class_indent:
            break
        print(f"{i}: {line}", end='')
EOF
Repository: NVIDIA-NeMo/RL
Length of output: 1676
🏁 Script executed:
# Search for where these moe keys are defined in TypedDict
python3 << 'EOF'
import re
with open('./nemo_rl/models/policy/__init__.py', 'r') as f:
    content = f.read()
# Look for all TypedDict definitions that might have these keys
for match in re.finditer(r'(class \w+(?:Config)?\(TypedDict\):.*?)(?=\nclass |\Z)', content, re.DOTALL):
    class_def = match.group(0)
    if any(key in class_def for key in ['moe_enable_deepep', 'moe_token_dispatcher_type', 'moe_shared_expert_overlap', 'moe_router_dtype', 'moe_permute_fusion']):
        print(class_def[:500])
        print("\n" + "="*80 + "\n")
EOF
# If not found, let's see if they're defined elsewhere
grep -r "moe_enable_deepep" ./nemo_rl --include="*.py" -B 3 -A 1 | head -30
Repository: NVIDIA-NeMo/RL
Length of output: 637
🏁 Script executed:
# Let's check if these keys exist anywhere in the config definition
grep -r "moe_router_dtype\|moe_permute_fusion\|moe_router_load_balancing_type\|moe_router_bias_update_rate" ./nemo_rl/models/policy/__init__.py
# Let's see the full context around line 146 and beyond
sed -n '120,200p' ./nemo_rl/models/policy/__init__.py
Repository: NVIDIA-NeMo/RL
Length of output: 3372
🏁 Script executed:
# Check if there are any other files that extend or add to MegatronConfig
grep -r "moe_enable_deepep\|moe_token_dispatcher_type\|moe_shared_expert_overlap" ./nemo_rl --include="*.py" | grep -i "typeddict\|typing"
# Check the coding guidelines mentioned in the learning
cat CODING_GUIDELINES.md 2>/dev/null | grep -A 5 "TypedDict\|config key"
# Count all the megatron example configs
find ./examples/configs -name "*megatron*" -type f
# Verify all three keys are in every megatron example config
for file in $(find ./examples/configs -name "*megatron*" -type f); do
echo "=== $file ==="
grep -c "moe_enable_deepep\|moe_token_dispatcher_type\|moe_shared_expert_overlap" "$file" || echo "MISSING KEYS"
done
Repository: NVIDIA-NeMo/RL
Length of output: 5715
Add moe_enable_deepep, moe_token_dispatcher_type, and moe_shared_expert_overlap to MegatronConfig TypedDict and include in all exemplar configs.
These three MoE configuration keys are accessed directly without fallbacks (lines 594-602), but they are missing from the TypedDict definition and absent from 30 of 32 megatron example configs. Per coding guidelines, new config keys must be added to the TypedDict and reflected in exemplar YAMLs under examples/configs/*.yaml. Either add these as required attributes to MegatronConfig and ensure they appear in all exemplar configs (grpo_math_1B_megatron.yaml, sft_*.yaml, vlm_grpo_*.yaml, etc.), or mark them as NotRequired and provide appropriate defaults.
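For illustration, a sketch of the first option (required keys, mirroring the worker diff above) with an added consistency check follows; the helper name apply_moe_dispatch_settings is hypothetical, and the expectation that DeepEP is paired with the "flex" dispatcher is an assumption about Megatron-Core's DeepEP integration rather than something this PR asserts.

def apply_moe_dispatch_settings(model_cfg, megatron_cfg: dict) -> None:
    """Copy the MoE dispatch keys from the policy config onto the Megatron model config.

    Assumes the three keys are required entries in megatron_cfg (first option above).
    """
    model_cfg.moe_enable_deepep = megatron_cfg["moe_enable_deepep"]
    model_cfg.moe_token_dispatcher_type = megatron_cfg["moe_token_dispatcher_type"]
    model_cfg.moe_shared_expert_overlap = megatron_cfg["moe_shared_expert_overlap"]

    # Assumption: DeepEP is expected to run with the "flex" dispatcher; fail early
    # with a clear message instead of deep inside model construction.
    if model_cfg.moe_enable_deepep and model_cfg.moe_token_dispatcher_type != "flex":
        raise ValueError(
            "moe_enable_deepep=true is expected to be used with "
            "moe_token_dispatcher_type='flex'"
        )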
🤖 Prompt for AI Agents
In nemo_rl/models/policy/workers/megatron_policy_worker.py around lines 594-602,
the code reads three MoE keys directly (moe_enable_deepep,
moe_token_dispatcher_type, moe_shared_expert_overlap) but these keys are not
declared in the MegatronConfig TypedDict and are missing from most example
YAMLs; add these keys to the MegatronConfig TypedDict (either as required
attributes or as NotRequired) and update all exemplar configs under
examples/configs/*.yaml so each file includes these keys (if NotRequired, give
sensible defaults when reading the config or in the TypedDict comments; if
required, add explicit values in every exemplar). Ensure the defaults/types
match usage in megatron_policy_worker (boolean for moe_enable_deepep, string for
moe_token_dispatcher_type, boolean for moe_shared_expert_overlap) and
run config validation/tests to confirm no missing-key errors.
guyueh1
left a comment
LGTM
pyproject.toml
Outdated
# https://github.com/NVIDIA/TransformerEngine/blob/v2.3/transformer_engine/pytorch/attention/dot_product_attention/utils.py#L108
# https://github.com/facebookresearch/xformers/blob/8354497deb2c04c67fbb2e2ad911e86530da0e90/xformers/ops/fmha/flash.py#L76
"flash-attn==2.8.1",
"deep_ep @ git+https://github.com/deepseek-ai/DeepEP.git@e3908bf5bd0cc6265bcb225d15cd8c996d4759ef",
for v0.5 we are waiting on https://github.com/NVIDIA-NeMo/RL/pull/1470/changes#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711
so that PR will change to a different DeepEP version. Can you verify that this new one works? Note there are two changes you need to make:
Sounds good. I can test with newer deep_ep and push changes once upstream is updated
@terrykong Upgraded the DeepEP variant but the latest run seems to have removed a bunch of dependencies from uv.lock. Is that expected?
I think it is fine to let uv do whatever with uv.lock
@parthmannan please fix lint and then attach the CI:L2 label again to kick off tests
What does this PR do?
Adds DeepEP support to megatron policy for training MoE workloads. Disabled by default in the example configs so no change to current behavior.
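As an illustration only (not a snippet from this PR), enabling the feature amounts to flipping the new megatron_cfg keys; the "flex" dispatcher pairing below is an assumption about Megatron-Core's DeepEP path.

# Hypothetical override of the new megatron_cfg keys; the example configs ship
# with DeepEP disabled (false / "allgather" / false).
deepep_overrides = {
    "moe_enable_deepep": True,
    "moe_token_dispatcher_type": "flex",  # assumed pairing for DeepEP
    "moe_shared_expert_overlap": False,
}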
Issues
List issues that this PR closes (syntax):
Usage
# Add a code snippet demonstrating how to use this
Before your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit
New Features
Chores