@RayenTian RayenTian commented Dec 28, 2025

What does this PR do?

This PR consolidates three incremental changes:

  • Makes the GRPO workflow runnable end-to-end.
  • Aligns LoRA parameter-name mapping with vLLM’s LoRA manager to ensure adapter compatibility.
  • Adds async and non-colocated LoRA support.
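To illustrate the second point, here is a minimal sketch of what a parameter-name mapping step can look like. It is not the code from this PR: the `base_model.model.` prefix (the PEFT-style convention), the helper names `to_adapter_name` and `map_lora_names`, and the example parameter names are all assumptions made for illustration. The names vLLM's LoRA manager actually expects should be checked against the vLLM version in use.

```python
from typing import Dict, Iterable, Optional

# Assumed PEFT-style prefix; not taken from this PR's diff.
PEFT_PREFIX = "base_model.model."


def to_adapter_name(param_name: str) -> Optional[str]:
    """Map a training-side parameter name to an adapter tensor name.

    Returns None for tensors that are not LoRA A/B matrices.
    """
    if ".lora_A." not in param_name and ".lora_B." not in param_name:
        return None
    if param_name.startswith(PEFT_PREFIX):
        return param_name  # already in the assumed target convention
    return PEFT_PREFIX + param_name


def map_lora_names(param_names: Iterable[str]) -> Dict[str, str]:
    """Build {training-side name: adapter name} for LoRA tensors only."""
    mapping = {}
    for name in param_names:
        adapter_name = to_adapter_name(name)
        if adapter_name is not None:
            mapping[name] = adapter_name
    return mapping


# Hypothetical parameter names, for illustration only.
example_names = [
    "model.layers.0.self_attn.q_proj.lora_A.weight",
    "model.layers.0.self_attn.q_proj.lora_B.weight",
    "model.layers.0.self_attn.q_proj.weight",  # base weight, skipped
]
example_mapping = map_lora_names(example_names)
```

In this sketch only the LoRA A/B tensors are renamed and the base weight is left out, so a weight-sync step built on it would send adapter tensors only.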

Issues

Closes #1597

Usage

  • Run the GRPO LoRA test-suite script:
bash tests/test_suites/llm/grpo-qwen3-8B-base-1n8g-fsdp2-lora.sh

Result

Co-located + Sync

Qwen/Qwen3-0.6B

image

Llama-3.2-3B-Instruct

image

Non Co-located + Sync

Qwen/Qwen3-0.6B

image

Llama-3.2-3B-Instruct

image

Async

Qwen/Qwen3-0.6B

image

Llama-3.2-3B-Instruct

image

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

@github-actions

ℹ️ File Consistency Check

Check based on commit: d5b08fc (PR #1702 from ruit/lora_grpo_dtensor)

✅ DTensor Policy Worker Synchronization Check

Both DTensor policy worker files were modified in this PR:

  • nemo_rl/models/policy/workers/dtensor_policy_worker.py
  • nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py

Please ensure that the changes are consistent between both files where applicable.


This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.
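For readers unfamiliar with this kind of check, the sketch below shows one way a paired-file consistency check could be written. It is an assumption-laden illustration, not the repository's actual CI workflow: the function name `check_pair` and the three status strings are hypothetical, and only the two file paths come from the comment above.

```python
from typing import Iterable, Tuple

# The two paths reported by the bot comment above.
PAIR = (
    "nemo_rl/models/policy/workers/dtensor_policy_worker.py",
    "nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py",
)


def check_pair(changed_files: Iterable[str], pair: Tuple[str, str] = PAIR) -> str:
    """Classify a change set by how many files of the pair it touches."""
    changed = set(changed_files)
    touched = [path for path in pair if path in changed]
    if len(touched) == len(pair):
        return "both_modified"  # reviewers should compare the two diffs
    if touched:
        return "only_one_modified"  # possible drift between the two files
    return "not_modified"


# Hypothetical change set, for illustration only.
status = check_pair(list(PAIR) + ["README.md"])
```

A result of `both_modified` corresponds to the situation reported in this PR, where the remaining work is a manual comparison of the two diffs.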

@github-actions

ℹ️ File Consistency Check

Check based on commit: c3452a8 (PR #1702 from ruit/lora_grpo_dtensor)

✅ DTensor Policy Worker Synchronization Check

@github-actions

ℹ️ File Consistency Check

Check based on commit: 26c34a3 (PR #1702 from ruit/lora_grpo_dtensor)

✅ DTensor Policy Worker Synchronization Check

@RayenTian RayenTian force-pushed the ruit/lora_grpo_dtensor branch from 26c34a3 to 306161f Compare December 31, 2025 03:29
@github-actions

ℹ️ File Consistency Check

Check based on commit: 306161f (PR #1702 from ruit/lora_grpo_dtensor)

✅ DTensor Policy Worker Synchronization Check

@RayenTian RayenTian added the CI:L0 Run doctests and unit tests label Dec 31, 2025
Signed-off-by: ruit <ruit@nvidia.com>
@RayenTian RayenTian force-pushed the ruit/lora_grpo_dtensor branch from 306161f to 1e31a15 Compare January 2, 2026 08:59
@RayenTian RayenTian removed the CI:L0 Run doctests and unit tests label Jan 2, 2026
@github-actions

github-actions bot commented Jan 2, 2026

ℹ️ File Consistency Check

Check based on commit: 1e31a15 (PR #1702 from ruit/lora_grpo_dtensor)

✅ DTensor Policy Worker Synchronization Check

@RayenTian RayenTian added the CI:L0 Run doctests and unit tests label Jan 2, 2026
@github-actions

github-actions bot commented Jan 2, 2026

ℹ️ File Consistency Check

Check based on commit: 7c908be (PR #1702 from ruit/lora_grpo_dtensor)

✅ DTensor Policy Worker Synchronization Check

Signed-off-by: ruit <ruit@nvidia.com>
@RayenTian RayenTian force-pushed the ruit/lora_grpo_dtensor branch from 7c908be to 4443948 Compare January 2, 2026 11:20
@RayenTian RayenTian added CI:L0 Run doctests and unit tests and removed CI:L0 Run doctests and unit tests labels Jan 2, 2026
@github-actions

github-actions bot commented Jan 2, 2026

ℹ️ File Consistency Check

Check based on commit: 4443948 (PR #1702 from ruit/lora_grpo_dtensor)

✅ DTensor Policy Worker Synchronization Check

Labels

CI:L0 Run doctests and unit tests

Development

Successfully merging this pull request may close these issues.

LoRa DTensor GPRO

2 participants