
Conversation

@yuki-97 yuki-97 (Contributor) commented Dec 15, 2025

  1. Split the vLLM dependency of the FSDP backend: use only vllm==0.11.2 itself instead of the full vLLM venv, which pulls in many other things such as deepgemm.
  2. Split the FP8 files so the train backend no longer relies on vLLM (no functional change).

Summary by CodeRabbit

  • New Features

    • Added FSDP backend support with vLLM-assisted variants for multiple worker types.
  • Refactor

    • Reorganized FP8 quantization utilities to dedicated modules.
    • Updated backend selection logic for policy workers.
    • Separated vLLM dependencies into a dedicated training group.
    • Cleaned up collective initialization code.


@yuki-97 yuki-97 requested review from a team as code owners December 15, 2025 14:50
@github-actions

⚠️ File Consistency Check

Check based on commit: 33ea56d (PR #1638 from yukih/split-vllm-dependency)

⚠️ DTensor Policy Worker Synchronization Warning

The file nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR, but nemo_rl/models/policy/workers/dtensor_policy_worker.py was not updated.

Why this matters:
These files contain related DTensor policy worker implementations that should be kept synchronized to ensure consistency across different versions.

Action required:

  • Please review if the changes in nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py should also be applied to nemo_rl/models/policy/workers/dtensor_policy_worker.py
  • Update nemo_rl/models/policy/workers/dtensor_policy_worker.py if necessary to maintain consistency
  • If the files are intentionally different, please add a comment in the PR explaining why

Files to check:

  • Modified: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
  • Not modified: nemo_rl/models/policy/workers/dtensor_policy_worker.py

This check ensures that related file implementations remain synchronized across the codebase. If you believe this warning is incorrect or the files should intentionally differ, please add a comment explaining the reasoning.

@yuki-97 yuki-97 added the CI:L1 Run doctests, unit tests, and functional tests label Dec 15, 2025
@coderabbitai coderabbitai bot (Contributor) commented Dec 15, 2025

📝 Walkthrough

The pull request reorganizes vLLM-related utilities across the codebase and adds support for multiple execution backends with vLLM variants. It moves FP8 calibration helper functions, introduces new backend executables, updates the actor/worker registry to support vLLM-specific variants, and includes temporary workarounds for backend selection logic. The changes decouple train workers from vLLM by supporting alternative backends through configurable mappings.

Changes

  • FP8 Utility Reorganization (nemo_rl/models/generation/vllm/quantization/fp8.py, nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py): Removed get_vllm_qkv_scale_names and convert_calibration_to_vllm_format from fp8.py and added them as new functions in fp8_train_utils.py for vLLM-compatible calibration conversion.
  • vLLM Backend Import Updates (nemo_rl/models/generation/vllm/vllm_backend.py, nemo_rl/models/generation/vllm/vllm_worker.py): Updated import paths for FP8 utilities to use the new location in the nemo_rl.models.generation.vllm.quantization.* module hierarchy.
  • Megatron Worker FP8 Imports (nemo_rl/models/policy/workers/megatron_policy_worker.py): Moved FP8 utility imports from the top level to local imports within specific functions (_iter_params_with_optional_kv_scales and calibrate_qkv_fp8_scales), updating them to import from the new fp8_train_utils location.
  • Backend Executables Configuration (nemo_rl/distributed/virtual_cluster.py): Added new PY_EXECUTABLES attributes FSDP, FSDP_VLLM, AUTOMODEL_VLLM, and MCORE_VLLM, extending the respective base commands with the vllm_for_train extra dependency; see the sketch after this list.
  • Dependency Management (pyproject.toml): Added a new fsdp optional dependency group and introduced a separate vllm_for_train group with vllm==0.11.0; removed vllm==0.11.0 from the vllm and mcore groups.
  • Actor Registry Mappings (nemo_rl/distributed/ray_actor_environment_registry.py): Updated ACTOR_ENVIRONMENT_REGISTRY to map DTensor/Megatron policy workers to FSDP backends; added new vLLM-variant mappings (DTensorPolicyWorker-vllm, DTensorPolicyWorkerV2-vllm, MegatronPolicyWorker-vllm) with an MCORE_VLLM_EXECUTABLE constant.
  • Worker Class Name Sanitization (nemo_rl/distributed/worker_groups.py): Added a temporary workaround in create_worker that strips hyphen-suffixed portions from class names (e.g., DTensorPolicyWorker-vllm becomes DTensorPolicyWorker) so the underlying worker class can be instantiated.
  • Policy Backend Selection (nemo_rl/models/policy/lm_policy.py): Added conditional logic to append a -{backend} suffix to worker_builder_cls when the generation backend is not colocated, enabling backend-specific worker variant selection.
  • DTensor Collective Initialization Removal (nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py): Removed the init_collective method that previously set up NCCL-based collective communication, simplifying explicit collective initialization.
  • Static Analysis Configuration (pyrefly.toml): Added nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py to the project-includes list so it is covered by static analysis.
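As a rough illustration of how the new executables and registry entries fit together, here is a minimal Python sketch. The class and attribute names (PY_EXECUTABLES, FSDP, FSDP_VLLM, ACTOR_ENVIRONMENT_REGISTRY) come from this PR, but the exact command strings and registry keys below are assumptions, not the actual code.

    # Hypothetical sketch of the split described above; command strings and
    # registry keys are illustrative, not copied from the repository.

    class PY_EXECUTABLES:
        # Pure training backend: vLLM is not installed in this environment.
        FSDP = "uv run --locked --extra fsdp python"
        # Training backends that also need vLLM on the training side
        # (non-colocated generation), pulled in via the vllm_for_train extra.
        FSDP_VLLM = "uv run --locked --extra fsdp --extra vllm_for_train python"
        MCORE_VLLM = "uv run --locked --extra mcore --extra vllm_for_train python"


    # Backend-suffixed worker FQNs then select the vLLM-enabled environment,
    # while plain FQNs stay on the lean training environment.
    ACTOR_ENVIRONMENT_REGISTRY = {
        "nemo_rl.models.policy.workers.dtensor_policy_worker.DTensorPolicyWorker": PY_EXECUTABLES.FSDP,
        "nemo_rl.models.policy.workers.dtensor_policy_worker.DTensorPolicyWorker-vllm": PY_EXECUTABLES.FSDP_VLLM,
    }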

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas requiring special attention:

  • Import correctness across FP8 utilities: Verify all import paths from fp8.py to fp8_train_utils.py are correctly updated in all calling functions (especially the vLLM backend, worker, and Megatron paths)
  • Actor registry mapping consistency: Ensure all new backend variant mappings (*-vllm suffixes) correctly resolve to their corresponding PY_EXECUTABLES constants and that the temporary class-name-stripping workaround handles all expected variants
  • Dependency resolution: Validate that the new vllm_for_train dependency group and restructured fsdp optional group don't create missing transitive dependencies or conflicts with existing virtual environments
  • Backend selection logic: Review the conditional suffix appending in lm_policy.py to ensure it correctly identifies non-colocated backends and doesn't inadvertently affect colocated configurations

Possibly related PRs

Suggested labels

backend-support, vllm, refactoring, worker-management

Suggested reviewers

  • terrykong
  • guyueh1
  • yfw
  • parthchadha

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Test Results For Major Changes: ⚠️ Warning. The PR makes major functional changes (dependency split, collective removal, worker registry changes) with no documented test results, performance data, or regression verification. Resolution: add test results demonstrating dependency split correctness, worker instantiation, collective communication removal safety, and FP8 quantization functionality verification.
✅ Passed checks (5 passed)
  • Linked Issues check: ✅ Passed. The PR addresses the objective from #501 by decoupling training workers from the vllm dependency through dependency restructuring and code refactoring, though complete ray collective integration is not evident in these changes.
  • Out of Scope Changes check: ✅ Passed. All changes are directly related to splitting vllm dependencies and decoupling vLLM from core training logic. File reorganization, import updates, and worker class modifications all support the stated objective.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 90.00%, which is sufficient; the required threshold is 80.00%.
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title 'fix: split dtensorv1 vllm dependency' accurately describes the primary objective of the changeset: splitting the vllm dependency to enable non-colocated generation and decouple training workers from vLLM-specific dependencies.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
nemo_rl/distributed/worker_groups.py (1)

169-172: Hyphen-suffix stripping cleanly decouples worker class resolution from backend tagging

Using class_name = class_name.split("-")[0] lets you encode backend variants in the FQN (e.g., ...DTensorPolicyWorker-vllm) for environment selection while still importing the underlying class (DTensorPolicyWorker) without any code changes to that class. This is backward compatible for existing FQNs and matches the temporary workaround described in the comment; just keep this in sync with the naming used in Policy and ray_actor_environment_registry.
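A minimal sketch of how that stripping interacts with backend-suffixed FQNs (the helper name and FQN handling below are illustrative; only the split("-")[0] step is taken from the code under review):

    def resolve_worker_class_name(fqn: str) -> str:
        """Strip a backend tag such as '-vllm' before importing the worker class.

        Illustrative helper, assuming FQNs of the form 'pkg.module.ClassName-backend'.
        """
        module_name, _, class_name = fqn.rpartition(".")
        # "DTensorPolicyWorker-vllm" selects the vLLM-enabled environment in the
        # registry, but the importable class is still "DTensorPolicyWorker".
        class_name = class_name.split("-")[0]
        return f"{module_name}.{class_name}"


    fqn = "nemo_rl.models.policy.workers.dtensor_policy_worker.DTensorPolicyWorker-vllm"
    assert resolve_worker_class_name(fqn).endswith(".DTensorPolicyWorker")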

nemo_rl/models/policy/lm_policy.py (1)

162-167: Backend-suffixed worker class name cleanly plugs into env selection, but assumes full generation config

Appending the -{backend} suffix to worker_builder_cls when generation.colocated.enabled is False is a reasonable temporary workaround that pairs well with the FQN mangling in RayWorkerBuilder and the updated actor environment registry. It does, however, assume that config["generation"] is non-None and contains both "colocated" and "backend"; if you ever construct a Policy without a fully populated generation section, this will raise. If you expect such configs, consider guarding with a local generation_cfg = config.get("generation") or {} and raising a clearer error when required keys are missing.
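A hedged sketch of that guard; the config keys mirror the ones named above, while the function shape and error message are assumptions for illustration:

    def select_worker_builder_cls(worker_builder_cls: str, config: dict) -> str:
        # Tolerate configs where the generation section is absent or only
        # partially populated instead of raising a KeyError.
        generation_cfg = config.get("generation") or {}
        colocated = (generation_cfg.get("colocated") or {}).get("enabled", True)
        if colocated:
            return worker_builder_cls

        backend = generation_cfg.get("backend")
        if backend is None:
            raise ValueError(
                "generation.backend must be set when generation.colocated.enabled is False"
            )
        # Backend-specific variant, e.g. "...DTensorPolicyWorker-vllm".
        return f"{worker_builder_cls}-{backend}"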

nemo_rl/models/policy/workers/megatron_policy_worker.py (1)

2474-2631: calibrate_qkv_fp8_scales now returns vLLM-style keys; update docs to match

Importing convert_calibration_to_vllm_format from fp8_train_utils and feeding result_layers into it means the "layers" field in final_result is now a flat dict keyed by vLLM parameter names (e.g., model.layers.N.self_attn.k_scale), not by "layer_N" entries as described in the docstring. The implementation is consistent with the rest of the FP8/vLLM tooling, but the docstring should be updated to document the new structure so downstream callers know they can pass final_result["layers"] directly into KV-scale–aware paths.
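For reference, after the conversion final_result["layers"] would look roughly like the following; the values and the set of per-layer entries are illustrative assumptions:

    # Hypothetical shape of final_result["layers"]: a flat dict keyed by vLLM
    # parameter names instead of the previous "layer_N" entries.
    final_result_layers = {
        "model.layers.0.self_attn.k_scale": 0.0123,
        "model.layers.0.self_attn.v_scale": 0.0456,
        "model.layers.1.self_attn.k_scale": 0.0110,
        "model.layers.1.self_attn.v_scale": 0.0432,
    }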

nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py (1)

51-96: Consider more robust parsing of layer_<idx> keys in calibration conversion

Right now layer_idx = int(layer_key.split("_")[1]) assumes a strict "layer_<int>" format; malformed keys will throw somewhat opaque IndexError/ValueError. If you want slightly safer behavior, you could validate and re-raise with a clearer message:

-    vllm_scales = {}
-    for layer_key, scales in calibration_results.items():
-        # Extract layer index from "layer_N" format
-        layer_idx = int(layer_key.split("_")[1])
+    vllm_scales = {}
+    for layer_key, scales in calibration_results.items():
+        # Extract layer index from keys like "layer_0"
+        _, _, idx_str = layer_key.partition("_")
+        try:
+            layer_idx = int(idx_str)
+        except ValueError as exc:
+            raise ValueError(
+                f"Invalid layer key {layer_key!r}, expected format 'layer_<int>'."
+            ) from exc

This keeps failures explicit if the upstream calibration format changes.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a010564 and 33ea56d.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • nemo_rl/distributed/ray_actor_environment_registry.py (1 hunks)
  • nemo_rl/distributed/virtual_cluster.py (1 hunks)
  • nemo_rl/distributed/worker_groups.py (1 hunks)
  • nemo_rl/models/generation/vllm/quantization/fp8.py (0 hunks)
  • nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py (1 hunks)
  • nemo_rl/models/generation/vllm/vllm_backend.py (2 hunks)
  • nemo_rl/models/generation/vllm/vllm_worker.py (1 hunks)
  • nemo_rl/models/policy/lm_policy.py (1 hunks)
  • nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py (0 hunks)
  • nemo_rl/models/policy/workers/megatron_policy_worker.py (2 hunks)
  • pyproject.toml (2 hunks)
  • pyrefly.toml (1 hunks)
💤 Files with no reviewable changes (2)
  • nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py
  • nemo_rl/models/generation/vllm/quantization/fp8.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.py: Conform code to Python 3.12+
Indent code with 4 spaces. Do not use tabs
Use snake_case for file names
Use PascalCase for class names
Use snake_case for function and method names
Use snake_case for local variables
Prefix variable names that start with a number with 'k' (e.g., k_99th_percentile)
Use upper snake_case with 'G' prefix for global variables (e.g., G_MY_GLOBAL)
Use upper snake_case for constants
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
Prefer docstrings over comments for interfaces that may be used outside a file
Reserve comments for code within a function or interfaces that are local to a file
If a piece of code is commented out, include a comment describing its usage and why it's commented out. Remove debug comments before merging
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx
Avoid using reflection when functionality can be easily achieved without reflection
When using try-except blocks, limit the except clause to the smallest set of specific errors possible
When using try-except blocks for duck-typing, keep the body of the try as small as possible and use the else block for logic
YAML is the single source of truth for configuration defaults. Do not set non-None defaults in code for configuration values
For required configuration attributes, access config directly and expect presence (e.g., policy_cfg['precision']) without hidden defaults
Use typing.NotRequired to mark optional attributes in TypedDict for configuration
When adding a new config key to a TypedDict subclass, document the key's purpose, valid values/types, and recommended default, and reflect the default in exemplar YAMLs under examples/configs/*.yaml
Follow the Google Python Style Guide for Python code

Files:

  • nemo_rl/distributed/virtual_cluster.py
  • nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py
  • nemo_rl/models/generation/vllm/vllm_backend.py
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/policy/lm_policy.py
  • nemo_rl/distributed/worker_groups.py
  • nemo_rl/models/policy/workers/megatron_policy_worker.py
  • nemo_rl/distributed/ray_actor_environment_registry.py
nemo_rl/**/*.py

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

For any source file under nemo_rl/*.py that defines a class or function decorated with @ray.remote, add a coverage pragma (# pragma: no cover) because these run in separate Ray processes

Files:

  • nemo_rl/distributed/virtual_cluster.py
  • nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py
  • nemo_rl/models/generation/vllm/vllm_backend.py
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/policy/lm_policy.py
  • nemo_rl/distributed/worker_groups.py
  • nemo_rl/models/policy/workers/megatron_policy_worker.py
  • nemo_rl/distributed/ray_actor_environment_registry.py
!(**/tests/**|**/test_*.py|**/test_*.sh)

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Add the NVIDIA copyright header to all Python files and shell scripts (excluding tests). The header should include the current year

Files:

  • nemo_rl/distributed/virtual_cluster.py
  • nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py
  • nemo_rl/models/generation/vllm/vllm_backend.py
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/policy/lm_policy.py
  • nemo_rl/distributed/worker_groups.py
  • nemo_rl/models/policy/workers/megatron_policy_worker.py
  • nemo_rl/distributed/ray_actor_environment_registry.py
  • pyproject.toml
  • pyrefly.toml
**/*.{py,sh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

The NVIDIA copyright header should appear at the top of all Python files and shell scripts (excluding tests)

Files:

  • nemo_rl/distributed/virtual_cluster.py
  • nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py
  • nemo_rl/models/generation/vllm/vllm_backend.py
  • nemo_rl/models/generation/vllm/vllm_worker.py
  • nemo_rl/models/policy/lm_policy.py
  • nemo_rl/distributed/worker_groups.py
  • nemo_rl/models/policy/workers/megatron_policy_worker.py
  • nemo_rl/distributed/ray_actor_environment_registry.py
🧬 Code graph analysis (3)
nemo_rl/models/generation/vllm/vllm_worker.py (1)
nemo_rl/models/generation/vllm/quantization/fp8.py (1)
  • init_fp8 (158-254)
nemo_rl/models/policy/workers/megatron_policy_worker.py (1)
nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py (2)
  • get_vllm_qkv_scale_names (16-48)
  • convert_calibration_to_vllm_format (51-96)
nemo_rl/distributed/ray_actor_environment_registry.py (1)
nemo_rl/distributed/virtual_cluster.py (1)
  • PY_EXECUTABLES (43-68)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Lint check
  • GitHub Check: Post submodule check comment / Comment on PR
🔇 Additional comments (8)
nemo_rl/models/generation/vllm/vllm_backend.py (1)

180-187: FP8 utilities correctly re-pointed to vLLM-specific quantization module

Updating the FP8 imports to nemo_rl.models.generation.vllm.quantization.fp8 keeps the weight-loading logic identical while routing through the new vLLM-specific helper module, which matches the goal of isolating vLLM-related FP8 code. No issues from this change alone.

Also applies to: 233-239

nemo_rl/models/generation/vllm/vllm_worker.py (1)

291-299: Scoped FP8 initialization import is consistent with new module layout

Importing init_fp8 from nemo_rl.models.generation.vllm.quantization.fp8 only when precision == "fp8" aligns this worker with the refactored FP8 utilities and avoids unnecessary imports on non-FP8 runs. Behavior remains the same assuming init_fp8’s API is unchanged.
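A minimal sketch of that precision-gated import; the module path and the init_fp8 name are taken from this PR, while the surrounding function and the call signature are assumptions:

    def maybe_init_fp8(precision: str, llm_kwargs: dict) -> None:
        # Skip the import entirely on non-FP8 runs so the quantization helpers
        # are never pulled in unnecessarily.
        if precision != "fp8":
            return

        from nemo_rl.models.generation.vllm.quantization.fp8 import init_fp8

        init_fp8(**llm_kwargs)  # hypothetical call signature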

pyproject.toml (1)

55-76: Splitting FSDP and training vLLM extras aligns with backend-specific executables

Defining an fsdp extra and a separate vllm_for_train extra cleanly supports new executables like FSDP, FSDP_VLLM, AUTOMODEL_VLLM, and MCORE_VLLM while keeping the generic vllm extra available for inference-only flows. The duplicate vllm==0.11.0 pin across vllm and vllm_for_train is fine but should be maintained in lockstep when upgrading.

Please confirm in your environment that:

  • uv run --locked --extra fsdp --extra vllm_for_train ... resolves correctly, and
  • the intended worker/py-executable mappings actually use fsdp vs vllm_for_train as designed.
pyrefly.toml (1)

100-107: Including fp8_train_utils.py in pyrefly scope is appropriate

Adding nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py to project-includes keeps the new FP8 training helpers under static analysis and matches the refactor that moved these utilities out of the generic FP8 module.

nemo_rl/distributed/virtual_cluster.py (1)

52-66: New FSDP and *_VLLM executables match the split extras model

Introducing PY_EXECUTABLES.FSDP and the FSDP_VLLM/AUTOMODEL_VLLM/MCORE_VLLM variants provides a clear separation between pure training backends and those that also need vLLM installed on the training side via vllm_for_train. The string composition (... --extra fsdp ... --extra vllm_for_train) looks correct for uv’s CLI.

Please confirm that:

  • uv accepts multiple --extra flags in this form, and
  • the updated ray_actor_environment_registry actually routes the appropriate workers to FSDP, FSDP_VLLM, AUTOMODEL_VLLM, and MCORE_VLLM as intended.
nemo_rl/models/policy/workers/megatron_policy_worker.py (1)

2056-2073: Local FP8 QKV scale-name import reduces coupling and matches the new helper module

Pulling get_vllm_qkv_scale_names from nemo_rl.models.generation.vllm.quantization.fp8_train_utils at function scope in _iter_params_with_optional_kv_scales keeps this worker dependent only on NeMo-RL’s own FP8 utilities and avoids any implicit vLLM import at module import time. Combined with the existing logic for emitting KV-scale tensors, this cleanly supports the new calibration-to-vLLM naming flow.

nemo_rl/models/generation/vllm/quantization/fp8_train_utils.py (1)

16-48: Centralized vLLM QKV scale naming looks good

The helper cleanly encapsulates the vLLM naming convention (including the extra .attn segment for q_scale) and should help prevent drift from vLLM’s expected parameter structure.
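As an illustration of that naming convention, the helper's output for layer 0 might look like the dict below; the k_scale/v_scale names match those quoted elsewhere in this review, while the exact q_scale string and the dict return shape are assumptions:

    # Hypothetical result of get_vllm_qkv_scale_names(layer_idx=0): the q scale
    # carries an extra ".attn" segment, per the convention described above.
    qkv_scale_names = {
        "q_scale": "model.layers.0.self_attn.attn.q_scale",
        "k_scale": "model.layers.0.self_attn.k_scale",
        "v_scale": "model.layers.0.self_attn.v_scale",
    }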

nemo_rl/distributed/ray_actor_environment_registry.py (1)

26-28: vLLM-specific policy worker executables are wired consistently

Defining MCORE_VLLM_EXECUTABLE and routing the *-vllm DTensor/Automodel/Megatron policy workers to the corresponding *_VLLM executables matches the existing pattern (VLLM_EXECUTABLE, MCORE_EXECUTABLE) and keeps the temporary vLLM-for-train environments clearly separated. Comments documenting the workaround are helpful.

Also applies to: 33-40

guyueh1 previously approved these changes Dec 15, 2025

@guyueh1 guyueh1 left a comment

Good to me for the fp8 changes

@github-actions

⚠️ File Consistency Check

Check based on commit: e1f107c (PR #1638 from yukih/split-vllm-dependency)

Same DTensor Policy Worker Synchronization Warning as above: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR without a corresponding update to nemo_rl/models/policy/workers/dtensor_policy_worker.py.

@yuki-97 yuki-97 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Dec 20, 2025
@yuki-97 yuki-97 force-pushed the yukih/split-vllm-dependency branch from e1f107c to 5cd8d8d on December 22, 2025 05:45
@yuki-97 yuki-97 changed the title fix: split vllm dependency fix: split dtensorv1 vllm dependency Dec 22, 2025
@github-actions

⚠️ File Consistency Check

Check based on commit: 5cd8d8d (PR #1638 from yukih/split-vllm-dependency)

Same DTensor Policy Worker Synchronization Warning as above: nemo_rl/models/policy/workers/dtensor_policy_worker_v2.py was modified in this PR without a corresponding update to nemo_rl/models/policy/workers/dtensor_policy_worker.py.

@yuki-97 yuki-97 removed the CI:L1 Run doctests, unit tests, and functional tests label Dec 22, 2025
@yuki-97 yuki-97 added the CI:L1 Run doctests, unit tests, and functional tests label Dec 22, 2025
@yuki-97 yuki-97 force-pushed the yukih/split-vllm-dependency branch from 5cd8d8d to 9d49097 on January 5, 2026 05:39
@yuki-97 yuki-97 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 5, 2026