@ZiyueXu77
Collaborator

Fixes # .

Description

Convert the LLM HuggingFace example from the Job API to the Recipe API.

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Quick tests passed locally by running ./runtest.sh.
  • In-line docstrings updated.
  • Documentation updated.

Copilot AI review requested due to automatic review settings December 11, 2025 17:50
Contributor

Copilot AI left a comment

Pull request overview

This PR converts the LLM HuggingFace example from using the imperative Job API to the declarative Recipe API, aligning with NVFLARE's preferred pattern for job configuration. The refactoring encapsulates the job configuration logic into a reusable LLMHFRecipe class while maintaining the same functional behavior.

Key changes:

  • Introduced LLMHFRecipe class that wraps the existing job configuration logic
  • Refactored main() function to instantiate the recipe and delegate execution
  • Streamlined argument parser help text for consistency
  • Fixed spelling error in client_ids help text
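The wrapping described in the key changes can be sketched as follows. This is a minimal, stdlib-only illustration, not the code in this PR: `StubFedJob`, `StubRecipe`, and `SketchLLMHFRecipe` are hypothetical stand-ins for the NVFLARE classes, and the `train_mode` parameter name, the `site-<id>` naming, and the dictionary payloads are assumptions made for the example.

```python
from typing import Any, List, Optional, Tuple


class StubFedJob:
    """Hypothetical stand-in for FedJob: records what is sent to which site."""

    def __init__(self, name: str, min_clients: int) -> None:
        self.name = name
        self.min_clients = min_clients
        self.assignments: List[Tuple[str, Any]] = []

    def to(self, obj: Any, target: str) -> None:
        # Remember the (target, component) pair instead of building a real job.
        self.assignments.append((target, obj))


class StubRecipe:
    """Hypothetical stand-in for the Recipe base class: it owns a configured job."""

    def __init__(self, job: StubFedJob) -> None:
        self.job = job


class SketchLLMHFRecipe(StubRecipe):
    """All job configuration happens in __init__, then the job goes to the base class."""

    def __init__(
        self,
        client_ids: List[str],
        num_rounds: int,
        train_mode: str = "SFT",  # assumed parameter name
        quantize_mode: Optional[str] = None,
    ) -> None:
        mode = train_mode.lower()
        job = StubFedJob(name=f"llm_hf_{mode}_recipe", min_clients=len(client_ids))

        # Server side: controller first, then optional quantization filters.
        job.to({"component": "FedAvg controller", "num_rounds": num_rounds}, "server")
        if quantize_mode:
            job.to({"component": "quantizer", "mode": quantize_mode}, "server")
            job.to({"component": "dequantizer"}, "server")

        # Client side: one runner per client, plus the same optional filters.
        for client_id in client_ids:
            site = f"site-{client_id}"  # assumed site naming
            job.to({"component": "script runner"}, site)
            if quantize_mode:
                job.to({"component": "quantizer", "mode": quantize_mode}, site)
                job.to({"component": "dequantizer"}, site)

        super().__init__(job)


def main() -> StubRecipe:
    # main() only instantiates the recipe; running it is delegated elsewhere.
    return SketchLLMHFRecipe(client_ids=["alpha", "beta"], num_rounds=3, train_mode="PEFT")
```

With this shape, `main()` shrinks to argument handling plus one constructor call, and the same recipe object can be reused by other entry points.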

@greptile-apps
Contributor

greptile-apps bot commented Dec 11, 2025

Greptile Overview

Greptile Summary

Refactored the LLM HuggingFace example from the imperative Job API to the Recipe pattern, wrapping the FedJob configuration logic in an LLMHFRecipe class that extends the base Recipe class.

Key changes:

  • Added LLMHFRecipe class that encapsulates all job configuration in __init__
  • Moved FedJob creation, controller setup, quantization filters, model persistor, and client runner configuration into the recipe constructor
  • Improved code organization by consolidating logic that was previously scattered in the main() function
  • Updated job names to include "_recipe" suffix (llm_hf_sft_recipe, llm_hf_peft_recipe)
  • Added type hints (List[str], Optional[str]) for better type safety
  • Enhanced error messages to mention case-insensitive support
  • Improved argument parser help text for clarity
  • Changed the default value of ports from the string "7777" to the list ["7777"], so the argument has the same type whether or not it is passed on the command line
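Two of the items above, the list default for ports and the case-insensitive error message, can be illustrated with a small argparse sketch. The flag spellings, the use of `nargs="+"`, the `--train_mode` flag, and the `normalize_mode` helper are assumptions for illustration and may not match the actual script.

```python
import argparse

VALID_MODES = ("sft", "peft")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of the example's CLI")
    parser.add_argument(
        "--client_ids",
        nargs="+",
        type=str,
        default=[],
        help="Client IDs, one per participating site",
    )
    # With nargs="+", a passed flag always yields a list. A list default keeps
    # the type the same when the flag is omitted; a plain "7777" default would
    # come back as a str and iterate character by character.
    parser.add_argument(
        "--ports",
        nargs="+",
        type=str,
        default=["7777"],
        help="Ports to use, one or more",
    )
    parser.add_argument("--train_mode", type=str, default="SFT", help="SFT or PEFT (case-insensitive)")
    return parser


def normalize_mode(train_mode: str) -> str:
    """Lower-case the mode and reject anything other than SFT or PEFT."""
    mode = train_mode.lower()
    if mode not in VALID_MODES:
        raise ValueError(
            f"Invalid train_mode {train_mode!r}: expected SFT or PEFT (case-insensitive)."
        )
    return mode


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.ports, normalize_mode(args.train_mode))
```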

Functional equivalence:

  • All existing functionality is preserved
  • Single-GPU, multi-GPU, and multi-node training modes continue to work as before
  • Quantization, WandB integration, and all other features remain intact
  • The refactored code follows the same pattern as other recently converted examples (sklearn-linear, sklearn-svm, sklearn-kmeans, random_forest)

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - it's a clean refactoring that follows established patterns
  • This refactoring follows the exact same pattern used in recent Recipe conversions (sklearn-linear, sklearn-svm, sklearn-kmeans, random_forest). All existing functionality is preserved, with no breaking changes to the API. The code quality is improved with better organization, type hints, and clearer help text. The changes are well-contained to a single file and maintain backward compatibility at the command-line level.
  • No files require special attention

Important Files Changed

File Analysis

Filename | Score | Overview
examples/advanced/llm_hf/job.py | 5/5 | Converted from imperative job API to Recipe class pattern, wrapping FedJob configuration in LLMHFRecipe.__init__, improving reusability and maintainability

Sequence Diagram

sequenceDiagram
    participant User as User/main()
    participant Recipe as LLMHFRecipe
    participant FedJob as FedJob
    participant Env as ExecEnv (SimEnv/ProdEnv)
    participant Run as Run

    User->>User: Parse arguments
    User->>User: Split GPUs & validate
    User->>Recipe: __init__(client_ids, num_rounds, etc.)
    activate Recipe
    Recipe->>FedJob: Create FedJob(name, min_clients)
    Recipe->>FedJob: to(FedAvg controller, "server")
    
    alt quantize_mode specified
        Recipe->>FedJob: to(ModelQuantizer, "server")
        Recipe->>FedJob: to(ModelDequantizer, "server")
    end
    
    Recipe->>FedJob: to(model_file, "server")
    Recipe->>FedJob: to(PTFileModelPersistor, "server")
    Recipe->>FedJob: to(IntimeModelSelector, "server")
    
    loop for each client
        Recipe->>FedJob: to(ScriptRunner, client_site)
        alt quantize_mode
            Recipe->>FedJob: to(quantizer/dequantizer, client_site)
        end
        Recipe->>FedJob: to(client_params, client_site)
    end
    
    Recipe->>Recipe: super().__init__(job)
    deactivate Recipe
    
    User->>FedJob: export_job(job_dir)
    
    alt startup_kit_location provided
        User->>Env: Create ProdEnv
    else simulation mode
        User->>Env: Create SimEnv
    end
    
    User->>Recipe: execute(env)
    Recipe->>Env: deploy(job)
    Env-->>Recipe: job_id
    Recipe->>Run: Create Run(env, job_id)
    Recipe-->>User: run
    
    User->>Run: get_status()
    Run-->>User: status
    User->>Run: get_result()
    Run-->>User: result
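
The second half of the diagram (environment selection, `execute`, then status and result) can be read as the following stdlib-only sketch. Every class here is a hypothetical stand-in, not the NVFLARE API: the stub environment pretends the job finishes immediately, and the job ID and result formats are invented for the example.

```python
from typing import Dict, Optional


class StubEnv:
    """Hypothetical stand-in for an execution environment (simulation or production)."""

    def __init__(self, kind: str) -> None:
        self.kind = kind
        self._status: Dict[str, str] = {}

    def deploy(self, job_name: str) -> str:
        job_id = f"{self.kind}-{job_name}-{len(self._status) + 1}"
        self._status[job_id] = "FINISHED"  # the stub pretends the job completes at once
        return job_id

    def status_of(self, job_id: str) -> str:
        return self._status[job_id]


class StubRun:
    """Handle returned to the caller: pairs an environment with a job ID."""

    def __init__(self, env: StubEnv, job_id: str) -> None:
        self.env = env
        self.job_id = job_id

    def get_status(self) -> str:
        return self.env.status_of(self.job_id)

    def get_result(self) -> str:
        return f"results/{self.job_id}"  # placeholder result location


class StubRecipe:
    def __init__(self, job_name: str) -> None:
        self.job_name = job_name

    def execute(self, env: StubEnv) -> StubRun:
        # Deploy the job, then wrap the returned ID in a run handle.
        job_id = env.deploy(self.job_name)
        return StubRun(env, job_id)


def choose_env(startup_kit_location: Optional[str]) -> StubEnv:
    # A startup kit location selects production; otherwise fall back to simulation.
    return StubEnv("prod") if startup_kit_location else StubEnv("sim")


if __name__ == "__main__":
    run = StubRecipe("llm_hf_sft_recipe").execute(choose_env(None))
    print(run.get_status(), run.get_result())
```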

Contributor

@greptile-apps greptile-apps bot left a comment

1 file reviewed, no comments

@holgerroth holgerroth marked this pull request as draft December 16, 2025 18:10