-
Notifications
You must be signed in to change notification settings - Fork 228
Convert JobAPI to Recipe for Kaplan-Meier example #3894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR converts the Kaplan-Meier homomorphic encryption example from the JobAPI to the Recipe API, enabling both simulation and production deployment modes. The key changes include:
- Replaced
km_job.pywith a new Recipe-basedjob.pythat supports both simulation and production environments - Added production deployment infrastructure including provisioning configuration (
project.yml) and convenience scripts (start_all.sh) - Updated controller initialization to explicitly pass empty
persistor_idparameter - Enhanced documentation with comprehensive instructions for both deployment modes
Reviewed changes
Copilot reviewed 7 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| examples/advanced/kaplan-meier-he/job.py | New Recipe-based job configuration replacing JobAPI implementation |
| examples/advanced/kaplan-meier-he/km_job.py | Removed old JobAPI implementation |
| examples/advanced/kaplan-meier-he/server.py | Updated controller initialization with explicit persistor_id |
| examples/advanced/kaplan-meier-he/server_he.py | Updated controller initialization with explicit persistor_id |
| examples/advanced/kaplan-meier-he/project.yml | New provisioning configuration for production deployment with CKKS HE scheme |
| examples/advanced/kaplan-meier-he/start_all.sh | Convenience script for starting all production components locally |
| examples/advanced/kaplan-meier-he/README.md | Comprehensive documentation update covering both simulation and production modes |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Greptile SummaryThis PR successfully converts the Kaplan-Meier example from the old JobAPI (
Critical Issue: The HE context path handling has a logic bug that will break production mode. The code uses a hardcoded simulation path ( Minor Issue: The Confidence Score: 2/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant User
participant JobPy as job.py (Recipe)
participant Server as Server (KM_HE)
participant Client1 as Client 1
participant Client2 as Client 2
Note over User,Client2: Production Mode Setup
User->>User: Run provisioning with project.yml
User->>Server: Start server with HE context
User->>Client1: Start client-1 with HE context
User->>Client2: Start client-2 with HE context
Note over User,Client2: Job Execution
User->>JobPy: python job.py --encryption --startup_kit_location ...
JobPy->>JobPy: Create KMRecipe with KM_HE controller
JobPy->>Server: Submit job via ProdEnv
Note over Server,Client2: Round 1 - Collect Max Index
Server->>Client1: Send empty message (round=1)
Server->>Client2: Send empty message (round=1)
Client1->>Client1: Calculate local max histogram index
Client2->>Client2: Calculate local max histogram index
Client1->>Server: Send max_idx
Client2->>Server: Send max_idx
Server->>Server: Aggregate max_idx (select global max)
Note over Server,Client2: Round 2 - Collect Encrypted Histograms
Server->>Client1: Send max_idx_global (round=2)
Server->>Client2: Send max_idx_global (round=2)
Client1->>Client1: Create histograms with global length
Client1->>Client1: Encrypt histograms with CKKS
Client2->>Client2: Create histograms with global length
Client2->>Client2: Encrypt histograms with CKKS
Client1->>Server: Send encrypted hist_obs, hist_cen
Client2->>Server: Send encrypted hist_obs, hist_cen
Server->>Server: Aggregate encrypted histograms (HE addition)
Note over Server,Client2: Round 3 - Distribute Global Results
Server->>Client1: Send encrypted global histograms (round=3)
Server->>Client2: Send encrypted global histograms (round=3)
Client1->>Client1: Decrypt global histograms
Client1->>Client1: Perform Kaplan-Meier analysis
Client1->>Client1: Save km_curve_fl_he.png
Client2->>Client2: Decrypt global histograms
Client2->>Client2: Perform Kaplan-Meier analysis
Client2->>Client2: Save km_curve_fl_he.png
Client1->>Server: Send completion acknowledgment
Client2->>Server: Send completion acknowledgment
Server->>JobPy: Job complete
JobPy->>User: Display job status
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (1)
-
examples/advanced/kaplan-meier-he/start_all.sh, line 71 (link)style:
stop_all.shscript doesn't existUpdate to mention using the admin console instead:
9 files reviewed, 1 comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (4)
-
examples/advanced/kaplan-meier-he/client_he.py, line 140-141 (link)logic: HE scheme mismatch - code uses BFV (
ts.bfv_vector) butproject.ymlprovisions CKKS scheme for production. Production mode will fail when it tries to use BFV functions with CKKS context. Either changeproject.ymlscheme to BFV or update client/server code to use scheme-agnostic operations. -
examples/advanced/kaplan-meier-he/server_he.py, line 95-97 (link)logic: HE scheme mismatch - code uses BFV (
ts.bfv_vector_from) butproject.ymlprovisions CKKS scheme for production. Production mode will fail at deserialization. Either changeproject.ymlscheme to BFV or update code to be scheme-agnostic. -
examples/advanced/kaplan-meier-he/project.yml, line 50 (link)logic: CKKS scheme incompatible with code -
client_he.pyandserver_he.pyuse BFV-specific functions (ts.bfv_vector,ts.bfv_vector_from) for integer histogram operations. Change to BFV scheme or update code. -
examples/advanced/kaplan-meier-he/job.py, line 143 (link)logic: string replacement will fail if
--workspace_dirdoesn't contain "/km/". usepathlibor explicit path construction instead.
9 files reviewed, 4 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (4)
-
examples/advanced/kaplan-meier-he/job.py, line 143 (link)logic: String replacement fails when
workspace_dirends with/km(no trailing slash). The default value/tmp/nvflare/workspaces/kmwon't match the pattern/km/. -
examples/advanced/kaplan-meier-he/project.yml, line 3 (link)syntax: Comment is incorrect - both simulation and production modes now use CKKS scheme. The code was converted from BFV to CKKS in
client_he.pyandserver_he.py. -
examples/advanced/kaplan-meier-he/README.md, line 76 (link)syntax: Documentation is outdated - the implementation now uses CKKS scheme, not BFV. Update to reflect the actual scheme change in
client_he.pyandserver_he.py. -
examples/advanced/kaplan-meier-he/README.md, line 85 (link)syntax: Documentation is outdated - the implementation now uses CKKS scheme, not BFV. Update to reflect the actual scheme change.
10 files reviewed, 4 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10 files reviewed, no comments
| print("Exporting job to", args.job_dir) | ||
| recipe.job.export_job(args.job_dir) | ||
|
|
||
| # Run recipe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too many prints cover the main flow of the code
| # In production mode, job runs asynchronously on the FL system | ||
| # Check status via admin console instead of waiting for result here | ||
| if args.startup_kit_location: | ||
| print("\nJob submitted successfully to the FL system!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the code suppose to be simple, this looks complicated now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move these no import statements to a separate utils functions
|
|
||
| # Run recipe | ||
| if args.startup_kit_location: | ||
| print("Running job in production mode...") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need to know this information
| env = ProdEnv(startup_kit_location=args.startup_kit_location, username=args.username) | ||
| else: | ||
| print("Running job in simulation mode...") | ||
| print("workspace_dir=", workspace_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we print out all these ?
|
/build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (2)
-
examples/advanced/kaplan-meier-he/job.py, line 39-51 (link)logic: HE context path won't work in production mode. The README (line 268) states the
he_context_pathis "automatically managed by startup kits" in production, but this implementation uses a hardcoded simulation path/tmp/nvflare/he_context/he_context_client.txtfor both modes. In production, the HE context files are at<startup_kit>/client_context.tenseal(for clients) and<startup_kit>/server_context.tenseal(for server), not at this simulation path. This will cause production mode to fail when trying to load HE context. How should the HE context paths be automatically determined in production mode? -
examples/advanced/kaplan-meier-he/job.py, line 66-68 (link)style: Missing
params_exchange_formatparameter. The oldkm_job.pyincludedparams_exchange_format="raw"in ScriptRunner configuration. While this may not cause runtime errors, omitting it could lead to unexpected behavior if the default format differs from what the client scripts expect.
10 files reviewed, 2 comments
Fixes # .
Description
Convert KM example from JobAPI to Recipe, also add production instructions with provisioned HE context
Types of changes
./runtest.sh.