
Conversation

AmeenP (Collaborator) commented Jan 6, 2026

Summary

Ensure a checkpoint is always saved at the end of training, regardless of the save_steps interval.

Problem

Short training runs (max_steps < save_steps) produce no checkpoints because the HuggingFace Trainer saves only at save_steps intervals. The cleanup job then finds no step_* folder and deletes the adapter record.

Solution

Add a _save_final_checkpoint() method (sketched after this list) that:

  1. Runs in the _inner_training_loop finally block (before orchestrator cleanup)
  2. Saves to the broadcasts/step_{final_step}/ directory
  3. Skips if a checkpoint already exists at that step
  4. Only runs on the main process

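A minimal sketch of what the method could look like, assuming RLTrainer subclasses the HuggingFace Trainer. The broadcasts/ root under args.output_dir and the use of save_model() are assumptions for illustration, not the exact implementation in this PR:

```python
import os

from transformers import Trainer


class RLTrainer(Trainer):
    def _save_final_checkpoint(self):
        """Write a checkpoint for the final step if save_steps never produced one."""
        # Only the main process writes checkpoints.
        if not self.is_world_process_zero():
            return
        final_step = self.state.global_step
        # Layout assumed by the cleanup job: broadcasts/step_{N}/ under output_dir.
        checkpoint_dir = os.path.join(self.args.output_dir, "broadcasts", f"step_{final_step}")
        # Skip if a checkpoint for this step already exists
        # (e.g. the run ended exactly on a save_steps boundary).
        if os.path.isdir(checkpoint_dir):
            return
        self.save_model(checkpoint_dir)
```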
Changes

  • Add import os for path operations
  • Add _save_final_checkpoint() method
  • Call it in the _inner_training_loop finally block (see the sketch after this list)

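One way to wire up that call, assuming it can wrap the stock training loop rather than being added inside a copied _inner_training_loop body (an assumption made here to keep the sketch short):

```python
class RLTrainer(Trainer):
    def _inner_training_loop(self, *args, **kwargs):
        # Run the stock loop, then always attempt the final save, even if
        # training raised or stopped before the next save_steps boundary.
        try:
            return super()._inner_training_loop(*args, **kwargs)
        finally:
            self._save_final_checkpoint()
```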
Related

Ensure a checkpoint is saved at the end of training, regardless of
save_steps interval. This fixes the issue where short runs
(max_steps < save_steps) would produce no adapters.

Changes:
- Add _save_final_checkpoint() method to RLTrainer
- Call it in _inner_training_loop finally block before cleanup
- Save to broadcasts/step_N/ to match cleanup script expectations
AmeenP closed this Jan 6, 2026