[DO NOT MERGE, WIP] feat: ray-less SFT #4
base: main
Conversation
maharajamihir left a comment:
Tested up to Qwen 32B; works.
I'm just wondering whether we should wait for upstream LoRA support (for the default codepath + RL) or not. For some reason they seem to be constantly reverting their LoRA implementation lol
I don't think they've found a 'proper' solution for LoRA-based RL with SGLang yet, as the current proposal still unloads and reloads the LoRA adapter from disk (surely we can do better than that).
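The disk round-trip is avoidable in principle: LoRA adapter weights are small relative to the base model, so they can stay resident in host memory and be swapped by reference instead of being re-read from disk. A minimal sketch of the idea (all names here are hypothetical, not the SGLang or upstream API):

```python
import copy

class AdapterCache:
    """Hypothetical in-memory cache: swap LoRA adapters without disk round-trips."""

    def __init__(self):
        self._adapters = {}  # adapter name -> {"A": ..., "B": ...} weight dict
        self.active = None

    def register(self, name, weights):
        # snapshot the adapter once; later activations are pointer swaps
        self._adapters[name] = copy.deepcopy(weights)

    def activate(self, name):
        # swap the active adapter by reference instead of
        # unload-from-GPU + reload-from-disk
        self.active = self._adapters[name]
        return self.active

cache = AdapterCache()
cache.register("sft", {"A": [[0.1]], "B": [[0.0]]})
cache.register("rl",  {"A": [[0.2]], "B": [[0.0]]})
assert cache.activate("rl")["A"][0][0] == 0.2
```

The real cost in an inference server is re-materializing the adapter on device and re-priming any fused kernels, but none of that requires touching the filesystem once the weights have been loaded once.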
The question is whether we should:
* feat: support lora in ray-less SFT codepath
* chore: add assert
* chore: fail fast if no lora-compatible modules found
* fix: set default lora dropout to 0
* fix: skip dataloader checkpoint loading if non-existent
* chore: change lora defaults
* feat: checkpoint in run script
* feat: separate load and save paths
* fix: only store trainable params in optimizer
* fix: only load adapter weights on lora restore
* feat: support lora in the ray-full SFT codepath (+assert that grad check. is off)
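Two of the fixes above ("only store trainable params in optimizer" and "only load adapter weights on lora restore") come down to the same filter: partition parameters by whether they require gradients, and let only the adapter side reach the optimizer state and the checkpoint. A toy sketch with hypothetical names (not the actual codepath):

```python
class Param:
    """Stand-in for a tensor with a requires_grad flag."""
    def __init__(self, value, requires_grad):
        self.value = value
        self.requires_grad = requires_grad

def lora_module():
    # frozen base weight plus trainable low-rank adapter matrices
    return {
        "base.weight": Param([[1.0, 0.0], [0.0, 1.0]], requires_grad=False),
        "lora.A":      Param([[0.0, 0.0]],             requires_grad=True),
        "lora.B":      Param([[0.0]],                  requires_grad=True),
    }

def optimizer_params(named_params):
    # only trainable (adapter) parameters get optimizer state; the same
    # filter defines what a LoRA checkpoint saves and what a restore loads,
    # shrinking both optimizer memory and checkpoint size
    return {n: p for n, p in named_params.items() if p.requires_grad}

params = optimizer_params(lora_module())
assert set(params) == {"lora.A", "lora.B"}
```

With Adam-style optimizers this matters doubly, since each trainable parameter carries two extra state tensors; excluding the frozen base weights keeps that overhead proportional to the adapter, not the full model.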
Force-pushed 74f4820 to e15e2a3.
Merging now that we have a working workflow to keep in sync with upstream.
Actually, let's first merge #10 into this branch and then merge the entire thing into main. That way, we frontload most of the conflict resolution that would otherwise happen when LoRA is finally merged into upstream main.