fix: on GB200 use single-thread checkpoint save to avoid CPU OOM #1703

Merged
terrykong merged 4 commits into NVIDIA-NeMo:main from guyueh1:fix_gb200_dpsk_ckpt on Jan 5, 2026

Commits

Commits on Dec 28, 2025

Commits on Jan 3, 2026

Commits on Jan 5, 2026