@hyeongjun-jeon hyeongjun-jeon commented Oct 23, 2025

PR description

Similar to pipeline_assign, torch.moreh.checkpoint_assign() can be used to specify the point at which gradient checkpointing is applied. This PR applies it to the GPT2 and Mistral code.

See the Moreh framework commit https://github.com/moreh-dev/framework/commit/744f3476de06509ddcd7382928b971134c00d9d2#diff-fd7e20039cc03d4c907ed4ec098d41a9e74fe6db7423e80aca8cd54888a3a8fa for reference.

Related Jira issue: https://moreh.atlassian.net/browse/MAF-18899
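For readers unfamiliar with the idea of choosing where gradient checkpointing happens, here is a minimal sketch using stock PyTorch. The signature of torch.moreh.checkpoint_assign() is not shown in this PR, so the sketch does not call it; it uses `torch.utils.checkpoint.checkpoint` as a stand-in. `Block`, `TinyModel`, and the `checkpoint_layers` parameter are hypothetical names made up for illustration and are not taken from the GPT2 or Mistral changes.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    """Hypothetical residual feed-forward block standing in for a transformer layer."""

    def __init__(self, dim):
        super().__init__()
        self.ff = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, x):
        return x + self.ff(x)


class TinyModel(nn.Module):
    """Hypothetical model that checkpoints only the layers listed in checkpoint_layers."""

    def __init__(self, dim=16, num_layers=4, checkpoint_layers=(1, 3)):
        super().__init__()
        self.layers = nn.ModuleList([Block(dim) for _ in range(num_layers)])
        # Assumption: the checkpoint placement is given as a set of layer indices.
        self.checkpoint_layers = set(checkpoint_layers)

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            if self.training and i in self.checkpoint_layers:
                # Activations inside this layer are not stored; they are
                # recomputed during the backward pass to save memory.
                x = checkpoint(layer, x, use_reentrant=False)
            else:
                x = layer(x)
        return x


if __name__ == "__main__":
    model = TinyModel(dim=16, num_layers=4, checkpoint_layers=(1, 3))
    model.train()
    inputs = torch.randn(2, 8, 16, requires_grad=True)
    model(inputs).sum().backward()
    print("input grad shape:", tuple(inputs.grad.shape))
```

Checkpointing trades extra forward computation for lower activation memory, so which layers are marked affects both peak memory and step time; gradients should match the non-checkpointed run.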

@hyeongjun-jeon hyeongjun-jeon merged commit aca147c into v4.42.4-moreh Oct 23, 2025
3 checks passed
@hyeongjun-jeon hyeongjun-jeon deleted the MAF-18899-gradient-checkpointing branch October 23, 2025 06:32
4 participants