feat(deploy): add DeepSeek R1 MI300 data-parallel inference preset #12
Conversation
Only InferenceServiceTemplate resources will go into moai-inference-preset.
Pull request overview
This PR adds DeepSeek R1 inference presets for MI300X GPUs with data-parallel inference support, along with a collection of base vLLM data-parallel templates for the Moreh inference framework.
- Introduces DeepSeek R1 model configuration for both prefill and decode workers with MI300X-specific optimizations
- Adds reusable base templates for vLLM data-parallel deployments including core functionality, metadata labeling, offline HuggingFace hub support, and decode proxy configuration
- Organizes templates in a subdirectory structure (`base/` and `deepseek-r1/`) for better maintainability (see the launch sketch after this list)
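
As a rough illustration of what these presets wire together, here is a minimal, hypothetical sketch of a vLLM data-parallel worker container. The image reference, flag values, and resource names are assumptions inferred from the `dp8ep`/MI300X naming, not taken from the PR's actual templates:

```yaml
# Minimal sketch of a vLLM data-parallel worker container (assumed,
# not the PR's actual template). "dp8ep" is read here as data-parallel
# size 8 with expert parallelism enabled.
containers:
  - name: vllm-worker
    image: vllm/vllm-openai:latest        # hypothetical image reference
    command: ["vllm", "serve", "deepseek-ai/DeepSeek-R1"]
    args:
      - --data-parallel-size=8            # the "dp8" in the preset name
      - --enable-expert-parallel          # the "ep" in the preset name
    env:
      - name: HF_HUB_OFFLINE              # offline HuggingFace Hub mode
        value: "1"
    resources:
      limits:
        amd.com/gpu: 8                    # MI300X GPUs via the AMD device plugin
```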
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| deploy/helm/moai-inference-preset/templates/deepseek-r1/vllm-deepseek-r1-prefill-mi300x-dp8ep.helm.yaml | DeepSeek R1 prefill worker configuration with MI300X-specific environment variables and vLLM settings |
| deploy/helm/moai-inference-preset/templates/deepseek-r1/vllm-deepseek-r1-decode-mi300x-dp8ep.helm.yaml | DeepSeek R1 decode worker configuration with MI300X-specific environment variables and vLLM settings |
| deploy/helm/moai-inference-preset/templates/base/vllm-dp.helm.yaml | Base vLLM data-parallel template with complete pod specification, including container setup, health checks, and volume configuration |
| deploy/helm/moai-inference-preset/templates/base/vllm-dp-prefill-meta.helm.yaml | Metadata template for labeling prefill workers |
| deploy/helm/moai-inference-preset/templates/base/vllm-dp-hf-hub-offline.helm.yaml | Configuration for offline HuggingFace Hub usage with persistent volume mount |
| deploy/helm/moai-inference-preset/templates/base/vllm-dp-decode-proxy.helm.yaml | Decode worker proxy configuration with initialization container and port offset settings |
| deploy/helm/moai-inference-preset/templates/base/vllm-dp-decode-meta.helm.yaml | Metadata template for labeling decode workers |
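
For a sense of how the small base templates might compose, the following is a hypothetical sketch of a named template in the style of vllm-dp-prefill-meta.helm.yaml. The label keys are invented for illustration and do not come from this PR:

```yaml
{{/* Hypothetical sketch of a labeling helper in the style of
     vllm-dp-prefill-meta.helm.yaml; the label keys are assumptions. */}}
{{- define "moai-inference-preset.vllm-dp-prefill-meta" -}}
metadata:
  labels:
    app.kubernetes.io/component: prefill   # marks the worker role
    moreh.io/parallelism: data-parallel    # assumed label for DP workers
{{- end -}}
```

A model-specific preset could then pull such a helper in with `{{ include "moai-inference-preset.vllm-dp-prefill-meta" . }}`, keeping the role labeling in one place across all prefill workers.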
…eepseek-r1-prefill-mi300x-dp8ep.helm.yaml
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
For now, I've created folders inside the preset and put the files there; if any structural changes are needed, please leave a comment.