Conversation

@bongwoobak
Contributor

For now, I've created a folder inside the preset and put the files there. If the structure needs to change, please leave a comment.

@hhk7734
Member

hhk7734 commented Jan 5, 2026

Only InferenceServiceTemplate resources will go into moai-inference-preset.
Please change it to the following structure:

base/
  - workertemplate...
deepseek-r1/
  - deepseek-r1...

@hhk7734
Member

hhk7734 commented Jan 5, 2026

  • end-to-end (co-located, prefill + decode)
    • deepseek-r1-mi300x-tp8
    • deepseek-r1-mi300x-dp8ep
  • prefill only
    • deepseek-r1-prefill-mi300x-tp8
  • decode only
    • deepseek-r1-decode-mi308x-tp8
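
The suffixes in the preset names above plausibly map to vLLM parallelism flags; a hypothetical sketch of that mapping (the model ID and the exact flags used by the templates are assumptions, not taken from this PR):

```shell
# deepseek-r1-mi300x-tp8: tensor parallelism across 8 GPUs (assumed)
vllm serve deepseek-ai/DeepSeek-R1 --tensor-parallel-size 8

# deepseek-r1-mi300x-dp8ep: data parallelism of 8 with expert
# parallelism for the MoE layers (assumed)
vllm serve deepseek-ai/DeepSeek-R1 --data-parallel-size 8 --enable-expert-parallel
```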

hhk7734
hhk7734 previously approved these changes Jan 5, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR adds DeepSeek R1 inference presets for MI300X GPUs with data-parallel inference support, along with a collection of base vLLM data-parallel templates for the Moreh inference framework.

  • Introduces DeepSeek R1 model configuration for both prefill and decode workers with MI300X-specific optimizations
  • Adds reusable base templates for vLLM data-parallel deployments including core functionality, metadata labeling, offline HuggingFace hub support, and decode proxy configuration
  • Organizes templates in a subdirectory structure (base/ and deepseek-r1/) for better maintainability
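
The base/ plus model-specific layout described above matches the common Helm pattern of defining reusable named templates and including them from per-model files; a minimal sketch of that pattern (the template name and file contents here are illustrative assumptions, not taken from the actual chart):

```yaml
# templates/base/vllm-dp.helm.yaml (hypothetical): reusable named template
{{- define "moai-inference-preset.vllm-dp" -}}
# ... shared pod spec: container setup, health checks, volumes ...
{{- end -}}

# templates/deepseek-r1/vllm-deepseek-r1-prefill-mi300x-dp8ep.helm.yaml
# (hypothetical consumer): renders the shared base with this chart's values
{{ include "moai-inference-preset.vllm-dp" . }}
```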

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

Summary per file:
  • deploy/helm/moai-inference-preset/templates/deepseek-r1/vllm-deepseek-r1-prefill-mi300x-dp8ep.helm.yaml: DeepSeek R1 prefill worker configuration with MI300X-specific environment variables and vLLM settings
  • deploy/helm/moai-inference-preset/templates/deepseek-r1/vllm-deepseek-r1-decode-mi300x-dp8ep.helm.yaml: DeepSeek R1 decode worker configuration with MI300X-specific environment variables and vLLM settings
  • deploy/helm/moai-inference-preset/templates/base/vllm-dp.helm.yaml: Base vLLM data-parallel template with a complete pod specification, including container setup, health checks, and volume configuration
  • deploy/helm/moai-inference-preset/templates/base/vllm-dp-prefill-meta.helm.yaml: Metadata template for labeling prefill workers
  • deploy/helm/moai-inference-preset/templates/base/vllm-dp-hf-hub-offline.helm.yaml: Configuration for offline HuggingFace Hub usage with a persistent volume mount
  • deploy/helm/moai-inference-preset/templates/base/vllm-dp-decode-proxy.helm.yaml: Decode worker proxy configuration with an initialization container and port offset settings
  • deploy/helm/moai-inference-preset/templates/base/vllm-dp-decode-meta.helm.yaml: Metadata template for labeling decode workers

bongwoobak and others added 4 commits January 5, 2026 21:37
…eepseek-r1-prefill-mi300x-dp8ep.helm.yaml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@hhk7734 hhk7734 merged commit adda148 into main Jan 5, 2026
3 checks passed
@hhk7734 hhk7734 deleted the deepseek-dp branch January 5, 2026 12:46