Mamba Hybrid Models #520

clairesonglee · 2026-01-29T00:43:53Z

No description provided.

primus/backends/megatron/core/models/hybrid/hybrid_block.py

+
+        # Ensure that the tensor passed between pipeline parallel stages is
+        # viewless. See related notes in TransformerBlock and TransformerLayer
+        output = make_viewless_tensor(


primus/backends/megatron/core/models/hybrid/hybrid_block.py

+from megatron.core.inference.contexts import BaseInferenceContext
+from megatron.core.process_groups_config import ProcessGroupCollection
+from megatron.core.ssm.mamba_hybrid_layer_allocation import Symbols as LayerSymbols
+from megatron.core.ssm.mamba_hybrid_layer_allocation import allocate_layers


primus/backends/megatron/core/models/hybrid/hybrid_mamba_mla_layer_specs.py

+from megatron.core.transformer.identity_op import IdentityOp
+from megatron.core.fusions.fused_bias_dropout import get_bias_dropout_add
+from megatron.core.models.gpt.moe_module_specs import get_moe_module_spec
+from megatron.core.ssm.mamba_block import MambaStack, MambaStackSubmodules


clairesonglee and others added 7 commits January 28, 2026 16:33

initial commit

80e2d26

set self.lr_warmup_steps < self.lr_decay_steps

d23d79f

unwrap model to remove loss_mask parameter

3381850

add zebra-llama (hybrid mla mamba model) support

277f3e1

add Zebra-Llama 3B configurations

11b22c6

add Zebra-Llama 1B configs and remove unused configs

1dec95e

remove unused configs

34508db

github-code-quality bot found potential problems Jan 29, 2026

Set submodule mamba to track enable-primus-hybrid-models branch

2f3ab49

clairesonglee force-pushed the clairlee/dev/hybrid branch from 4b8a061 to 2f3ab49 Compare January 30, 2026 21:07

3 participants
