Description
Hello,
I have two questions:
- regarding the kv-lora-rank parameter: I do not understand why the default value does NOT work for qwen-3b but works for qwen-7b
- running converted models: I have been trying to run your conversion scripts, but once a model is converted I cannot run it (neither with transformers nor with vLLM).
```shell
model_path=Qwen/Qwen2.5-3B-Instruct
save_path=outputs/qwen2_5-3B-Instruct-MLA
eval_batch_size=8

python transmla/converter.py \
    --model-path $model_path \
    --save-path $save_path \
    --freqfold 4 \
    --ppl-eval-batch-size $eval_batch_size \
    --kv-lora-rank 192
```
As you can see, I modified the provided qwen2.5-7B-Instruct.sh command to add `--kv-lora-rank 192` (otherwise there is an error):

```
assert self.kv_lora_rank <= 2 * self.latent_dim - self.qk_mqa_dim, f"kv_lora_rank ({self.kv_lora_rank}) must be less than 2 * latent_dim ({self.latent_dim}) - qk_mqa_dim ({self.qk_mqa_dim})"
```

This assertion is also something I do not understand, since for qwen2.5-7B-Instruct.sh it works just fine (Q1).
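If it helps to see the arithmetic: assuming `latent_dim` is derived as `num_key_value_heads * head_dim` and `qk_mqa_dim` is 64 (I have not verified either against the converter source, so treat this as a guess), the bound is much tighter for the 3B model, which has 2 KV heads versus the 7B's 4:

```python
# Hypothesized reason the assert passes for 7B but fails for 3B.
# Assumptions (NOT verified against TransMLA): latent_dim comes from
# num_key_value_heads * head_dim, qk_mqa_dim defaults to 64, and the
# default kv_lora_rank is larger than 448.

def max_kv_lora_rank(num_key_value_heads: int, head_dim: int,
                     qk_mqa_dim: int = 64) -> int:
    """Upper bound implied by: kv_lora_rank <= 2 * latent_dim - qk_mqa_dim."""
    latent_dim = num_key_value_heads * head_dim
    return 2 * latent_dim - qk_mqa_dim

# Qwen2.5-7B-Instruct: 4 KV heads, head_dim 128
print(max_kv_lora_rank(4, 128))  # 960 -> a default such as 512 fits
# Qwen2.5-3B-Instruct: 2 KV heads, head_dim 128
print(max_kv_lora_rank(2, 128))  # 448 -> the same default trips the assert
```

Under these assumptions, any `--kv-lora-rank` of at most 448 (such as the 192 used above) should satisfy the assertion for the 3B model.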
When I set kv-lora-rank to 192 for qwen-3b, the conversion succeeds, BUT I cannot run the converted model. Here is the error I get (Q2):
```
INFO 09-07 15:08:31 [loader.py:458] Loading weights took 0.93 seconds
ERROR 09-07 15:08:31 [core.py:387] EngineCore hit an exception: Traceback (most recent call last):
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 378, in run_engine_core
ERROR 09-07 15:08:31 [core.py:387]     engine_core = EngineCoreProc(*args, **kwargs)
ERROR 09-07 15:08:31 [core.py:387]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 320, in __init__
ERROR 09-07 15:08:31 [core.py:387]     super().__init__(vllm_config, executor_class, log_stats)
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 67, in __init__
ERROR 09-07 15:08:31 [core.py:387]     self.model_executor = executor_class(vllm_config)
ERROR 09-07 15:08:31 [core.py:387]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/executor/executor_base.py", line 52, in __init__
ERROR 09-07 15:08:31 [core.py:387]     self._init_executor()
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 47, in _init_executor
ERROR 09-07 15:08:31 [core.py:387]     self.collective_rpc("load_model")
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
ERROR 09-07 15:08:31 [core.py:387]     answer = run_method(self.driver_worker, method, args, kwargs)
ERROR 09-07 15:08:31 [core.py:387]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/utils.py", line 2378, in run_method
ERROR 09-07 15:08:31 [core.py:387]     return func(*args, **kwargs)
ERROR 09-07 15:08:31 [core.py:387]            ^^^^^^^^^^^^^^^^^^^^^
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 136, in load_model
ERROR 09-07 15:08:31 [core.py:387]     self.model_runner.load_model()
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1279, in load_model
ERROR 09-07 15:08:31 [core.py:387]     self.model = get_model(vllm_config=self.vllm_config)
ERROR 09-07 15:08:31 [core.py:387]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
ERROR 09-07 15:08:31 [core.py:387]     return loader.load_model(vllm_config=vllm_config)
ERROR 09-07 15:08:31 [core.py:387]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 09-07 15:08:31 [core.py:387]   File "/opt/python3.12/lib/python3.12/site-packages/vllm/model_executor/model_loader/loader.py", line 467, in load_model
ERROR 09-07 15:08:31 [core.py:387]     raise ValueError(
ERROR 09-07 15:08:31 [core.py:387] ValueError: Following weights were not initialized from checkpoint: {'lm_head.weight'}
ERROR 09-07 15:08:31 [core.py:387]
CRITICAL 09-07 15:08:31 [core_client.py:359] Got fatal signal from worker processes, shutting down. See stack trace above for root cause issue.
Killed
```
If I add the `--deepseek-style` flag, then none of the weights match the architecture.
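For what it's worth, one possible cause of the missing `lm_head.weight`: Qwen2.5-3B-Instruct ships with `tie_word_embeddings: true` (the 7B uses `false`), so its checkpoint has no standalone `lm_head` tensor. If the converter's output config loses that flag, vLLM will look for a tensor that was never saved. A minimal sketch to check the converted model's config (this is my speculation, not a confirmed diagnosis; `save_path` is the output directory from the command above):

```python
# Sketch: check whether the converted config still ties lm_head to
# embed_tokens. If the original model tied them but the converted config
# says tie_word_embeddings=false, a loader will expect a separate
# lm_head.weight tensor that is absent from the safetensors shards.
import json
from pathlib import Path

def lm_head_expected_in_checkpoint(config: dict) -> bool:
    """True if a loader should expect a standalone lm_head.weight tensor."""
    return not config.get("tie_word_embeddings", False)

def check_converted_model(save_path: str) -> None:
    config = json.loads((Path(save_path) / "config.json").read_text())
    if lm_head_expected_in_checkpoint(config):
        print("config expects a separate lm_head.weight; verify that the "
              "tensor actually exists in the saved shards")
    else:
        print("lm_head is tied to embed_tokens; no separate tensor needed")
```

If the converted config does expect a separate `lm_head.weight`, either restoring `tie_word_embeddings: true` or saving an explicit `lm_head` tensor copied from `embed_tokens` might work, but I have not confirmed which the converter intends.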
System parameters:

Requirements:
```
vllm==0.8.4
transformers==4.52.4
datasets
accelerate==1.3.0
datatrove
tensorboardX
```

GPU: NVIDIA H100 80GB HBM3