Environment info
- transformers version: 4.49.0
- Platform: Windows
- Python version: 3.12.0
- PyTorch version (GPU): 2.6.0+cu118
Library:
- text generation: @patrickvonplaten
Information
Model I am using: Mistral-7B-Instruct-v0.2 (https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
Python environment:
```
numpy==2.2.3
onnx==1.17.0
onnxruntime==1.21.0
optimum==1.24.0
tokenizers==0.21.1
torch==2.6.0+cu118        # a CUDA build is required when exporting to fp16 ONNX
torchaudio==2.6.0+cu118
torchvision==0.21.0+cu118
transformers==4.49.0      # 4.49.0 breaks the ONNX export of the LLM; 4.48.3 has no issue
```
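A quick way to confirm the installed versions match the pins above and that CUDA is visible for the fp16 export; a minimal sketch using only the standard library plus torch:

```python
import torch
from importlib.metadata import version

# Print the versions that matter for this report, plus CUDA visibility,
# since the fp16 export path requires a CUDA-enabled torch build.
for pkg in ("transformers", "optimum", "onnx", "onnxruntime", "tokenizers"):
    print(f"{pkg}=={version(pkg)}")
print(f"torch=={torch.__version__} | CUDA available: {torch.cuda.is_available()}")
```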
To reproduce
Steps to reproduce the behavior:
- Run `optimum-cli export onnx --model mistralai/Mistral-7B-Instruct-v0.2 --no-dynamic-axes --batch_size 1 --task text-generation-with-past --dtype=fp16 --device cuda -- onnx_models\mistralai_Mistral-7B-Instruct-v0.2`
- You will see the error message below.
- After manually adding onnx::Gather_67 with the snippet below, inserted around https://github.com/huggingface/optimum/blob/main/optimum/exporters/onnx/convert.py#L350, another issue appears:

```python
# Patched into optimum/exporters/onnx/convert.py (around the linked line),
# where `onnx_inputs` is the input dict being assembled for the exported model:
gather_tensor = torch.tensor(0, dtype=torch.int64)
onnx_inputs['onnx::Gather_67'] = gather_tensor.cpu().numpy()
```
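For reference, the stray input can be seen by listing the exported graph's declared inputs with onnxruntime; a minimal sketch, assuming the decoder was written to model.onnx inside the output directory used above:

```python
import onnxruntime as ort

# List the inputs the exported graph actually declares; with the 4.49.0 export,
# the spurious onnx::Gather_67 should show up here alongside the usual
# input_ids / attention_mask / past key-value inputs.
sess = ort.InferenceSession(
    "onnx_models/mistralai_Mistral-7B-Instruct-v0.2/model.onnx",
    providers=["CPUExecutionProvider"],
)
for inp in sess.get_inputs():
    print(inp.name, inp.shape, inp.type)
```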
Expected behavior
If transformers is downgraded to version 4.48.3, there are no issues; the export should also succeed with 4.49.0.
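For anyone trying to bisect this, the export can also be driven from Python, which makes it easy to rerun under both transformers versions. This is a minimal sketch assuming optimum 1.24's main_export mirrors the CLI flags used above; parameter names may differ slightly between optimum releases:

```python
from optimum.exporters.onnx import main_export

# Programmatic equivalent of the optimum-cli invocation above (assumed mapping
# of CLI flags to keyword arguments; check your optimum version's signature).
main_export(
    "mistralai/Mistral-7B-Instruct-v0.2",
    output="onnx_models/mistralai_Mistral-7B-Instruct-v0.2",
    task="text-generation-with-past",
    device="cuda",          # the fp16 export requires CUDA
    fp16=True,              # --dtype=fp16 on the CLI
    no_dynamic_axes=True,   # --no-dynamic-axes
    batch_size=1,           # forwarded as an input-shape kwarg (--batch_size 1)
)
```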