Description
The official gte-large-en-v1.5-bf16 example is completely broken and relies on tools that are not pre-installed in the ryzen-ai-1.6.1 conda environment. torch is not pre-installed, so running download_model.py fails immediately.
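For reference, a minimal sketch of the manual workaround that gets download_model.py past the missing dependency (the env name `ryzenai` is taken from the traceback below; only torch appears to be missing, since the transformers tokenizer utilities load fine):

```
conda activate ryzenai
pip install torch
python download_model.py
```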
But even if the model download and export succeed (which requires manually fixing the script), the next step, running the model with run.py, does not work at all and fails with a hard error:

```
>python run.py --model_path models/gte-large-en-v1.5.onnx
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Traceback (most recent call last):
  File "C:\Users\[REDACTED]\RyzenAI-SW-main\example\gte-large-en-v1.5-bf16\run.py", line 89, in <module>
    main(args)
  File "C:\Users\[REDACTED]\RyzenAI-SW-main\example\gte-large-en-v1.5-bf16\run.py", line 13, in main
    npu_session = ort.InferenceSession(
                  ^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\envs\ryzenai\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 485, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "C:\ProgramData\miniconda3\envs\ryzenai\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 573, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from models/gte-large-en-v1.5.onnx failed:This is an invalid model. In Node, ("node_Split_18", Split, "", -1) : ("linear": tensor(float),) -> ("split_split_0": tensor(float),"split_split_1": tensor(float),"split_split_2": tensor(float),) , Error Unrecognized attribute: num_outputs for operator Split
```
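The root cause appears to be an opset mismatch: Split only gained the num_outputs attribute in opset 18, so a graph that declares opset 17 but contains it is invalid. A quick way to inspect what the export actually produced (a sketch using the onnx package, not part of the example):

```python
import onnx

# Load the exported model and print its declared opsets.
model = onnx.load("models/gte-large-en-v1.5.onnx")
for imp in model.opset_import:
    print(imp.domain or "ai.onnx", imp.version)

# Full validation will likely reproduce the same complaint as ORT,
# flagging Split's opset-18 num_outputs attribute.
onnx.checker.check_model(model)
```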
The Vitis execution provider explicitly can only run ONNX opset 17 models exported with ancient Torch versions (around 2.2.0, which are, in fact, not even installable via Conda anymore). This is highly concerning: Torch's ONNX export was really bad before the Dynamo exporter was introduced, and the Vitis platform seems hard-stuck there. That means even new and upcoming Ryzen NPU platforms will not scale beyond the generation of models around Qwen2, severely limiting the usefulness of the Ryzen AI platform. Modern versions of Torch cannot produce opset 17 ONNX models at all, as opset 18 is the minimum bar, and Dynamo is the go-to method that produces meaningful graphs, unlike legacy exporting with its constant folding.
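For completeness, this is roughly the legacy export call the example depends on (a sketch, not the actual download_model.py; the Hugging Face model id and I/O names are assumptions). It only yields an opset 17 graph that the Vitis EP can load when run under an old Torch build such as ~2.2:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed model id for gte-large-en-v1.5 on Hugging Face.
name = "Alibaba-NLP/gte-large-en-v1.5"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name, trust_remote_code=True).eval()

enc = tokenizer("sample sentence", return_tensors="pt")

# Legacy (TorchScript-based) exporter; on modern Torch the Dynamo
# exporter is the default path and opset 17 is no longer reachable.
torch.onnx.export(
    model,
    (enc["input_ids"], enc["attention_mask"]),
    "models/gte-large-en-v1.5.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["last_hidden_state"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "seq"},
        "attention_mask": {0: "batch", 1: "seq"},
    },
    opset_version=17,  # the maximum the Vitis EP reportedly accepts
)
```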