gte-large-en-v1.5-bf16 example is very aged #325

@0x484558

Description

The official gte-large-en-v1.5-bf16 example is completely broken and relies on tools that are not pre-installed in the ryzen-ai-1.6.1 conda environment. torch is not pre-installed, so running download_model.py fails. Even if the model download and export succeed (which requires manually fixing the script), the next step, running the model with run.py, simply does not work and fails with a hard error:

>python run.py --model_path models/gte-large-en-v1.5.onnx
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Traceback (most recent call last):
  File "C:\Users\[REDACTED]\RyzenAI-SW-main\example\gte-large-en-v1.5-bf16\run.py", line 89, in <module>
    main(args)
  File "C:\Users\[REDACTED]\RyzenAI-SW-main\example\gte-large-en-v1.5-bf16\run.py", line 13, in main
    npu_session = ort.InferenceSession(
                  ^^^^^^^^^^^^^^^^^^^^^
  File "C:\ProgramData\miniconda3\envs\ryzenai\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 485, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "C:\ProgramData\miniconda3\envs\ryzenai\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 573, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from models/gte-large-en-v1.5.onnx failed:This is an invalid model. In Node, ("node_Split_18", Split, "", -1) : ("linear": tensor(float),) -> ("split_split_0": tensor(float),"split_split_1": tensor(float),"split_split_2": tensor(float),) , Error Unrecognized attribute: num_outputs for operator Split
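The root cause of this particular error is the Split schema change: opset 18 added the num_outputs attribute, so a graph exported at opset >= 18 is rejected by anything that validates against opset 17. As an unverified workaround sketch (the output filename here is hypothetical, and the version converter may lack a downgrade adapter for some ops, in which case re-exporting with an old Torch is the only option), one could try downgrading the exported model:

```python
# Unverified workaround: downgrade the exported graph to opset 17 so that
# Split no longer carries the opset-18 'num_outputs' attribute.
import onnx
from onnx import version_converter

model = onnx.load("models/gte-large-en-v1.5.onnx")
downgraded = version_converter.convert_version(model, 17)  # may raise if an adapter is missing
onnx.checker.check_model(downgraded)
onnx.save(downgraded, "models/gte-large-en-v1.5-opset17.onnx")
```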

The Vitis execution provider explicitly can only run ONNX opset 17 models exported with an ancient Torch (in fact, versions around 2.2.0 that are not even installable via conda anymore). This is highly concerning: Torch's ONNX export was really bad before the Dynamo exporter was introduced, and the Vitis platform seems to be hard-stuck there, which means even new and upcoming Ryzen NPU platforms will not scale beyond the generation of models around Qwen2, severely limiting the usefulness of the Ryzen AI platform. Modern versions of Torch cannot produce opset 17 ONNX at all, as 18 is the minimum, and Dynamo is the go-to method that produces meaningful graphs, unlike the legacy exporter with its constant folding.
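To make the constraint concrete, here is a minimal sketch with a toy module (not the gte checkpoint) showing the two export paths; it assumes a Torch version that still ships the legacy TorchScript exporter alongside the Dynamo one:

```python
import torch

class Toy(torch.nn.Module):
    def forward(self, x):
        # torch.chunk lowers to the ONNX Split op from the error above
        a, b, c = torch.chunk(x, 3, dim=-1)
        return a + b + c

model = Toy().eval()
x = torch.randn(1, 12)

# Legacy TorchScript exporter: opset 17 is accepted, which is what the
# Vitis AI EP requires, but this path is deprecated in recent Torch.
torch.onnx.export(model, (x,), "toy_opset17.onnx", opset_version=17)

# Dynamo exporter (torch.onnx.export(..., dynamo=True) on Torch >= 2.5):
# minimum opset is 18, where Split gained the rejected 'num_outputs' attribute.
torch.onnx.export(model, (x,), "toy_dynamo.onnx", dynamo=True)
```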
