Vitis EP incorrectly assigns nodes to devices #322

@0x484558

Hi, my company bought a Ryzen AI 7 350 laptop, and an AI engineer is trying to figure out how Ryzen AI is supposed to be used and how to bring our own fine-tuned model to run on the NPU. We are startled by the state of the software stack, and the fact that it is closed source rules out any ability to contribute a fix for it. The official gte-large-en-v1.5-bf16 example revealed severe defects in the Ryzen AI software.

We discovered that bringing custom models to run with Vitis is not trivial, and that there is a large number of corner cases which the documentation avoids mentioning in order to remain presentable, which is misleading. While trying to bring our own model, we attempted numerous surgeries on it and ensured that all shapes are fixed and the batch size is 1 (#309). But the tooling severely lacks debugging provisions that would tell why a particular operation fails (I assume this is related to #318). Nodes were not assigned to the NPU at all, which turned out to be an issue of its own (#324), and for two days we hoped this had to be some mistake.
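For anyone repeating the "all shapes are fixed" check, here is a minimal sketch of how it can be done. It assumes the `onnx` package is installed; the helper names `find_dynamic_dims` and `read_input_shapes` are made up for illustration and are not part of the Ryzen AI tooling.

```python
from typing import List, Optional, Sequence, Tuple, Union

Dim = Union[int, str, None]


def find_dynamic_dims(
    inputs: Sequence[Tuple[str, Sequence[Dim]]],
) -> List[Tuple[str, int, Dim]]:
    """Return (input_name, axis, dim) for every dim that is not a positive int.

    A symbolic name such as "batch", a missing dim (None), or 0 counts as
    not fixed.
    """
    problems = []
    for name, dims in inputs:
        for axis, dim in enumerate(dims):
            if not (isinstance(dim, int) and dim > 0):
                problems.append((name, axis, dim))
    return problems


def read_input_shapes(model_path: str) -> List[Tuple[str, List[Dim]]]:
    """Read graph input shapes from an ONNX file (requires the onnx package)."""
    import onnx  # imported lazily so the check above works without it

    model = onnx.load(model_path)
    # Older exporters list weights among the graph inputs; skip those.
    initializer_names = {init.name for init in model.graph.initializer}
    shapes = []
    for value in model.graph.input:
        if value.name in initializer_names:
            continue
        dims: List[Dim] = []
        for d in value.type.tensor_type.shape.dim:
            if d.HasField("dim_value"):
                dims.append(d.dim_value)
            elif d.HasField("dim_param"):
                dims.append(d.dim_param)
            else:
                dims.append(None)
        shapes.append((value.name, dims))
    return shapes
```

Calling `find_dynamic_dims(read_input_shapes("model.onnx"))` (the file name is a placeholder) should return an empty list for a model whose inputs are fully static. This only covers graph inputs, not intermediate tensors.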

We managed to set up a correct environment for the gte-large-en-v1.5-bf16 example, with an ancient version of torch, and that allowed us to reveal severe defects in the initial compilation process. Using the same code from the example, we managed to export our model (based on the qwen2 architecture), but it only reinforced our doubts about the correctness of the table of supported ops and about the presence of severe defects in the Ryzen AI software. I assume this is also the reason why a reference to the ops support page is not easily provided. Substituting our own model for gte in the example results in this trace:

I20260110 18:40:33.984828 13148 vitisai_compile_model.cpp:1263] Vitis AI EP Load ONNX Model Success
I20260110 18:40:33.984828 13148 vitisai_compile_model.cpp:1264] Graph Input Node Name/Shape (2)
I20260110 18:40:33.984828 13148 vitisai_compile_model.cpp:1268]          input_ids : [1x4096]
I20260110 18:40:33.984828 13148 vitisai_compile_model.cpp:1268]          attention_mask : [1x4096]
I20260110 18:40:33.984828 13148 vitisai_compile_model.cpp:1274] Graph Output Node Name/Shape (1)
I20260110 18:40:33.984828 13148 vitisai_compile_model.cpp:1278]          embeddings : [1x1024]
[Vitis AI EP] No. of Operators :   CPU     7  VAIML  2039
[Vitis AI EP] No. of Subgraphs :   NPU     1 Actually running on NPU      1
Traceback (most recent call last):
  File "C:\Users\[REDACTED]\RyzenAI-SW-main\example\gte-large-en-v1.5-bf16\run.py", line 90, in <module>
    main(args)
  File "C:\Users\[REDACTED]\RyzenAI-SW-main\example\gte-large-en-v1.5-bf16\run.py", line 13, in main
    npu_session = ort.InferenceSession(
                  ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\[REDACTED]\.conda\envs\onnxtools\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 485, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "C:\Users\[REDACTED]\.conda\envs\onnxtools\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 584, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for Sub(14) node with name '/base_model/Sub'

(The above is NOT a report for the gte model; we just used the literal example code to test our model there.)
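To make traces like this easier to compare across runs, the two interesting pieces (the operator counts per device and the node named in the NOT_IMPLEMENTED error) can be pulled out with plain regular expressions. This is a sketch that only assumes the log format shown in the trace above; the function names are hypothetical, and I read the number in `Sub(14)` as the operator version, which is my interpretation rather than something the log states.

```python
import re
from typing import Dict, Optional, Union

# Matches e.g. "[Vitis AI EP] No. of Operators :   CPU     7  VAIML  2039"
OPERATORS_RE = re.compile(r"\[Vitis AI EP\] No\. of Operators\s*:\s*(.*)")

# Matches the onnxruntime NOT_IMPLEMENTED message from the traceback.
MISSING_IMPL_RE = re.compile(
    r"Could not find an implementation for (\w+)\((\d+)\) node with name '([^']+)'"
)


def parse_operator_counts(log_text: str) -> Dict[str, int]:
    """Return {device: operator_count} from the "No. of Operators" line."""
    match = OPERATORS_RE.search(log_text)
    if match is None:
        return {}
    tokens = match.group(1).split()
    # Tokens alternate between a device name and its count.
    return {dev: int(count) for dev, count in zip(tokens[0::2], tokens[1::2])}


def parse_missing_implementation(
    log_text: str,
) -> Optional[Dict[str, Union[str, int]]]:
    """Return op type, version and node name from the error, or None."""
    match = MISSING_IMPL_RE.search(log_text)
    if match is None:
        return None
    op_type, version, node_name = match.groups()
    return {"op_type": op_type, "op_version": int(version), "node_name": node_name}
```

Feeding the trace above through both functions gives the device split (7 operators on CPU, 2039 on VAIML) and the failing node (`Sub`, `/base_model/Sub`), which is the pair of facts that the rest of this report is about.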

We inspected the reports generated for both gte and our model and found that the exception above is literally a lie, because the gte model also has a Sub node, and there it is in fact offloaded to the CPU.

gte: report.json

In our report, Sub is assigned to the VAIML device, which causes the failure: report-2.json
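The comparison between the two reports can be automated roughly as follows. This is only a sketch: it assumes each report has already been flattened into a list of records with the keys `name`, `op_type` and `device`. Those key names are hypothetical, and the real report.json layout would need to be adapted to them first.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set, Tuple


def devices_by_op_type(nodes: Iterable[Mapping[str, str]]) -> Dict[str, Set[str]]:
    """Group node records by op type and collect the devices each type landed on."""
    result = defaultdict(set)
    for node in nodes:
        result[node["op_type"]].add(node["device"])
    return dict(result)


def assignment_mismatches(
    reference_nodes: Iterable[Mapping[str, str]],
    candidate_nodes: Iterable[Mapping[str, str]],
) -> Dict[str, Tuple[Set[str], Set[str]]]:
    """Return op types present in both reports whose device sets differ.

    Each value is (devices in the reference report, devices in the candidate).
    """
    ref = devices_by_op_type(reference_nodes)
    cand = devices_by_op_type(candidate_nodes)
    return {
        op: (ref[op], cand[op])
        for op in sorted(ref.keys() & cand.keys())
        if ref[op] != cand[op]
    }


if __name__ == "__main__":
    # Illustrative records only; node names other than /base_model/Sub are made up.
    gte = [
        {"name": "gte/Sub", "op_type": "Sub", "device": "CPU"},
        {"name": "gte/MatMul", "op_type": "MatMul", "device": "VAIML"},
    ]
    ours = [
        {"name": "/base_model/Sub", "op_type": "Sub", "device": "VAIML"},
        {"name": "ours/MatMul", "op_type": "MatMul", "device": "VAIML"},
    ]
    print(assignment_mismatches(gte, ours))
```

With records shaped like the illustrative ones, only `Sub` is reported, with CPU on the gte side and VAIML on ours, which matches what we saw by reading the two reports by hand.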

So what exactly is going on here? Apart from the obviously incorrect ops support table with respect to the NPU, the Vitis compilation process is assigning operators incorrectly.
