Description
Greetings from the WasmEdge runtime maintenance team,
The WASI-NN API currently supports the OpenVINO, PyTorch, and TensorFlow-Lite backends. However, the design of the set_input and get_output APIs is not in sync with standard TensorFlow C API usage.
TensorFlow C API Inference
The TensorFlow backend, like others, employs the TF_SessionRun API for computation. However, in contrast to OpenVINO and PyTorch, this API requires the specification of output tensors during execution.
Whereas in PyTorch the sequence would look like this:

```
load(path)
ctx = init_execution_context()
set_input(ctx, index1, tensor1)
set_input(ctx, index2, tensor2)
compute(ctx)
out_tensor1 = get_output(ctx, index1)
out_tensor2 = get_output(ctx, index2)
out_tensor3 = get_output(ctx, index3)
```

In TensorFlow, it would need to be:
```
load(path)
ctx = init_execution_context()
set_input(ctx, index1, tensor1)
set_input(ctx, index2, tensor2)
set_output(ctx, index1, out_tensor1)
set_output(ctx, index2, out_tensor2)
set_output(ctx, index3, out_tensor3)
compute(ctx)
# Outputs are filled post-computation
```

Of course, the original invocation sequence still works. But at the implementation level, it will cause repeated computation, as TF_SessionRun would have to be invoked during the compute phase.
Additionally, unlike OpenVINO, PyTorch, and TensorFlow-Lite, which support index-based input/output tensor selection, TensorFlow only offers the TF_GraphOperationByName API to obtain input and output operations. Hence, the sequence in TensorFlow would need to include names rather than indexes:
```
load(path)
ctx = init_execution_context()
set_input_by_name(ctx, name1, tensor1)
set_input_by_name(ctx, name2, tensor2)
set_output_by_name(ctx, name1, out_tensor1)
set_output_by_name(ctx, name2, out_tensor2)
set_output_by_name(ctx, name3, out_tensor3)
compute(ctx)
# Outputs are filled post-computation
```

Proposed Specification Changes
To incorporate the TensorFlow backend and balance developer/user experience with performance, we suggest considering the following functions:
set_input_by_name
Parameters:
- ctx: handle
- name: string
- tensor: tensor-data

Expected result:
- expected<(), error>
set_output_by_name
Parameters:
- ctx: handle
- name: string
- tensor: tensor-data buffer for receiving the output

Expected result:
- expected<(), error>
unload
A function required to relinquish loaded resources. We have a FaaS use case that needs to register and de-register loaded models.
Parameters:
graph: handle
Expected result:
- expected<(),
error>
finalize_execution_context
A function is needed to release execution contexts.
Parameters:
ctx: handle
Expected result:
- expected<(),
error>
Final Thoughts
While the existing WASI-NN API supports the backends mentioned above, we encountered challenges when trying to implement the TensorFlow backend, and we therefore advocate for this feature. We would appreciate any suggestions for refining these APIs, as our familiarity with TensorFlow may not be comprehensive. Thank you!