how to serve onnx based models on the web via rest api #23165
aria3ppp started this conversation in Ideas / Feature Requests

Is there any least-effort way to serve the models on the web? I mean, llama.cpp has the llama-server tool that deploys an OpenAI-compatible REST API. Is there any option like that for onnxruntime?
Replies: 1 comment
-
That would be another project that wraps onnxruntime in a web service. I know Triton has a backend for ONNX (https://github.com/triton-inference-server/onnxruntime_backend), but I have never tried it.
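For concreteness, a minimal sketch of the first option: wrapping onnxruntime in a small REST service yourself. The framework choice (FastAPI), the model path `model.onnx`, the endpoint name `/predict`, and the float32 input handling are all assumptions for illustration, not anything the thread specifies; adapt them to your model's real signature.

```python
# Sketch: a minimal REST wrapper around onnxruntime using FastAPI.
# "model.onnx" is a hypothetical path; swap in your own model file.
import numpy as np
import onnxruntime as ort
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
session = ort.InferenceSession("model.onnx")

class PredictRequest(BaseModel):
    data: list  # nested lists, converted to a float32 tensor below

@app.post("/predict")
def predict(req: PredictRequest):
    tensor = np.asarray(req.data, dtype=np.float32)
    # Feed the model's first input; real models may need multiple named inputs.
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: tensor})
    return {"outputs": [o.tolist() for o in outputs]}
```

Assuming the file is saved as `server.py`, run it with `uvicorn server:app --port 8000` and POST JSON to `/predict`. Note this is a plain inference endpoint, not an OpenAI-compatible API like llama-server's; the chat/completions routes would still have to be implemented on top.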
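And a sketch of the Triton route mentioned above, under stated assumptions: Triton's ONNX backend loads models from a versioned model repository, and the `tritonclient` package can then call the server's HTTP endpoint. The model name `my_model`, the tensor names `input`/`output`, the repository path, and the input shape are all hypothetical placeholders.

```python
# Sketch: calling a Triton server that serves an ONNX model over HTTP.
# Triton expects a model repository roughly like:
#   models/my_model/1/model.onnx   (plus an optional config.pbtxt)
# started with e.g.: tritonserver --model-repository=/path/to/models
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy input
inp = httpclient.InferInput("input", list(data.shape), "FP32")  # "input" is hypothetical
inp.set_data_from_numpy(data)

result = client.infer(model_name="my_model", inputs=[inp])
print(result.as_numpy("output"))  # "output" is hypothetical
```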