[Web] Deformable-DETR inference takes 12 seconds on WebGPU, 0.05s in Python (CUDAProvider) #22425

@Dexterp37

Describe the issue

I'm trying to run inference on a Deformable-DETR model trained using HF Transformers and converted to ONNX. Running inference using onnxruntime-gpu in Python works like a charm, with the expected performance (0.05s). Running the same model in onnxruntime-web using the WebGPU provider is only slightly faster than using WASM: about 12 seconds per inference (after warmup).

I tried to follow the docs to collect some additional information, and the log contains quite a few entries like these:

ort-wasm-simd-threaded.jsep.wasm:0x1039a7f 2024-10-13 21:42:56.842600 [V:onnxruntime:Default, js_execution_provider.cc:735 JsExecutionProvider] Graph capture enable: 0
ort-wasm-simd-threaded.jsep.wasm:0x1039a7f 2024-10-13 21:42:57.538899 [I:onnxruntime:Default, fallback_cpu_capability.cc:86 operator()] Candidate for fallback CPU execution: /model/model/input_proj.3/input_proj.3.1/Reshape_1
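One way to act on log output like this is to tally which nodes are flagged as fallback candidates. The sketch below is only an illustration: it assumes the verbose log has been captured as plain text, and the helper names (`fallbackCandidates`, `guessOpType`, `countByOpType`) are hypothetical, not part of onnxruntime-web. Guessing the operator type from the last path segment of the node name is a heuristic based on the naming seen in the lines above.

```typescript
// Pattern for the "Candidate for fallback CPU execution" lines quoted above;
// the capture group is the node name that follows the colon.
const FALLBACK_RE = /Candidate for fallback CPU execution:\s*(\S+)/;

// Extracts the node names flagged as CPU-fallback candidates, in log order.
function fallbackCandidates(logText: string): string[] {
  const nodes: string[] = [];
  for (const line of logText.split(/\r?\n/)) {
    const match = FALLBACK_RE.exec(line);
    if (match) nodes.push(match[1]);
  }
  return nodes;
}

// Heuristic (an assumption, not an ONNX Runtime rule): take the last path
// segment of the node name and strip a trailing "_<number>" suffix.
function guessOpType(nodeName: string): string {
  const segments = nodeName.split("/");
  const last = segments[segments.length - 1];
  return last.replace(/_\d+$/, "");
}

// Counts fallback candidates per guessed operator type.
function countByOpType(nodes: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const node of nodes) {
    const op = guessOpType(node);
    counts[op] = (counts[op] ?? 0) + 1;
  }
  return counts;
}
```

Running `countByOpType(fallbackCandidates(logText))` over the full log would show whether the candidates cluster around a few operator kinds (for example shape-manipulation nodes such as `Reshape`), which may help narrow down where to look first.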

Unfortunately, the "Analyzing the profiling data" section of the docs is still under construction, so I'm not sure how to act on the above information. Any help is appreciated!

To reproduce

The converted model file is available here.

The code to reproduce the problem is available as a gist here: simply load the model, and the page will run inference twice and measure the time each run takes.
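For readers without access to the gist, the measurement can be sketched roughly as follows. This is not the gist's code: `timeRuns` and `mean` are hypothetical helpers, and the inference call is passed in as a callback so that the timing logic stays independent of the runtime. Separating warmup runs from measured runs matters because the first run typically includes one-time setup cost.

```typescript
// Timings for a series of runs, in milliseconds.
interface TimingSummary {
  warmupMs: number[];
  measuredMs: number[];
  meanMs: number;
}

// Averages a non-empty list of durations.
function mean(values: number[]): number {
  if (values.length === 0) throw new Error("no timings to average");
  let sum = 0;
  for (const v of values) sum += v;
  return sum / values.length;
}

// Calls `run` warmupRuns + measuredRuns times, awaiting each call, and
// reports the measured runs separately from the warmup runs.
// `now` is injectable so the clock can be replaced in tests.
async function timeRuns(
  run: () => Promise<unknown>,
  warmupRuns: number,
  measuredRuns: number,
  now: () => number = () => performance.now(),
): Promise<TimingSummary> {
  const warmupMs: number[] = [];
  const measuredMs: number[] = [];
  for (let i = 0; i < warmupRuns + measuredRuns; i++) {
    const start = now();
    await run();
    const elapsed = now() - start;
    (i < warmupRuns ? warmupMs : measuredMs).push(elapsed);
  }
  return { warmupMs, measuredMs, meanMs: mean(measuredMs) };
}
```

With onnxruntime-web, the callback would presumably wrap the session call, for example `timeRuns(() => session.run(feeds), 1, 1)`, where `session` and `feeds` stand for an already created inference session and its input tensors.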

Urgency

The world is not going to end, but my research project is blocked on this 😢

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.20.0-dev.20241012-332173509d

Execution Provider

'webgpu' (WebGPU)

Metadata


Labels

ep:WebGPU (ort-web webgpu provider)
model:transformer (issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.)
platform:web (issues related to ONNX Runtime web; typically submitted using template)
stale (issues that have not been addressed in a while; categorized by a bot)
