[QNN EP] Make offloading graph input/output quantization (to CPU) the default (#23368)
### Description
Enables the QNN EP provider option `offload_graph_io_quantization` by default. It was previously disabled by default.
### Motivation and Context
Enabling this option significantly decreases inference latency for many
models.
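
For users who want the previous behavior, the option can still be set explicitly through provider options. Below is a minimal sketch using the ONNX Runtime Python API; the model path is a placeholder, and `backend_path` assumes the Windows HTP backend.

```python
import onnxruntime as ort

# Placeholder path; substitute your own QDQ-quantized model.
model_path = "model.qdq.onnx"

# With this change, offload_graph_io_quantization is enabled by default.
# Passing "0" explicitly restores the previous behavior (provider option
# values are strings).
session = ort.InferenceSession(
    model_path,
    providers=["QNNExecutionProvider"],
    provider_options=[{
        "backend_path": "QnnHtp.dll",           # assumes Windows HTP backend
        "offload_graph_io_quantization": "0",   # opt out of the new default
    }],
)
```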