Hi,
I am trying to convert a model to INT8 following https://github.com/openvinotoolkit/nncf/blob/develop/examples/post_training_quantization/onnx/mobilenet_v2/main.py
The original model is 140 MB on disk, but the converted INT8 model is still 141 MB. I would expect it to be about 1/4 the size of the FP32 model. Below is the script I use to convert the model.
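For reference, the flow in the linked mobilenet_v2 example reduces to roughly the sketch below. This is not the poster's script; the file names, input shape, and calibration data are placeholder assumptions.

```python
# Minimal sketch of the post-training quantization flow from the linked
# mobilenet_v2 example (not the poster's actual script). The model paths,
# input shape, and random calibration batches are placeholder assumptions.
import numpy as np
import onnx
import nncf

model = onnx.load("mobilenet_v2_fp32.onnx")
input_name = model.graph.input[0].name

# In the real example this is a representative slice of the validation set;
# random data is used here only to keep the sketch self-contained.
calibration_data = [
    np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(300)
]

def transform_fn(batch):
    # Map one calibration item to the dict of model inputs.
    return {input_name: batch}

calibration_dataset = nncf.Dataset(calibration_data, transform_fn)
quantized_model = nncf.quantize(model, calibration_dataset)

onnx.save(quantized_model, "mobilenet_v2_int8.onnx")
```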
Replies: 3 comments 1 reply
- Hi @qiuzhewei. Thanks for your question! Currently NNCF doesn't save weights as int8 after quantization of an ONNX model (unlike Torch or OV). We're working on it, and hopefully it will be ready by the next NNCF release.
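One quick way to see what this means in practice is to inspect the saved model with the onnx package: the quantized graph gains QuantizeLinear/DequantizeLinear nodes, but the weight initializers are still stored as FP32 tensors, which is why the file size barely changes. A minimal check (the model path is a placeholder):

```python
# Count Q/DQ nodes and check the dtypes of the weight initializers in the
# quantized ONNX model. "mobilenet_v2_int8.onnx" is a placeholder path.
from collections import Counter
import onnx

model = onnx.load("mobilenet_v2_int8.onnx")

op_counts = Counter(node.op_type for node in model.graph.node)
print("QuantizeLinear nodes:  ", op_counts.get("QuantizeLinear", 0))
print("DequantizeLinear nodes:", op_counts.get("DequantizeLinear", 0))

dtype_counts = Counter(
    onnx.TensorProto.DataType.Name(init.data_type)
    for init in model.graph.initializer
)
# Expect mostly FLOAT here: the weights themselves are not stored as int8.
print("Initializer dtypes:", dict(dtype_counts))
```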
- @MaximProshin Thanks for the reply! The problem persists when I try to convert the quantized model using Torch. The …
- This PR addresses the issue with the INT8 ONNX model size: #3662