My question is: what is the difference between these two kinds of int8 quantization?

1. Using `quantize_dynamic` (which quantizes to int8, AFAIU):
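   Roughly like this (a minimal sketch; the paths are placeholders, not my exact files):

   ```python
   # Dynamic quantization: int8 weights, activations quantized on the fly at runtime.
   from onnxruntime.quantization import QuantType, quantize_dynamic

   quantize_dynamic(
       model_input="model_fp32.onnx",   # placeholder path
       model_output="model_quant.onnx",
       weight_type=QuantType.QInt8,
   )
   ```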
2. Using the `MatMulNBits` quantizer, which can also be configured to quantize to int8:
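   Roughly like this — a sketch based on the `matmul_nbits_quantizer` tooling in recent onnxruntime releases; the `bits=8` parameter of `DefaultWeightOnlyQuantConfig` is my assumption for the int8 mode, and exact names may differ between versions:

   ```python
   # Weight-only quantization via MatMulNBits. Module/class names assume a
   # recent onnxruntime release; bits=8 is my assumption for the int8 mode.
   import onnx
   from onnxruntime.quantization.matmul_nbits_quantizer import (
       DefaultWeightOnlyQuantConfig,
       MatMulNBitsQuantizer,
   )

   model = onnx.load("model_fp32.onnx")  # placeholder path
   config = DefaultWeightOnlyQuantConfig(block_size=128, is_symmetric=True, bits=8)
   quantizer = MatMulNBitsQuantizer(model, algo_config=config)
   quantizer.process()
   quantizer.model.save_model_to_file("model_int8.onnx", use_external_data_format=True)
   ```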
Also, I see a large difference in model size (starting from the original Qwen2.5-0.5B fp32 model, which is 2400 MB). Both outputs should be int8, but the sizes differ a lot:

- model_quant.onnx => 610 MB
- model_int8.onnx => 1018 MB
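For context, a back-of-envelope size check (a sketch assuming ~0.5e9 parameters all stored at the stated width, ignoring quantization scales/zero-points and any tensors left in fp32):

```python
# Rough expected sizes if every parameter were stored at the given width.
n_params = 0.5e9  # assumption: ~0.5B parameters in Qwen2.5-0.5B

print(f"fp32 estimate: ~{n_params * 4 / 1e6:.0f} MB")  # ~2000 MB vs. the 2400 MB original
print(f"int8 estimate: ~{n_params * 1 / 1e6:.0f} MB")  # ~500 MB vs. 610 MB / 1018 MB above
```

The pure-int8 estimate lands near model_quant.onnx but well below model_int8.onnx, which is what prompts the question.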