Description
Describe the bug
Using TFLiteConverter, I initially generated a TFLite model with its default optimization setting, which quantizes the model. I then used tf2onnx to convert it to ONNX, without setting the --dequantize flag. However, the optimizer dequantized the model without giving any indication that it had done so. I would at least expect a warning that this feature is being applied implicitly, given that it is marked as experimental, or at minimum that this behavior be documented.
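For reference, the ONNX conversion step was essentially the following. This is a minimal sketch using the tf2onnx Python API with placeholder file names; the command-line converter behaves the same way for me:

```python
import tf2onnx

# Convert the quantized TFLite model to ONNX. Nothing related to
# dequantization is requested here, yet the output comes out dequantized.
model_proto, _ = tf2onnx.convert.from_tflite(
    "model_quant.tflite",          # placeholder path to the quantized TFLite model
    opset=13,
    output_path="model_quant.onnx",
)
```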
Urgency
Low Urgency.
System Information
Ubuntu 22.04
TF Version: 2.12
ONNX Version: 1.16
To Reproduce
I observed this with MobileNetV2, ResNet101, and InceptionV3, trying pretrained models from both PyTorch and TensorFlow. Select any of these models from PyTorch or TensorFlow and convert it to TFLite using the DEFAULT optimization strategy (details). Then use tf2onnx to convert the quantized model to ONNX; a short sketch of these steps is given below.
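A minimal sketch of these steps, assuming the pretrained Keras MobileNetV2 and placeholder file names (ResNet101 and InceptionV3 behave the same way):

```python
import tensorflow as tf

# 1. Load a pretrained model.
model = tf.keras.applications.MobileNetV2(weights="imagenet")

# 2. Convert to TFLite with the DEFAULT optimization, which quantizes the weights.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("mobilenetv2_quant.tflite", "wb") as f:
    f.write(tflite_model)

# 3. Convert the quantized .tflite file to ONNX with tf2onnx, as shown above.
```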
If you inspect the result using Netron, you will see that the resulting model has floating-point weights and biases that differ slightly from those of the original PyTorch/TensorFlow model.