
Quantized TFLite models converted to ONNX are implicitly dequantized #2394

@luludak

Description


Describe the bug

Using TFLiteConverter, I initially generated a TFLite model with its default optimization setting, which quantizes the model. I then used tf2onnx to convert it to ONNX, without setting the --dequantize flag. However, the optimizer dequantized the model without giving any indication that it had done so. I would at least expect a warning that this feature is being applied implicitly, given that it is also marked as experimental, or at minimum that this behavior be documented.
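
For reference, a minimal sketch of the conversion step involved (file names are placeholders; I show the tf2onnx Python API here, and note that no dequantize option is passed anywhere):

```python
import tf2onnx

# Convert an already-quantized TFLite model to ONNX.
# No dequantize option is set, yet the resulting ONNX graph
# ends up with floating-point weights.
model_proto, _ = tf2onnx.convert.from_tflite(
    "mobilenet_v2_quant.tflite",  # placeholder: quantized TFLite model
    opset=13,
    output_path="mobilenet_v2_quant.onnx",
)
```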

Urgency
Low Urgency.

System Information
Ubuntu 22.04
TF Version: 2.12
ONNX Version: 1.16

To Reproduce
I observed this with MobileNetV2, ResNet101, and InceptionV3 models, using both PyTorch and TensorFlow pretrained models. Select any of these models from PyTorch or TensorFlow and convert it to TFLite using the DEFAULT optimization strategy (tf.lite.Optimize.DEFAULT). Then use tf2onnx to convert the quantized model to ONNX, as in the sketch above.
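
As a concrete sketch of the first step (MobileNetV2 from Keras shown; any of the listed models works, and the output file name is a placeholder):

```python
import tensorflow as tf

# Load a pretrained model.
model = tf.keras.applications.MobileNetV2(weights="imagenet")

# Convert to TFLite with the DEFAULT optimization, which applies
# post-training (dynamic-range) quantization to the weights.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("mobilenet_v2_quant.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting .tflite file can then be passed to tf2onnx as shown above, or via the CLI, e.g. `python -m tf2onnx.convert --tflite mobilenet_v2_quant.tflite --output mobilenet_v2_quant.onnx`.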

If you inspect the result with Netron, you will see that the resulting model has floating-point weights and biases, slightly different from those of the original PyTorch/TensorFlow model.
