
Quantized TFLite models converted to ONNX are implicitly dequantized #2394

@luludak

Description


Describe the bug

Using TFLiteConverter, I initially generated a TFLite model with its default optimization setting, which quantizes the model. I then used tf2onnx to convert it to ONNX, without setting the --dequantize flag. However, the optimizer dequantized the model without giving any indication that it had done so. I would at least expect a warning that this feature is being applied implicitly, given that it is also marked as experimental, or at minimum that this behavior be documented.
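
For reference, a minimal sketch of the conversion step involved (file names are placeholders; I show the tf2onnx Python API here, and note that no dequantize option is passed anywhere):

```python
import tf2onnx

# Convert an already-quantized TFLite model to ONNX.
# No dequantize option is set, yet the resulting ONNX graph
# ends up with floating-point weights.
model_proto, _ = tf2onnx.convert.from_tflite(
    "mobilenet_v2_quant.tflite",  # placeholder: quantized TFLite model
    opset=13,
    output_path="mobilenet_v2_quant.onnx",
)
```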

Urgency
Low Urgency.

System Information
Ubuntu 22.04
TF Version: 2.12
ONNX Version: 1.16

To Reproduce
I observed this with MobileNetV2, ResNet101, and InceptionV3 models, using both PyTorch and TensorFlow pretrained models. Select any of these models from PyTorch or TensorFlow and convert it to TFLite using the DEFAULT optimization strategy (tf.lite.Optimize.DEFAULT). Then use tf2onnx to convert the quantized model to ONNX, as in the sketch above.
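
As a concrete sketch of the first step (MobileNetV2 from Keras shown; any of the listed models works, and the output file name is a placeholder):

```python
import tensorflow as tf

# Load a pretrained model.
model = tf.keras.applications.MobileNetV2(weights="imagenet")

# Convert to TFLite with the DEFAULT optimization, which applies
# post-training (dynamic-range) quantization to the weights.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("mobilenet_v2_quant.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting .tflite file can then be passed to tf2onnx as shown above, or via the CLI, e.g. `python -m tf2onnx.convert --tflite mobilenet_v2_quant.tflite --output mobilenet_v2_quant.onnx`.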

If you inspect the result with Netron, you will see that the resulting model has floating-point weights and biases, slightly different from those of the original PyTorch/TensorFlow model.
