export failure for tflite with options (--half and --int8) #5446

@raccoondev85

Search before asking

  • I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

Export

Bug

1. Export fails when a GPU device is used with the --half option

python export.py --half --weights ./export_tflite/yolov5n/best.pt --include tflite --device 0

LOG

export: data=data/coco128.yaml, weights=./export_tflite/yolov5n/best.pt, imgsz=[640, 640], batch_size=1, device=0, half=True, inplace=False, train=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=13, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['tflite']
YOLOv5 🚀 v6.0-42-g4c0982a torch 1.9.0+cu102 CUDA:0 (Quadro RTX 4000, 7982.3125MB)

Fusing layers...
Model Summary: 213 layers, 1764577 parameters, 0 gradients, 4.2 GFLOPs

PyTorch: starting from export_tflite/yolov5n/best.pt (14.7 MB)
2021-11-02 13:03:19.661502: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

TensorFlow saved_model: starting export with tensorflow 2.4.1...

             from  n    params  module                                  arguments                     

TensorFlow saved_model: export failure: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

TensorFlow Lite: starting export with tensorflow 2.4.1...

TensorFlow Lite: export failure: 'NoneType' object has no attribute 'call'
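
The second failure in this log looks like a knock-on effect of the first. The sketch below shows that pattern in plain Python: a first stage that logs its own exception and returns None, followed by a second stage that assumes it received a real model. Every name here (FakeKerasModel, export_stage_one, export_stage_two) is hypothetical and is not taken from export.py; it illustrates how the two messages could be related, not a confirmed diagnosis.

```python
# Hypothetical stand-ins only; none of these names exist in export.py.
class FakeKerasModel:
    def call(self, x):
        return x


def export_stage_one(fail):
    """First stage: logs its own failure and returns None instead of raising."""
    model = None
    try:
        if fail:
            # Stand-in for the CUDA-tensor-to-numpy error seen in the log.
            raise TypeError("can't convert cuda:0 device type tensor to numpy.")
        model = FakeKerasModel()
    except Exception as e:
        print(f"stage one: export failure: {e}")
    return model


def export_stage_two(model):
    """Second stage: assumes a real model and touches model.call."""
    try:
        model.call  # raises AttributeError when model is None
        return "ok"
    except Exception as e:
        return f"export failure: {e}"


result_after_failure = export_stage_two(export_stage_one(fail=True))
result_after_success = export_stage_two(export_stage_one(fail=False))
print(result_after_failure)
```

If the real export behaves this way, only the first error needs fixing. In case 2 below, which runs on CPU, the saved_model stage succeeds.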

2. Export fails when the --int8 option is used

python export.py --int8 --weights ./export_tflite/yolov5n/best.pt --include tflite

LOG

export: data=data/coco128.yaml, weights=./export_tflite/yolov5n/best.pt, imgsz=[640, 640], batch_size=1, device=cpu, half=False, inplace=False, train=False, optimize=False, int8=True, dynamic=False, simplify=False, opset=13, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['tflite']
YOLOv5 🚀 v6.0-42-g4c0982a torch 1.9.0+cu102 CPU

Fusing layers...
Model Summary: 213 layers, 1764577 parameters, 0 gradients, 4.2 GFLOPs

PyTorch: starting from export_tflite/yolov5n/best.pt (14.7 MB)
2021-11-02 13:08:43.302145: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

TensorFlow saved_model: starting export with tensorflow 2.4.1...

             from  n    params  module                                  arguments                     

2021-11-02 13:08:44.301976: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-11-02 13:08:44.303109: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-11-02 13:08:44.334274: E tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2021-11-02 13:08:44.334350: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: 19367d16ac94
2021-11-02 13:08:44.334358: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: 19367d16ac94
2021-11-02 13:08:44.335550: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 470.57.2
2021-11-02 13:08:44.335585: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 470.57.2
2021-11-02 13:08:44.335591: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:310] kernel version seems to match DSO: 470.57.2
2021-11-02 13:08:44.336461: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-02 13:08:44.338266: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
0 -1 1 1760 models.common.Conv [3, 16, 6, 2, 2]
1 -1 1 4672 models.common.Conv [16, 32, 3, 2]
2 -1 1 4800 models.common.C3 [32, 32, 1]
3 -1 1 18560 models.common.Conv [32, 64, 3, 2]
4 -1 1 29184 models.common.C3 [64, 64, 2]
5 -1 1 73984 models.common.Conv [64, 128, 3, 2]
6 -1 1 156928 models.common.C3 [128, 128, 3]
7 -1 1 295424 models.common.Conv [128, 256, 3, 2]
8 -1 1 296448 models.common.C3 [256, 256, 1]
9 -1 1 164608 models.common.SPPF [256, 256, 5]
10 -1 1 33024 models.common.Conv [256, 128, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 90880 models.common.C3 [256, 128, 1, False]
14 -1 1 8320 models.common.Conv [128, 64, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 22912 models.common.C3 [128, 64, 1, False]
18 -1 1 36992 models.common.Conv [64, 64, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 74496 models.common.C3 [128, 128, 1, False]
21 -1 1 147712 models.common.Conv [128, 128, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 296448 models.common.C3 [256, 256, 1, False]
24 [17, 20, 23] 1 12177 models.yolo.Detect [4, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256], [640, 640]]
Model: "model"

Layer (type)                Output Shape          Param #  Connected to
input_1 (InputLayer)        [(1, 640, 640, 3)]    0
tf_conv (TFConv)            (1, 320, 320, 16)     1744     input_1[0][0]
tf_conv_1 (TFConv)          (1, 160, 160, 32)     4640     tf_conv[0][0]
tf_c3 (TFC3)                (1, 160, 160, 32)     4704     tf_conv_1[0][0]
tf_conv_7 (TFConv)          (1, 80, 80, 64)       18496    tf_c3[0][0]
tf_c3_1 (TFC3)              (1, 80, 80, 64)       28928    tf_conv_7[0][0]
tf_conv_15 (TFConv)         (1, 40, 40, 128)      73856    tf_c3_1[0][0]
tf_c3_2 (TFC3)              (1, 40, 40, 128)      156288   tf_conv_15[0][0]
tf_conv_25 (TFConv)         (1, 20, 20, 256)      295168   tf_c3_2[0][0]
tf_c3_3 (TFC3)              (1, 20, 20, 256)      295680   tf_conv_25[0][0]
tfsppf (TFSPPF)             (1, 20, 20, 256)      164224   tf_c3_3[0][0]
tf_conv_33 (TFConv)         (1, 20, 20, 128)      32896    tfsppf[0][0]
tf_upsample (TFUpsample)    (1, 40, 40, 128)      0        tf_conv_33[0][0]
tf_concat (TFConcat)        (1, 40, 40, 256)      0        tf_upsample[0][0]
                                                           tf_c3_2[0][0]
tf_c3_4 (TFC3)              (1, 40, 40, 128)      90496    tf_concat[0][0]
tf_conv_39 (TFConv)         (1, 40, 40, 64)       8256     tf_c3_4[0][0]
tf_upsample_1 (TFUpsample)  (1, 80, 80, 64)       0        tf_conv_39[0][0]
tf_concat_1 (TFConcat)      (1, 80, 80, 128)      0        tf_upsample_1[0][0]
                                                           tf_c3_1[0][0]
tf_c3_5 (TFC3)              (1, 80, 80, 64)       22720    tf_concat_1[0][0]
tf_conv_45 (TFConv)         (1, 40, 40, 64)       36928    tf_c3_5[0][0]
tf_concat_2 (TFConcat)      (1, 40, 40, 128)      0        tf_conv_45[0][0]
                                                           tf_conv_39[0][0]
tf_c3_6 (TFC3)              (1, 40, 40, 128)      74112    tf_concat_2[0][0]
tf_conv_51 (TFConv)         (1, 20, 20, 128)      147584   tf_c3_6[0][0]
tf_concat_3 (TFConcat)      (1, 20, 20, 256)      0        tf_conv_51[0][0]
                                                           tf_conv_33[0][0]
tf_c3_7 (TFC3)              (1, 20, 20, 256)      295680   tf_concat_3[0][0]
tf_detect (TFDetect)        ((1, 25200, 9), [(1,  12177    tf_c3_5[0][0]
                                                           tf_c3_6[0][0]
                                                           tf_c3_7[0][0]

Total params: 1,764,577
Trainable params: 0
Non-trainable params: 1,764,577


2021-11-02 13:08:48.863933: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
Found untraced functions such as tf_conv_2_layer_call_and_return_conditional_losses, tf_conv_2_layer_call_fn, tf_conv_3_layer_call_and_return_conditional_losses, tf_conv_3_layer_call_fn, tf_conv_4_layer_call_and_return_conditional_losses while saving (showing 5 of 520). These functions will not be directly callable after loading.
Found untraced functions such as tf_conv_2_layer_call_and_return_conditional_losses, tf_conv_2_layer_call_fn, tf_conv_3_layer_call_and_return_conditional_losses, tf_conv_3_layer_call_fn, tf_conv_4_layer_call_and_return_conditional_losses while saving (showing 5 of 520). These functions will not be directly callable after loading.
Assets written to: export_tflite/yolov5n/best_saved_model/assets
TensorFlow saved_model: export success, saved as export_tflite/yolov5n/best_saved_model (62.2 MB)

TensorFlow Lite: starting export with tensorflow 2.4.1...
Found untraced functions such as tf_conv_2_layer_call_and_return_conditional_losses, tf_conv_2_layer_call_fn, tf_conv_3_layer_call_and_return_conditional_losses, tf_conv_3_layer_call_fn, tf_conv_4_layer_call_and_return_conditional_losses while saving (showing 5 of 520). These functions will not be directly callable after loading.
Found untraced functions such as tf_conv_2_layer_call_and_return_conditional_losses, tf_conv_2_layer_call_fn, tf_conv_3_layer_call_and_return_conditional_losses, tf_conv_3_layer_call_fn, tf_conv_4_layer_call_and_return_conditional_losses while saving (showing 5 of 520). These functions will not be directly callable after loading.
Assets written to: /tmp/tmpibgq2dv_/assets
2021-11-02 13:09:13.579602: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2021-11-02 13:09:13.579786: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2021-11-02 13:09:13.580027: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-11-02 13:09:13.605233: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 3300000000 Hz
2021-11-02 13:09:13.622811: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:928] Optimization results for grappler item: graph_to_optimize
function_optimizer: function_optimizer did nothing. time = 0.006ms.
function_optimizer: function_optimizer did nothing. time = 0ms.

2021-11-02 13:09:14.271508: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:316] Ignored output_format.
2021-11-02 13:09:14.271571: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:319] Ignored drop_control_dependency.
2021-11-02 13:09:14.311455: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set

TensorFlow Lite: export failure: too many values to unpack (expected 4)
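
Python raises "too many values to unpack (expected 4)" when a loop or assignment unpacks into four names but each item holds more than four values. The sketch below reproduces the message in isolation, assuming (not confirmed by this log) that the --int8 path iterates over a calibration dataset whose samples carry one field more than the consuming loop expects. fake_dataset, read_samples and read_samples_tolerant are hypothetical names, not functions from the YOLOv5 code base.

```python
def fake_dataset():
    """Hypothetical loader that yields five fields per sample."""
    yield ("img0.jpg", "img", "im0", None, "image 1/1")


def read_samples(dataset):
    """Consumer written for four fields per sample; breaks on five."""
    paths = []
    for path, img, im0, cap in dataset:  # ValueError when a sample has 5 fields
        paths.append(path)
    return paths


def read_samples_tolerant(dataset):
    """Same loop, but star-unpacking absorbs any extra trailing fields."""
    paths = []
    for path, img, *rest in dataset:
        paths.append(path)
    return paths


try:
    read_samples(fake_dataset())
    error_message = ""
except ValueError as e:
    error_message = str(e)

print(error_message)
print(read_samples_tolerant(fake_dataset()))
```

If that assumption holds, the fix is to make the unpacking side match the number of fields the loader returns.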

Environment

  1. YOLOv5 🚀 v6.0-42-g4c0982a torch 1.9.0+cu102 CUDA:0 (Quadro RTX 4000, 7982.3125MB)
  2. YOLOv5 🚀 v6.0-42-g4c0982a torch 1.9.0+cu102 CPU
  3. Ubuntu 18.04.5 LTS
  4. Python 3.8.8

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
