Description
Search before asking
- I have searched the YOLOv5 issues and found no similar bug report.
YOLOv5 Component
Export
Bug
1. Export fails when a GPU device is used with the --half option
python export.py --half --weights ./export_tflite/yolov5n/best.pt --include tflite --device 0
LOG
export: data=data/coco128.yaml, weights=./export_tflite/yolov5n/best.pt, imgsz=[640, 640], batch_size=1, device=0, half=True, inplace=False, train=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=13, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['tflite']
YOLOv5 🚀 v6.0-42-g4c0982a torch 1.9.0+cu102 CUDA:0 (Quadro RTX 4000, 7982.3125MB)
Fusing layers...
Model Summary: 213 layers, 1764577 parameters, 0 gradients, 4.2 GFLOPs
PyTorch: starting from export_tflite/yolov5n/best.pt (14.7 MB)
2021-11-02 13:03:19.661502: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
TensorFlow saved_model: starting export with tensorflow 2.4.1...
from n params module arguments
TensorFlow saved_model: export failure: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
TensorFlow Lite: starting export with tensorflow 2.4.1...
TensorFlow Lite: export failure: 'NoneType' object has no attribute 'call'
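The second failure in this log looks like a knock-on effect of the first rather than a separate bug. Below is a minimal pure-Python sketch of that pattern. It assumes the saved_model step catches its own exception, logs it, and returns `None`, which is then passed to the TFLite step. `FakeKerasModel`, `export_saved_model_stub` and `export_tflite_stub` are hypothetical stand-ins for illustration, not the real `export.py` functions.

```python
class FakeKerasModel:
    """Hypothetical placeholder for a Keras model exposing a `call` method."""

    def call(self, x):
        return x


def export_saved_model_stub(on_gpu):
    """Hypothetical stand-in: logs the failure and returns None instead of raising."""
    try:
        if on_gpu:
            # Mimics the message reported in the log for a CUDA tensor.
            raise TypeError(
                "can't convert cuda:0 device type tensor to numpy. "
                "Use Tensor.cpu() to copy the tensor to host memory first."
            )
        return FakeKerasModel()
    except Exception as e:
        print(f"TensorFlow saved_model: export failure: {e}")
        return None


def export_tflite_stub(keras_model):
    """Hypothetical stand-in: touches `keras_model.call`, as a converter might."""
    try:
        keras_model.call  # raises AttributeError when keras_model is None
        return "ok"
    except Exception as e:
        msg = f"TensorFlow Lite: export failure: {e}"
        print(msg)
        return msg


gpu_result = export_tflite_stub(export_saved_model_stub(on_gpu=True))
cpu_result = export_tflite_stub(export_saved_model_stub(on_gpu=False))
```

If this is what happens, fixing the saved_model step (for example, by moving tensors to host memory before converting them to NumPy) would likely make the `'NoneType'` message disappear as well.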
2. Export fails when the --int8 option is used
python export.py --int8 --weights ./export_tflite/yolov5n/best.pt --include tflite
LOG
export: data=data/coco128.yaml, weights=./export_tflite/yolov5n/best.pt, imgsz=[640, 640], batch_size=1, device=cpu, half=False, inplace=False, train=False, optimize=False, int8=True, dynamic=False, simplify=False, opset=13, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['tflite']
YOLOv5 🚀 v6.0-42-g4c0982a torch 1.9.0+cu102 CPU
Fusing layers...
Model Summary: 213 layers, 1764577 parameters, 0 gradients, 4.2 GFLOPs
PyTorch: starting from export_tflite/yolov5n/best.pt (14.7 MB)
2021-11-02 13:08:43.302145: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
TensorFlow saved_model: starting export with tensorflow 2.4.1...
from n params module arguments
2021-11-02 13:08:44.301976: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-11-02 13:08:44.303109: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-11-02 13:08:44.334274: E tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2021-11-02 13:08:44.334350: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: 19367d16ac94
2021-11-02 13:08:44.334358: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: 19367d16ac94
2021-11-02 13:08:44.335550: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 470.57.2
2021-11-02 13:08:44.335585: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 470.57.2
2021-11-02 13:08:44.335591: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:310] kernel version seems to match DSO: 470.57.2
2021-11-02 13:08:44.336461: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-02 13:08:44.338266: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
0 -1 1 1760 models.common.Conv [3, 16, 6, 2, 2]
1 -1 1 4672 models.common.Conv [16, 32, 3, 2]
2 -1 1 4800 models.common.C3 [32, 32, 1]
3 -1 1 18560 models.common.Conv [32, 64, 3, 2]
4 -1 1 29184 models.common.C3 [64, 64, 2]
5 -1 1 73984 models.common.Conv [64, 128, 3, 2]
6 -1 1 156928 models.common.C3 [128, 128, 3]
7 -1 1 295424 models.common.Conv [128, 256, 3, 2]
8 -1 1 296448 models.common.C3 [256, 256, 1]
9 -1 1 164608 models.common.SPPF [256, 256, 5]
10 -1 1 33024 models.common.Conv [256, 128, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 90880 models.common.C3 [256, 128, 1, False]
14 -1 1 8320 models.common.Conv [128, 64, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 22912 models.common.C3 [128, 64, 1, False]
18 -1 1 36992 models.common.Conv [64, 64, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 74496 models.common.C3 [128, 128, 1, False]
21 -1 1 147712 models.common.Conv [128, 128, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 296448 models.common.C3 [256, 256, 1, False]
24 [17, 20, 23] 1 12177 models.yolo.Detect [4, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [64, 128, 256], [640, 640]]
Model: "model"
Layer (type) Output Shape Param # Connected to
input_1 (InputLayer) [(1, 640, 640, 3)] 0
tf_conv (TFConv) (1, 320, 320, 16) 1744 input_1[0][0]
tf_conv_1 (TFConv) (1, 160, 160, 32) 4640 tf_conv[0][0]
tf_c3 (TFC3) (1, 160, 160, 32) 4704 tf_conv_1[0][0]
tf_conv_7 (TFConv) (1, 80, 80, 64) 18496 tf_c3[0][0]
tf_c3_1 (TFC3) (1, 80, 80, 64) 28928 tf_conv_7[0][0]
tf_conv_15 (TFConv) (1, 40, 40, 128) 73856 tf_c3_1[0][0]
tf_c3_2 (TFC3) (1, 40, 40, 128) 156288 tf_conv_15[0][0]
tf_conv_25 (TFConv) (1, 20, 20, 256) 295168 tf_c3_2[0][0]
tf_c3_3 (TFC3) (1, 20, 20, 256) 295680 tf_conv_25[0][0]
tfsppf (TFSPPF) (1, 20, 20, 256) 164224 tf_c3_3[0][0]
tf_conv_33 (TFConv) (1, 20, 20, 128) 32896 tfsppf[0][0]
tf_upsample (TFUpsample) (1, 40, 40, 128) 0 tf_conv_33[0][0]
tf_concat (TFConcat) (1, 40, 40, 256) 0 tf_upsample[0][0]
tf_c3_2[0][0]
tf_c3_4 (TFC3) (1, 40, 40, 128) 90496 tf_concat[0][0]
tf_conv_39 (TFConv) (1, 40, 40, 64) 8256 tf_c3_4[0][0]
tf_upsample_1 (TFUpsample) (1, 80, 80, 64) 0 tf_conv_39[0][0]
tf_concat_1 (TFConcat) (1, 80, 80, 128) 0 tf_upsample_1[0][0]
tf_c3_1[0][0]
tf_c3_5 (TFC3) (1, 80, 80, 64) 22720 tf_concat_1[0][0]
tf_conv_45 (TFConv) (1, 40, 40, 64) 36928 tf_c3_5[0][0]
tf_concat_2 (TFConcat) (1, 40, 40, 128) 0 tf_conv_45[0][0]
tf_conv_39[0][0]
tf_c3_6 (TFC3) (1, 40, 40, 128) 74112 tf_concat_2[0][0]
tf_conv_51 (TFConv) (1, 20, 20, 128) 147584 tf_c3_6[0][0]
tf_concat_3 (TFConcat) (1, 20, 20, 256) 0 tf_conv_51[0][0]
tf_conv_33[0][0]
tf_c3_7 (TFC3) (1, 20, 20, 256) 295680 tf_concat_3[0][0]
tf_detect (TFDetect) ((1, 25200, 9), [(1, 12177 tf_c3_5[0][0]
tf_c3_6[0][0]
tf_c3_7[0][0]
Total params: 1,764,577
Trainable params: 0
Non-trainable params: 1,764,577
2021-11-02 13:08:48.863933: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
Found untraced functions such as tf_conv_2_layer_call_and_return_conditional_losses, tf_conv_2_layer_call_fn, tf_conv_3_layer_call_and_return_conditional_losses, tf_conv_3_layer_call_fn, tf_conv_4_layer_call_and_return_conditional_losses while saving (showing 5 of 520). These functions will not be directly callable after loading.
Found untraced functions such as tf_conv_2_layer_call_and_return_conditional_losses, tf_conv_2_layer_call_fn, tf_conv_3_layer_call_and_return_conditional_losses, tf_conv_3_layer_call_fn, tf_conv_4_layer_call_and_return_conditional_losses while saving (showing 5 of 520). These functions will not be directly callable after loading.
Assets written to: export_tflite/yolov5n/best_saved_model/assets
TensorFlow saved_model: export success, saved as export_tflite/yolov5n/best_saved_model (62.2 MB)
TensorFlow Lite: starting export with tensorflow 2.4.1...
Found untraced functions such as tf_conv_2_layer_call_and_return_conditional_losses, tf_conv_2_layer_call_fn, tf_conv_3_layer_call_and_return_conditional_losses, tf_conv_3_layer_call_fn, tf_conv_4_layer_call_and_return_conditional_losses while saving (showing 5 of 520). These functions will not be directly callable after loading.
Found untraced functions such as tf_conv_2_layer_call_and_return_conditional_losses, tf_conv_2_layer_call_fn, tf_conv_3_layer_call_and_return_conditional_losses, tf_conv_3_layer_call_fn, tf_conv_4_layer_call_and_return_conditional_losses while saving (showing 5 of 520). These functions will not be directly callable after loading.
Assets written to: /tmp/tmpibgq2dv_/assets
2021-11-02 13:09:13.579602: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2021-11-02 13:09:13.579786: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2021-11-02 13:09:13.580027: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-11-02 13:09:13.605233: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 3300000000 Hz
2021-11-02 13:09:13.622811: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:928] Optimization results for grappler item: graph_to_optimize
function_optimizer: function_optimizer did nothing. time = 0.006ms.
function_optimizer: function_optimizer did nothing. time = 0ms.
2021-11-02 13:09:14.271508: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:316] Ignored output_format.
2021-11-02 13:09:14.271571: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:319] Ignored drop_control_dependency.
2021-11-02 13:09:14.311455: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
TensorFlow Lite: export failure: too many values to unpack (expected 4)
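The final message in this log is an ordinary Python unpacking error. One way it can arise is a calibration loop that unpacks four names from a loader that yields five items per sample. This is an assumption, not a confirmed diagnosis. The sketch below reproduces only the message. `fake_loader` and `representative_dataset_gen_stub` are hypothetical names, and the tuple layout is invented for illustration.

```python
def fake_loader(n=2):
    """Hypothetical loader yielding five items per sample."""
    for i in range(n):
        yield f"img{i}.jpg", [0.0], [0], None, f"image {i + 1}/{n}"


def representative_dataset_gen_stub(loader):
    """Hypothetical calibration generator that expects four items per sample."""
    # Unpacking a five-item tuple into four names raises ValueError.
    for path, img, im0s, vid_cap in loader:
        yield [img]


try:
    next(representative_dataset_gen_stub(fake_loader()))
    error = None
except ValueError as e:
    error = str(e)
    print(f"TensorFlow Lite: export failure: {error}")
```

If the real loader's return signature differs from what the int8 calibration code unpacks, aligning the two (for example, by adding the extra name to the `for` target) would be the natural fix.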
Environment
- YOLOv5 🚀 v6.0-42-g4c0982a torch 1.9.0+cu102 CUDA:0 (Quadro RTX 4000, 7982.3125MB)
- YOLOv5 🚀 v6.0-42-g4c0982a torch 1.9.0+cu102 CPU
- Ubuntu 18.04.5 LTS
- python 3.8.8
Minimal Reproducible Example
No response
Additional
No response
Are you willing to submit a PR?
- Yes I'd like to help by submitting a PR!