Description
Search before asking
- I have searched the YOLOv5 issues and found no similar bug report.
YOLOv5 Component
No response
Bug
Right now, when exporting a model to TensorRT using export.py with the --half argument, the input binding is set to datatype HALF while the output is set to datatype FLOAT:
TensorRT: Network Description:
TensorRT: input "images" with shape (1, 3, 640, 640) and dtype DataType.HALF
TensorRT: output "output" with shape (1, 25200, 6) and dtype DataType.FLOAT
However, this means that whenever the resulting engine is used, its inputs also need to be converted to FP16, as can be seen in detect.py:
Line 118 in cea994b
im = im.half() if half else im.float()  # uint8 to fp16/32
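To make the caller-side burden concrete, here is a minimal numpy sketch of the cast an FP16 input binding forces on every consumer of the engine. The helper name to_engine_dtype is hypothetical; it only mirrors the "uint8 to fp16/32" line above plus the /255 normalization detect.py applies, not the full preprocessing:

```python
import numpy as np

def to_engine_dtype(im_uint8, half):
    # Mirrors "uint8 to fp16/32": the caller must match the engine's input binding.
    im = im_uint8.astype(np.float16 if half else np.float32)
    return im / 255.0  # scale to [0, 1], as in detect.py

frame = np.random.randint(0, 256, (1, 3, 640, 640), dtype=np.uint8)
assert to_engine_dtype(frame, half=True).dtype == np.float16
assert to_engine_dtype(frame, half=False).dtype == np.float32
```

With an FP32 input binding, the half=True branch (and the extra host-side pass over the image it implies) would disappear entirely.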
Now I was wondering: is this intentional? TensorRT can handle an FP32 input without problems even if the rest of the engine runs in half precision. Moreover, having TensorRT handle the conversion means less overhead for the developer, and it is most likely faster as well, since the cast can then be done on the GPU instead of the CPU. The datatype of the input binding can easily be changed with a single line of code when creating a new engine:
network.get_input(0).dtype = trt.DataType.FLOAT
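The claim that an FP32 input binding works fine with an otherwise-FP16 engine can be illustrated outside TensorRT. The toy fp16_engine below is a hypothetical numpy stand-in, not real engine code: it casts its input to half precision internally (as TensorRT would on the GPU) and returns FP32, matching the output binding shown in the log above. The caller keeps plain FP32 data and gets bit-identical results to pre-casting on the host:

```python
import numpy as np

def fp16_engine(x32, w16):
    """Toy stand-in for an FP16 engine with an FP32 input binding:
    the cast to half precision happens inside, not in the caller."""
    y16 = w16 @ x32.astype(np.float16)  # engine-side cast + FP16 compute
    return y16.astype(np.float32)       # FP32 output binding, as in the log above

rng = np.random.default_rng(0)
w16 = rng.random((6, 3)).astype(np.float16)  # stand-in for FP16 weights
x32 = rng.random(3, dtype=np.float32)        # caller passes plain FP32 data

out = fp16_engine(x32, w16)
# Identical to the host-side pre-cast path, but with no work for the caller:
assert np.array_equal(out, (w16 @ x32.astype(np.float16)).astype(np.float32))
assert out.dtype == np.float32
```

In a real engine the internal cast runs on the GPU as part of inference, which is exactly why moving it out of the caller's preprocessing should be a net win.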
Are you willing to submit a PR?
- Yes I'd like to help by submitting a PR!