🚀 Feature
Models are currently exported in FP32. Adding support for FP16 (half) export will allow for greater flexibility in model performance.
Motivation
To improve the performance of torchscript models.
Pitch
Ideally, this should be a very simple feature if I am not missing anything. I have integrated it locally for exporting and I can make a pull request. Essentially, I added a "half" argument as follows:

```python
parser.add_argument('--half', action='store_true', help='export with half precision')
```

When the model is loaded, we do as follows:
```python
if opt.half:
    model = model.half()
```

Then, when we create the empty image to run through the model, we do as follows:
```python
if opt.half:
    img = img.half()
```

I have only tested this for TorchScript and confirmed it is working. It improves inference throughput by around 33% on an RTX 3090 with a batch size of 16 and an image size of 320.
Please let me know what you think. Thank you for taking the time to read my issue :)