
Exception after training on Docker image e88040337fa3 #8764

@sstainba

Description


Search before asking

  • I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

Training

Bug

After training two different custom models, both runs end with several exceptions thrown:

wandb: Synced 5 W&B file(s), 111 media file(s), 1 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20220728_115228-25fqw5ps/logs
Exception ignored in: <function StorageWeakRef.__del__ at 0x7fc14e083040>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 38, in __del__
  File "/opt/conda/lib/python3.8/site-packages/torch/storage.py", line 636, in _free_weak_ref
AttributeError: 'NoneType' object has no attribute '_free_weak_ref'

This exception is repeated about 20 times.
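
For context, the "Exception ignored in" prefix is how Python reports an exception raised inside a `__del__` method: the error is handed to `sys.unraisablehook` (which prints it to stderr by default) rather than propagated to the caller. The sketch below reproduces only that reporting behavior. It does not use torch and is not a diagnosis of this issue. The `Resource` class, its `helper` attribute, and `collect_unraisable` are hypothetical names made up for illustration.

```python
import sys


class Resource:
    """Hypothetical object whose finalizer depends on a helper that is gone."""

    helper = None  # simulates a dependency that has already been cleared

    def __del__(self):
        # helper is None, so this raises AttributeError inside the finalizer.
        self.helper.release()


def collect_unraisable():
    """Create and drop a Resource, returning what Python reported as unraisable."""
    captured = []

    def hook(info):
        # Store only strings, to avoid keeping the exception or object alive.
        captured.append((info.exc_type.__name__, str(info.exc_value)))

    old_hook = sys.unraisablehook  # available since Python 3.8
    sys.unraisablehook = hook
    try:
        r = Resource()
        del r  # on CPython, the refcount hits zero and __del__ runs here
    finally:
        sys.unraisablehook = old_hook
    return captured


if __name__ == "__main__":
    # The AttributeError is reported through the hook; it does not propagate.
    print(collect_unraisable())
```

With the default hook left in place, the same failure would be printed as an "Exception ignored in: ..." message followed by the traceback, and the program would carry on running.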

Environment

Docker image e88040337fa3
Windows 11 Host using WSL2
Nvidia 3060 12GB GPU

Minimal Reproducible Example

Doesn't appear to be specific to my work. This happened with two different data sets.

Dataset 1: 300 images, 1280 x 1024, 1 tag/class
Command: train.py --rect --imgsz 1280 --img 1024 --epochs 500 --name test1 --cache --batch 4 --data /usr/src/datasets/data.yaml --weights yolov5m6.pt

Dataset 2: 200 images, 416 x 416, 1 tag/class
Command: train.py --img 416 --epochs 500 --name splash --cache --batch 8 --data /usr/src/datasets/splash/data.yaml --weights yolov5s.pt

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!


    Labels

    bug (Something isn't working)
