
🌟💡 YOLOv5 Study: batch size #2377

@glenn-jocher


Study 🤔

I did a quick study to examine the effect of varying batch size on YOLOv5 training. The study trained YOLOv5s on COCO for 300 epochs with --batch-size at 8 different values: [16, 20, 32, 40, 64, 80, 96, 128].
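The exact commands used for these runs aren't listed here, but a sweep like this could be scripted roughly as follows; the flags shown are the standard train.py arguments for a from-scratch COCO run, and the run names are purely illustrative.

```python
# Hedged sketch of how the batch-size sweep might be scripted; the exact commands
# used for the study are not given in this issue. Run names are illustrative.
import subprocess

for bs in [16, 20, 32, 40, 64, 80, 96, 128]:
    subprocess.run([
        "python", "train.py",
        "--data", "coco.yaml",
        "--cfg", "yolov5s.yaml",
        "--weights", "",                # train from scratch
        "--epochs", "300",
        "--batch-size", str(bs),
        "--name", f"yolov5s_bs{bs}",    # illustrative run name for later comparison
    ], check=True)
```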

We've tried to make the training code batch-size agnostic, so that users get similar results at any batch size. This means users on an 11 GB 2080 Ti should be able to produce the same results as users on a 24 GB 3090 or a 40 GB A100, with smaller GPUs simply using smaller batch sizes.

We do this by scaling the loss with batch size and by scaling weight decay with the effective (accumulated) batch size. At batch sizes below the nominal 64 we accumulate gradients over multiple batches before each optimizer step, and at batch sizes of 64 and above we step the optimizer after every batch.
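Below is a simplified sketch of that mechanism, with a toy model and dataset standing in for YOLOv5 and COCO; the nominal batch size of 64, the loss scaling, and the weight-decay scaling follow the logic described above, while the specific hyperparameter values are illustrative rather than the exact train.py code.

```python
# Minimal sketch of batch-size-agnostic training: gradient accumulation keeps the
# effective batch size near a nominal value (64), and weight decay is scaled to match.
# Toy model/data stand in for YOLOv5/COCO; hyperparameter values are illustrative.
import torch
from torch.utils.data import DataLoader, TensorDataset

nbs = 64                                               # nominal batch size
batch_size = 16                                        # user-selected --batch-size
accumulate = max(round(nbs / batch_size), 1)           # batches accumulated per optimizer step
weight_decay = 0.0005 * batch_size * accumulate / nbs  # scale weight decay to effective batch size

model = torch.nn.Linear(10, 1)
loader = DataLoader(TensorDataset(torch.randn(256, 10), torch.randn(256, 1)),
                    batch_size=batch_size)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.937, weight_decay=weight_decay)
criterion = torch.nn.MSELoss()

optimizer.zero_grad()
for ni, (x, y) in enumerate(loader):
    loss = criterion(model(x), y) * batch_size  # loss scaled by batch size
    loss.backward()                             # gradients accumulate across batches
    if (ni + 1) % accumulate == 0:              # step once per `accumulate` batches
        optimizer.step()
        optimizer.zero_grad()
```

With batch_size=16 this gives accumulate=4, so the optimizer effectively sees batches of 64 and weight decay stays at its base value; with batch_size=128, accumulate=1 and weight decay doubles.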

Results 😃

Initial results vary significantly with batch size, but final results are nearly identical (good!).
[Screenshot: full training curves for all batch sizes]

Closeup of mAP@0.5:0.95:
[Screenshot: closeup of mAP@0.5:0.95 curves]

One oddity that stood out is val objectness loss, which did vary with batch size. I'm not sure why, as val box and val cls losses did not vary much, and neither did the 3 train losses. I don't know what this means or whether there's any room for concern (or improvement).
[Screenshot: val objectness loss across batch sizes]
