Hi @stark-t , I ran two identical nano models on the Clara cluster and the results differ slightly.
Below are the confusion matrices on the validation dataset; you can also find the results.csv file for each run at the bottom of this comment.
I personally do not like seeing these differences between two identical nano runs (but I can learn to accept it :D ). I am not sure how to set a seed for yolov5 so that two runs of the same model are identical, or whether that is even possible with the current configuration. Unfortunately, none of the parameters implemented with argparse currently accept a seed. There is a discussion at ultralytics/yolov5#1222 pointing to PyTorch's reproducibility notes: https://pytorch.org/docs/stable/notes/randomness.html
The main takeaways are:
Completely reproducible results are not guaranteed across PyTorch releases, individual commits, or different platforms. Furthermore, results may not be reproducible between CPU and GPU executions, even when using identical seeds.
The only method I'm aware of that might guarantee identical results is to train on CPU with --workers 0, but this is naturally impractical, so you simply need to adapt your workflow to accommodate minor variations in final model results.
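For reference, here is a minimal sketch of the seeding steps the PyTorch reproducibility notes describe. The function name `seed_everything` is my own, and even with all of these set, the notes warn that results can still differ across PyTorch versions, platforms, and CPU vs. GPU runs, so this limits but does not eliminate run-to-run variation.

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed: int = 0) -> None:
    """Seed the RNGs a training run touches (Python, NumPy, PyTorch)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds CPU and all CUDA devices
    os.environ["PYTHONHASHSEED"] = str(seed)
    # Trade speed for determinism in cuDNN convolution kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Re-seeding before each run makes the random *streams* match:
seed_everything(0)
a = torch.rand(3)
seed_everything(0)
b = torch.rand(3)
assert torch.equal(a, b)
```

For this to help with yolov5 specifically, the call would have to happen inside train.py before the dataloaders and model are built, which is exactly the argparse hook that is missing at the moment.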
Confusion matrix, nano model n1
Confusion matrix, nano model n2
Confusion matrix, small model s
Results csv files
nano model n1: results.csv
nano model n2: results.csv
small model s: results.csv