Conversation

@glenn-jocher (Member) commented on Aug 1, 2022

Observed erratic training behavior (green line) with the nan_to_num hook (introduced in #8598) in the classifier branch. I'm going to remove it from master.

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Improves training stability by removing the NaN-to-zero gradient hook from YOLOv5's training script (train.py).

📊 Key Changes

  • The hook that converts NaN (not a number) gradient values to zero during training has been commented out (a sketch of the affected code follows this list).

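For context, the change comments out a per-parameter gradient hook in train.py. The sketch below illustrates the affected pattern with a toy model; `model` and `freeze` here are stand-ins rather than the actual YOLOv5 objects, and the loop mirrors the typical layer-freeze code instead of reproducing the diff verbatim.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.Conv2d(8, 8, 3))  # stand-in for the YOLOv5 model
freeze = ['0.']  # stand-in list of parameter-name prefixes to freeze

for k, v in model.named_parameters():
    v.requires_grad = True  # train all layers by default
    # v.register_hook(lambda x: torch.nan_to_num(x))  # NaN to 0 -- the gradient hook this PR comments out
    if any(x in k for x in freeze):
        print(f'freezing {k}')
        v.requires_grad = False
```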
🎯 Purpose & Impact

  • Purpose: To address erratic training results that may have been caused by automatically converting NaN values to zero.
  • Impact: Users should see more stable and reliable training, though they may need to monitor training for NaN values, which are no longer automatically converted to zeros (a monitoring sketch follows this list).

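Because NaN or Inf gradients are no longer silently zeroed, one simple way to watch for them is to inspect parameter gradients after the backward pass. This is an illustrative sketch only; `check_finite_grads` is a hypothetical helper and is not part of this PR or of YOLOv5.

```python
import torch
import torch.nn as nn

def check_finite_grads(model: nn.Module) -> list:
    """Return the names of parameters whose gradients contain NaN or Inf."""
    return [name for name, p in model.named_parameters()
            if p.grad is not None and not torch.isfinite(p.grad).all()]

# Toy example that deliberately produces NaN gradients so the check fires
model = nn.Linear(4, 1)
loss = model(torch.randn(8, 4)).mean() * float('nan')
loss.backward()
print(check_finite_grads(model))  # e.g. ['weight', 'bias']
```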
glenn-jocher linked an issue on Aug 1, 2022 that may be closed by this pull request
glenn-jocher merged commit f3c78a3 into master on Aug 1, 2022
glenn-jocher deleted the glenn-jocher-patch-1 branch on August 1, 2022 at 19:39
ctjanuhowski pushed a commit to ctjanuhowski/yolov5 that referenced this pull request Sep 8, 2022
* Remove hook `torch.nan_to_num(x)`

Observed erratic training behavior (green line) with the nan_to_num hook in classifier branch. I'm going to remove it from master.

* Update train.py

Development

Successfully merging this pull request may close these issues.

NaNs and INFs in gradient values
