Description
Currently, "early stopping" monitors the validation loss and stops training after a number of unsuccessful rounds. This is often used together with GridSearchCV to select the best model. Sometimes the best-performing model overfits considerably, and depending on the situation one might prefer a model with slightly worse validation performance but less overfitting.
To actively control for overfitting, I would love to see a modification of early stopping. It would stop the booster if, after a couple of rounds, the validation score is more than "overfit_tolerance" worse than the training score.
It could be used e.g. like this:

```python
callbacks=[lgb.early_stopping(20, overfit_tolerance=1.1)]
```
This would stop the boosting process if, after 20 rounds, either the validation performance had stopped improving or the ratio of validation to training performance exceeded 1.1.
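
Something close to this can already be approximated with a custom callback today. Below is a minimal sketch, assuming a lower-is-better metric (e.g. a loss) and that the training set is included in `valid_sets` so its score is evaluated each round; the name `overfit_stopping` and its parameters are hypothetical, not part of the LightGBM API:

```python
from lightgbm.callback import EarlyStopException


def overfit_stopping(stopping_rounds, overfit_tolerance,
                     train_name="training", valid_name="valid_0"):
    """Hypothetical callback: stop if the valid/train metric ratio
    stays above `overfit_tolerance` for `stopping_rounds` rounds.

    Assumes a lower-is-better metric and that the training set is
    part of `valid_sets` so its score appears in the eval results.
    """
    bad_rounds = [0]  # mutable counter captured by the closure

    def _callback(env):
        # env.evaluation_result_list holds tuples of
        # (dataset_name, eval_name, result, is_higher_better)
        scores = {name: result
                  for name, _, result, _ in env.evaluation_result_list}
        ratio = scores[valid_name] / scores[train_name]
        bad_rounds[0] = bad_rounds[0] + 1 if ratio > overfit_tolerance else 0
        if bad_rounds[0] >= stopping_rounds:
            raise EarlyStopException(env.iteration,
                                     env.evaluation_result_list)

    _callback.order = 30  # run alongside the built-in early stopping
    return _callback
```

This could then be combined with the built-in callback, e.g. `callbacks=[lgb.early_stopping(20), overfit_stopping(20, 1.1)]`, so that training halts on whichever condition triggers first.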