Early stopping: overfit prevention #4996

@mayer79

Description

Currently, "early stopping" monitors the validation loss and stops after a number of unsuccessful rounds. It is often used together with scikit-learn's GridSearchCV to select the best model. Sometimes the best-performing model overfits considerably, and depending on the situation one might prefer a model with slightly worse validation performance but less overfit.

To actively control overfitting, I would love to see a modification of early stopping: it would stop the booster if, after a given number of rounds, the validation score is more than some `overfit_tolerance` worse than the training score.

It could be used like this:

```python
callbacks=[lgb.early_stopping(20, overfit_tolerance=1.1)]
```

This would stop the boosting process if, after 20 rounds, either the performance stopped improving or the ratio of validation to training performance exceeded 1.1.
