[Trainer] Move logic for checkpoint loading into separate methods for easy overriding #17043

calpt · 2022-05-02T13:08:05Z

What does this PR do?

This PR makes a small refactoring in the Trainer class: it moves the logic for the following two steps out of the training loop into separate helper methods:

- Loading a pre-existing checkpoint into the Trainer before training starts is moved into the _load_from_checkpoint() method.
- Loading the best evaluated model checkpoint after training has completed is moved into the _load_best_model() method.

The PR does not change any existing logic in any way.
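The shape of the refactoring can be sketched with a toy class. This is only an illustration, not the actual Trainer code: `SketchTrainer`, the `events` list, the checkpoint names, and the method signatures are assumptions made for the example. Only the two method names, `_load_from_checkpoint()` and `_load_best_model()`, come from this PR.

```python
class SketchTrainer:
    """Toy stand-in for a trainer; NOT the real transformers.Trainer."""

    def __init__(self, load_best_model_at_end=True):
        # Attribute names here are illustrative assumptions.
        self.load_best_model_at_end = load_best_model_at_end
        self.best_model_checkpoint = None
        self.events = []  # records the order in which steps run

    def _load_from_checkpoint(self, resume_from_checkpoint):
        # Helper 1: restore state from a pre-existing checkpoint.
        self.events.append(f"resume:{resume_from_checkpoint}")

    def _load_best_model(self):
        # Helper 2: reload the best evaluated checkpoint after training.
        self.events.append(f"best:{self.best_model_checkpoint}")

    def train(self, resume_from_checkpoint=None):
        # The training loop only calls the helpers; it no longer
        # contains the loading logic itself.
        if resume_from_checkpoint is not None:
            self._load_from_checkpoint(resume_from_checkpoint)

        self.events.append("train_loop")
        self.best_model_checkpoint = "checkpoint-best"  # placeholder value

        if self.load_best_model_at_end and self.best_model_checkpoint is not None:
            self._load_best_model()
        return self.events


events = SketchTrainer().train(resume_from_checkpoint="checkpoint-500")
print(events)
```

Because the helpers are called from the same points where the inlined logic used to run, the order of operations stays the same.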

Motivation

In our library, we implement a custom Trainer class that subclasses your great built-in Trainer class. However, as we don't save full model checkpoints during training, the mentioned steps for checkpoint loading are not applicable to our use case. Moving this logic to separate methods would be super helpful to us (and potentially others), since we could easily override these helper methods without modifying the training loop itself.
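A rough sketch of the override pattern this enables is shown below. `BaseTrainerStub` is a hypothetical stand-in for the built-in Trainer, reduced to the two helper methods, and `PartialCheckpointTrainer` is an invented name for a subclass that does not save full model checkpoints. The signatures are assumptions and may differ from the real ones.

```python
class BaseTrainerStub:
    """Hypothetical stand-in for the built-in Trainer (helpers only)."""

    def __init__(self):
        self.loaded = []  # records what kind of loading happened

    def _load_from_checkpoint(self, resume_from_checkpoint):
        # Default behaviour: load a full model checkpoint.
        self.loaded.append(("full", resume_from_checkpoint))

    def _load_best_model(self):
        # Default behaviour: load the full best checkpoint.
        self.loaded.append(("full", "best"))

    def train(self, resume_from_checkpoint=None):
        if resume_from_checkpoint is not None:
            self._load_from_checkpoint(resume_from_checkpoint)
        # ... training steps would run here ...
        self._load_best_model()
        return self.loaded


class PartialCheckpointTrainer(BaseTrainerStub):
    """Overrides only the helpers; train() is inherited unchanged."""

    def _load_from_checkpoint(self, resume_from_checkpoint):
        # Assumed custom logic: restore only the weights this subclass saves.
        self.loaded.append(("partial", resume_from_checkpoint))

    def _load_best_model(self):
        self.loaded.append(("partial", "best"))


result = PartialCheckpointTrainer().train(resume_from_checkpoint="checkpoint-500")
print(result)
```

The subclass never touches `train()`, which is the point of the refactoring: the custom loading behaviour lives entirely in the two overridden helpers.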

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@sgugger

(cc @hSterz)

HuggingFaceDocBuilderDev · 2022-05-02T13:23:10Z

The documentation is not available anymore as the PR was closed or merged.

sgugger

Thanks for the PR! This will also make the code of the train method easier to read.

Note that when you subclass in your Trainer, you might have to keep up to date with some of the changes we make in those methods (I actually have a bug to fix for resuming from sharded checkpoints that will make some changes in your two new private methods).

[Trainer] Move logic for checkpoint loading into separate methods for easy overriding

2954e6f

calpt marked this pull request as ready for review May 2, 2022 13:18

sgugger approved these changes May 2, 2022

sgugger merged commit daecae1 into huggingface:main May 2, 2022

stevhliu pushed a commit to stevhliu/transformers that referenced this pull request May 3, 2022

[Trainer] Move logic for checkpoint loading into separate methods for easy overriding (huggingface#17043)

c190e2b

elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022

[Trainer] Move logic for checkpoint loading into separate methods for easy overriding (huggingface#17043)

de6a890
