Skip to content

Conversation

jon-tow
Copy link
Collaborator

@jon-tow jon-tow commented Dec 20, 2022

This PR refactors the RL model wrappers into "trainer" wrappers. The term "model" has semantic overloading throughout the codebase. A specific point of confusion is the {Type}RLModels, which do not only contain models but also wrap around optimizers, schedulers, and other auxiliary data structures required for RL training.

This refactor was briefly discussed in the CarperAI Discord with @cat-state. I am leaving it here as a reminder and for others to chime in.

@LouisCastricato
Copy link
Contributor

I am strongly in favor of this refactor

@jon-tow jon-tow requested a review from Dahoas December 20, 2022 21:19
Copy link
Collaborator

@Dahoas Dahoas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Will merge if there are no further changes

config=config,
)
model.save_pretrained("dataset/trained_model")
trainer.save_pretrained("dataset/trained_model")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the model types we use support save_pretrained?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Collaborator

@Dahoas Dahoas Dec 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait I don't think they do, or at least ppo doesn't. The base ppo model is just an nn.Module (not pretrained). It seems actually very annoying to save new model architectures in a huggingface format. We'll probably have to write a new config.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhhh ok wait that's weird then can we just add a save pretrained function to PPO haha

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But anyway saving doesn't have anything to do with this pr so I think it's fine for now.

@LouisCastricato LouisCastricato merged commit 1d0b904 into CarperAI:main Dec 21, 2022
@jon-tow jon-tow deleted the rename-rl-model-trainer branch December 21, 2022 18:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants