Refactor RL model wrapper into a trainer module #144
Conversation
I am strongly in favor of this refactor
Looks good! Will merge if there are no further changes
```diff
     config=config,
 )
-model.save_pretrained("dataset/trained_model")
+trainer.save_pretrained("dataset/trained_model")
```
Do the model types we use support `save_pretrained`?
Yes
Wait, I don't think they do, or at least PPO doesn't. The base PPO model is just an `nn.Module` (not a `PreTrainedModel`). It seems actually very annoying to save new model architectures in the Hugging Face format. We'll probably have to write a new config.
Ohhhh ok, wait, that's weird then. Can we just add a `save_pretrained` function to PPO haha
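For concreteness, a minimal sketch of what that could look like, assuming a hypothetical `PPOModel` wrapper that is a plain `nn.Module`; the class name and config handling here are illustrative, not trlx's actual API:

```python
import json
import os

import torch
from torch import nn


class PPOModel(nn.Module):
    """Hypothetical PPO wrapper around a base network (illustrative only)."""

    def __init__(self, base_model: nn.Module, config: dict):
        super().__init__()
        self.base_model = base_model
        self.config = config  # assumed to be a JSON-serializable dict

    def save_pretrained(self, save_directory: str) -> None:
        # Mimic the Hugging Face layout (weights + JSON config),
        # since a plain nn.Module has no save_pretrained of its own.
        os.makedirs(save_directory, exist_ok=True)
        torch.save(
            self.state_dict(),
            os.path.join(save_directory, "pytorch_model.bin"),
        )
        with open(os.path.join(save_directory, "config.json"), "w") as f:
            json.dump(self.config, f)
```

With something like this in place, the `trainer.save_pretrained("dataset/trained_model")` call in the diff above could work even though the base model never inherits from `PreTrainedModel`.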
But anyway, saving doesn't have anything to do with this PR, so I think it's fine for now.
This PR refactors the RL model wrappers into "trainer" wrappers. The term "model" is semantically overloaded throughout the codebase. A specific point of confusion is the `{Type}RLModel` classes, which contain not only models but also wrap optimizers, schedulers, and other auxiliary data structures required for RL training.

This refactor was briefly discussed in the CarperAI Discord with @cat-state. I am leaving it here as a reminder and for others to chime in.
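As a rough sketch of the shape this refactor suggests (the class name, attributes, and config keys below are hypothetical, not the PR's actual API), the trainer owns the optimizer, scheduler, and other training-time state, so "model" can again mean just the network:

```python
import torch
from torch import nn


class BaseRLTrainer:
    """Hypothetical trainer wrapper: bundles a model with the auxiliary
    objects RL training needs, instead of stuffing them into the model."""

    def __init__(self, model: nn.Module, config: dict):
        self.model = model
        self.config = config
        self.optimizer = torch.optim.AdamW(model.parameters(), lr=config["lr"])
        self.scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
            self.optimizer, T_max=config["total_steps"]
        )

    def step(self, loss: torch.Tensor) -> None:
        # One optimization step; the RL-specific loss (e.g. PPO's clipped
        # objective) would be computed by an algorithm-specific subclass.
        loss.backward()
        self.optimizer.step()
        self.scheduler.step()
        self.optimizer.zero_grad()
```

Each `{Type}RLModel` would then presumably become a `{Type}RLTrainer` subclass implementing its algorithm-specific loss, leaving the wrapped model objects free of training machinery.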