Create a unified API between PPO and ILQL #7

ILQL was implemented more or less independently of our PPO implementation. As a result, ILQL has features that our PPO implementation still needs. These include:

- Removing the dependency on GPT2 for PPO. This is currently being done [here](https://github.com/samp830/trlx), but we can create a more standardized approach.
- Adding the ability to pass custom reward functions to the orchestrator. This is an absolute must-have.
- Standardizing naming conventions across functions.
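
To make the reward-function and naming points concrete, here is a rough sketch of what a shared orchestrator interface could look like. Everything in it is an assumption for discussion: `RewardFn`, `BaseOrchestrator`, `PPOOrchestrator`, `ILQLOrchestrator`, and `make_experience` are hypothetical names, not the current code, and the model is left out entirely so the sketch does not depend on GPT2 or any other architecture.

```python
from typing import Callable, Dict, List, Union

# Hypothetical type alias: a reward function maps a batch of generated
# texts to one scalar reward per text.
RewardFn = Callable[[List[str]], List[float]]


class BaseOrchestrator:
    """Hypothetical shared base class; not the existing interface."""

    method = "base"

    def __init__(self, reward_fn: RewardFn):
        # The reward function is injected by the caller instead of being
        # hard-coded inside the orchestrator.
        self.reward_fn = reward_fn

    def make_experience(self, samples: List[str]) -> List[Dict[str, Union[str, float]]]:
        """Score samples with the user-supplied reward function."""
        rewards = self.reward_fn(samples)
        if len(rewards) != len(samples):
            raise ValueError("reward_fn must return one reward per sample")
        return [
            {"method": self.method, "sample": s, "reward": float(r)}
            for s, r in zip(samples, rewards)
        ]


class PPOOrchestrator(BaseOrchestrator):
    # Same constructor and method names as the ILQL variant.
    method = "ppo"


class ILQLOrchestrator(BaseOrchestrator):
    method = "ilql"


def length_reward(samples: List[str]) -> List[float]:
    # Toy reward for illustration only: longer samples score higher.
    return [float(len(s)) for s in samples]


if __name__ == "__main__":
    for orch_cls in (PPOOrchestrator, ILQLOrchestrator):
        orch = orch_cls(reward_fn=length_reward)
        print(orch.make_experience(["hi", "hello"]))
```

The point of the sketch is only that both orchestrators accept the same `reward_fn` argument and expose the same method name, so swapping PPO for ILQL would not require changing the calling code.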
