About using vLLM integration as general generation tool for custom training loops #3623
daniel-dona started this conversation in Ideas
Replies: 0 comments
A lot of work has been put into integrating GRPO with vLLM to speed up online inference during training. It would be great to generalize that integration so it can also be used in custom training loops.
I'm thinking, for example, of sampling completions during training, or of running custom evaluations that require completions.
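To make the idea a bit more concrete, here is a rough sketch of what a generalized interface might look like from the side of a custom loop. Everything in it is hypothetical: `CompletionBackend`, `EchoBackend`, `CompletionSampler`, and `train` are names made up for illustration, not existing TRL or vLLM APIs. The echo backend only stands in for a vLLM-backed one so the snippet runs without a GPU.

```python
from dataclasses import dataclass, field
from typing import Protocol


class CompletionBackend(Protocol):
    """Hypothetical interface: anything that maps prompts to completions."""

    def generate(self, prompts: list[str], n: int = 1) -> list[list[str]]:
        ...


class EchoBackend:
    """Stand-in backend so the sketch runs without vLLM installed.

    Assumption: a real backend would wrap a vLLM engine here and would also
    need to receive the updated policy weights before generating.
    """

    def generate(self, prompts: list[str], n: int = 1) -> list[list[str]]:
        # One inner list of n completions per prompt.
        return [[f"{p} [sample {i}]" for i in range(n)] for p in prompts]


@dataclass
class CompletionSampler:
    """Samples completions from the backend every `every_n_steps` steps."""

    backend: CompletionBackend
    prompts: list[str]
    every_n_steps: int = 2
    n: int = 2
    history: dict[int, list[list[str]]] = field(default_factory=dict)

    def on_step_end(self, step: int) -> None:
        if step % self.every_n_steps == 0:
            self.history[step] = self.backend.generate(self.prompts, n=self.n)


def train(num_steps: int, sampler: CompletionSampler) -> None:
    """Placeholder custom loop; the actual optimization step is omitted."""
    for step in range(1, num_steps + 1):
        # ... forward pass, loss, and optimizer step would go here ...
        sampler.on_step_end(step)


sampler = CompletionSampler(EchoBackend(), prompts=["Hello"])
train(num_steps=4, sampler=sampler)
```

The point of the split is that the loop only depends on `generate`, so the same hook could serve periodic completion sampling or a custom evaluation, with the vLLM-specific parts kept behind the backend.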