🎲 [GRPO] Make training dataset shuffle optional #3334

LeonEricsson · 2025-04-21T17:01:57Z

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Did you discuss this or get it approved via a GitHub issue? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

qgallouedec

Just some minor things, and I think we're mostly good!

qgallouedec

lgtm! thanks!

HuggingFaceDocBuilderDev · 2025-04-21T20:26:22Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

LeonEricsson added 2 commits April 21, 2025 18:26

option to disable dataset shuffle

a35ac01

wip test cases

ae77f30

qgallouedec reviewed Apr 21, 2025

tests/test_grpo_trainer.py (outdated, resolved)

qgallouedec reviewed Apr 21, 2025

tests/test_grpo_trainer.py (outdated, resolved)

qgallouedec reviewed Apr 21, 2025

tests/test_grpo_trainer.py (outdated, resolved)

qgallouedec reviewed Apr 21, 2025

trl/trainer/grpo_trainer.py (resolved)

LeonEricsson and others added 2 commits April 21, 2025 20:34

wip test cases

24b2b6d

simplified sampler test cases

0edab92

LeonEricsson commented Apr 21, 2025

tests/test_grpo_trainer.py (outdated, resolved)

LeonEricsson marked this pull request as ready for review April 21, 2025 19:07

deprecation warning for RepeatRandomSampler

39a34f5

qgallouedec reviewed Apr 21, 2025

tests/test_grpo_trainer.py (resolved)

qgallouedec reviewed Apr 21, 2025

trl/trainer/grpo_trainer.py (outdated, resolved)

revert test cases, keeping one no shuffle test

01a2f5e

qgallouedec approved these changes Apr 21, 2025

style

f43d926

qgallouedec linked an issue Apr 21, 2025 that may be closed by this pull request

How can I set the dataset to not shuffle? It seems there is no such option. #3333

Closed

qgallouedec changed the title from "[GRPO] Make training dataset shuffle optional" to "🎲 [GRPO] Make training dataset shuffle optional" Apr 21, 2025

qgallouedec merged commit 0dad4eb into huggingface:main Apr 21, 2025
9 checks passed
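
For readers arriving from the linked issue, the behaviour this PR makes optional can be illustrated with a minimal sketch. It is not the code from trl/trainer/grpo_trainer.py: the helper name `repeat_indices` and its parameters (`num_samples`, `mini_repeat_count`, `shuffle`, `seed`) are hypothetical, chosen only to show a sampler that repeats each dataset index and shuffles only on request.

```python
import random
from typing import Iterator, Optional


def repeat_indices(
    num_samples: int,
    mini_repeat_count: int,
    shuffle: bool = True,
    seed: Optional[int] = None,
) -> Iterator[int]:
    """Yield each dataset index `mini_repeat_count` times in a row.

    Hypothetical helper, not TRL's implementation. With shuffle=False the
    indices come out in dataset order; with shuffle=True they are permuted
    first by a seeded generator so runs stay reproducible.
    """
    indices = list(range(num_samples))
    if shuffle:
        # A dedicated Random instance avoids touching global RNG state.
        random.Random(seed).shuffle(indices)
    for index in indices:
        for _ in range(mini_repeat_count):
            yield index


# Without shuffling, the dataset order is preserved (e.g. for curriculum-style data).
ordered = list(repeat_indices(num_samples=3, mini_repeat_count=2, shuffle=False))
print(ordered)  # [0, 0, 1, 1, 2, 2]
```

Keeping shuffling behind a flag that defaults to on would leave existing training runs unchanged while letting ordered datasets opt out, which is the kind of use case the linked issue describes.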

sidmadala mentioned this pull request Apr 21, 2025

AxolotlGRPOTrainer still shuffles combined datasets even with curriculum_sampling flag enabled axolotl-ai-cloud/axolotl#2376

Open

8 tasks

hjh0119 mentioned this pull request Apr 23, 2025

updates GRPOTrainer compatible with trl 0.17 modelscope/ms-swift#3969

Merged

4 tasks
