
Commit 9603b41

😷 Refactor GRPO/RLOO to isolate _generate (huggingface#4114)
1 parent 5ee56ed commit 9603b41

4 files changed: +193 -462 lines changed


docs/source/experimental.md

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ class GroupFilter:
         return group_scores
 
 training_args = GFPOConfig(
-    output_dir="Qwen3-0.6B-GFPO"
+    output_dir="Qwen3-0.6B-GFPO",
     per_device_train_batch_size=4,
     num_remains_in_group=2,
     bf16=True,
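
Note on this hunk: the only change is the trailing comma after `output_dir`. Without it, two keyword arguments sit side by side inside the call and the snippet cannot be parsed. A minimal sketch of that check follows, assuming a shortened stand-in for the docs snippet; `GFPOConfig` is only parsed here, never imported or called, and the `parses` helper is hypothetical.

```python
import ast

# Shortened stand-in for the docs snippet (assumption: only two arguments kept).
# The code is parsed, not executed, so GFPOConfig does not need to exist.
before = '''training_args = GFPOConfig(
    output_dir="Qwen3-0.6B-GFPO"
    per_device_train_batch_size=4,
)'''

# Apply the same one-character fix as the diff: add the missing comma.
after = before.replace('"Qwen3-0.6B-GFPO"\n', '"Qwen3-0.6B-GFPO",\n')


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


print("before parses:", parses(before))
print("after parses:", parses(after))
```

Parsing rather than executing keeps the check independent of whether TRL is installed.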
