1 parent e603155 commit c1ca9c1
verl/trainer/config/algorithm.py
@@ -48,7 +48,7 @@ class FilterGroupsConfig(BaseConfig):
     Args:
         enable (bool): Whether to enable filter groups.
             NOTE: This feature is not compatible with asynchronous reward computation (`launch_reward_fn_async=True`).
-        metric (Optional[str]): Metric to use for filtering: "acc", "score", "seq_reward", "seq_final_reward", etc.
+        metric (Optional[str]): Metric to use for filtering: "acc", "score", "seq_reward", etc.
         max_num_gen_batches (int): Maximum number of backfill attempts when collecting diverse responses.
             Non-positive values mean no upper limit (use with caution).
         filter_function (Optional[str]): Path to filter function (e.g., "my_module.my_filter_func").
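
For readers unfamiliar with the options documented above, the following is a minimal, hypothetical sketch of what group filtering with a backfill cap might look like. It does not use verl and is not taken from this commit: the function names `keep_diverse_groups` and `should_stop_backfill`, their signatures, and the "non-zero spread" criterion for a diverse group are all assumptions made for illustration. Only the rule that a non-positive `max_num_gen_batches` means no upper limit comes from the docstring.

```python
from collections import defaultdict
from statistics import pstdev
from typing import Dict, List, Sequence


def keep_diverse_groups(
    group_ids: Sequence[str], metric_values: Sequence[float]
) -> List[int]:
    """Return indices of samples whose group has non-constant metric values.

    Hypothetical helper: treating "diverse" as "population std > 0" is an
    assumption for this sketch, not the library's confirmed criterion.
    """
    # Collect the sample indices that belong to each group.
    by_group: Dict[str, List[int]] = defaultdict(list)
    for idx, gid in enumerate(group_ids):
        by_group[gid].append(idx)

    kept: List[int] = []
    for indices in by_group.values():
        values = [metric_values[i] for i in indices]
        # Keep a group only if its metric values are not all identical.
        if len(values) > 1 and pstdev(values) > 0:
            kept.extend(indices)
    return sorted(kept)


def should_stop_backfill(num_gen_batches: int, max_num_gen_batches: int) -> bool:
    """Return True once the backfill cap is reached.

    A non-positive cap means no upper limit, as the docstring states.
    """
    return max_num_gen_batches > 0 and num_gen_batches >= max_num_gen_batches
```

In this sketch, the per-sample values passed as `metric_values` would be whichever metric the `metric` option selects (for example "acc" or "score"); how those values are actually looked up in the real trainer is not shown in this diff.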