
⏪ Parameterize enable_prefix_caching #2900


Merged · 4 commits into huggingface:main · Feb 24, 2025

Conversation

@ji-huazhong (Contributor) commented on Feb 19, 2025

What does this PR do?

Fixes #2798: Error when using use_vllm=True with GRPOTrainer on V100 GPUs.

Tested (for functional verification only) using the open-r1 configuration with minor modifications:

```yaml
# Model arguments
model_name_or_path: Qwen/Qwen2.5-1.5B-Instruct
model_revision: main
torch_dtype: float16
attn_implementation: eager

# Data training arguments
dataset_name: open-r1/OpenR1-Math-220k
dataset_configs:
- default
system_prompt: "You are a helpful AI Assistant that provides well-reasoned and detailed responses. You first think about the reasoning process as an internal monologue and then provide the user with the answer. Respond in the following format: <think>\n...\n</think>\n<answer>\n...\n</answer>"

# GRPO trainer config
bf16: true
use_vllm: true
vllm_device: auto
vllm_dtype: half
vllm_gpu_memory_utilization: 0.7
vllm_enable_prefix_caching: true
do_eval: false
gradient_accumulation_steps: 4
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
# hub_model_id: Qwen2.5-1.5B-Open-R1-GRPO
# hub_strategy: every_save
learning_rate: 2.0e-05
log_completions: true
log_level: info
logging_first_step: false
logging_steps: 100
logging_strategy: steps
lr_scheduler_type: cosine
max_prompt_length: 512
max_completion_length: 1024
max_steps: -1
num_generations: 7
num_train_epochs: 1
output_dir: data/Qwen2.5-1.5B-Open-R1-GRPO
overwrite_output_dir: true
per_device_eval_batch_size: 1
per_device_train_batch_size: 1
push_to_hub: false
report_to:
- none
reward_funcs:
- accuracy
- format
reward_weights:
- 1.0
- 1.0
save_strategy: "epoch"
save_total_limit: 1
seed: 42
warmup_ratio: 0.1
```
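The vllm_* keys above are forwarded to the vLLM engine that the trainer spins up for generation. As a minimal sketch (assuming vLLM's public LLM constructor, not the trainer's internals; model and values taken from the config above), the equivalent standalone engine setup would be:

```python
# Minimal sketch: a standalone vLLM engine configured like the vllm_* keys
# in the config above (assumes vLLM's public LLM constructor).
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    dtype="half",                 # vllm_dtype: half
    gpu_memory_utilization=0.7,   # vllm_gpu_memory_utilization: 0.7
    enable_prefix_caching=True,   # vllm_enable_prefix_caching: true (this PR)
)
```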

With vllm_enable_prefix_caching: true (the default):

[Screenshot: training logs]

With vllm_enable_prefix_caching: false:

[Screenshot: training logs]
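For V100 users hitting #2798, the flag can likewise be set programmatically. A minimal sketch, assuming trl's GRPOConfig API at the time of this PR (field names mirror the YAML keys above):

```python
# Minimal sketch: disabling prefix caching through trl's GRPOConfig
# (assumes GRPOConfig fields match the YAML keys used above).
from trl import GRPOConfig

training_args = GRPOConfig(
    output_dir="data/Qwen2.5-1.5B-Open-R1-GRPO",
    use_vllm=True,
    vllm_gpu_memory_utilization=0.7,
    vllm_enable_prefix_caching=False,  # flag added by this PR; turn off on V100
)
```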

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

cc @qgallouedec

@qgallouedec (Member) left a comment:

Thanks for the contribution. Can you please just update the doc? Then we're good to merge.

@ji-huazhong (Contributor, Author) replied:
@qgallouedec Made the suggested change. :)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec (Member) left a comment:

Thanks!

@qgallouedec changed the title from "parameterize enable_prefix_caching" to "⏪ Parameterize enable_prefix_caching" on Feb 24, 2025
@qgallouedec merged commit 69ad852 into huggingface:main on Feb 24, 2025
kashif pushed a commit that referenced this pull request Feb 27, 2025
* parameterize enable_prefix_caching

* apply review suggestion

---------

Co-authored-by: Quentin Gallouédec <[email protected]>
jhinpan pushed a commit to jhinpan/trl-jin that referenced this pull request on Mar 12, 2025 (same commit message as above).
yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request on Apr 20, 2025 (same commit message as above).