
⏪ Parameterize enable_prefix_caching #2900


Merged · 4 commits into huggingface:main · Feb 24, 2025

Conversation

@ji-huazhong (Contributor) commented on Feb 19, 2025

What does this PR do?

Fixes #2798: Error when using use_vllm=True with GRPOTrainer on V100 GPUs.

Tested (for functional verification only) using the open-r1 configuration with minor modifications:

```yaml
# Model arguments
model_name_or_path: Qwen/Qwen2.5-1.5B-Instruct
model_revision: main
torch_dtype: float16
attn_implementation: eager

# Data training arguments
dataset_name: open-r1/OpenR1-Math-220k
dataset_configs:
- default
system_prompt: "You are a helpful AI Assistant that provides well-reasoned and detailed responses. You first think about the reasoning process as an internal monologue and then provide the user with the answer. Respond in the following format: <think>\n...\n</think>\n<answer>\n...\n</answer>"

# GRPO trainer config
bf16: true
use_vllm: true
vllm_device: auto
vllm_dtype: half
vllm_gpu_memory_utilization: 0.7
vllm_enable_prefix_caching: true
do_eval: false
gradient_accumulation_steps: 4
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
# hub_model_id: Qwen2.5-1.5B-Open-R1-GRPO
# hub_strategy: every_save
learning_rate: 2.0e-05
log_completions: true
log_level: info
logging_first_step: false
logging_steps: 100
logging_strategy: steps
lr_scheduler_type: cosine
max_prompt_length: 512
max_completion_length: 1024
max_steps: -1
num_generations: 7
num_train_epochs: 1
output_dir: data/Qwen2.5-1.5B-Open-R1-GRPO
overwrite_output_dir: true
per_device_eval_batch_size: 1
per_device_train_batch_size: 1
push_to_hub: false
report_to:
- none
reward_funcs:
- accuracy
- format
reward_weights:
- 1.0
- 1.0
save_strategy: "epoch"
save_total_limit: 1
seed: 42
warmup_ratio: 0.1
```
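The vllm_* keys above are forwarded to the vLLM engine that the trainer spins up for generation. As a minimal sketch (assuming vLLM's public LLM constructor, not the trainer's internals; model and values taken from the config above), the equivalent standalone engine setup would be:

```python
# Minimal sketch: a standalone vLLM engine configured like the vllm_* keys
# in the config above (assumes vLLM's public LLM constructor).
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    dtype="half",                 # vllm_dtype: half
    gpu_memory_utilization=0.7,   # vllm_gpu_memory_utilization: 0.7
    enable_prefix_caching=True,   # vllm_enable_prefix_caching: true (this PR)
)
```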

With vllm_enable_prefix_caching: true (the default):

[Screenshot: training logs]

With vllm_enable_prefix_caching: false:

[Screenshot: training logs]
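For V100 users hitting #2798, the flag can likewise be set programmatically. A minimal sketch, assuming trl's GRPOConfig API at the time of this PR (field names mirror the YAML keys above):

```python
# Minimal sketch: disabling prefix caching through trl's GRPOConfig
# (assumes GRPOConfig fields match the YAML keys used above).
from trl import GRPOConfig

training_args = GRPOConfig(
    output_dir="data/Qwen2.5-1.5B-Open-R1-GRPO",
    use_vllm=True,
    vllm_gpu_memory_utilization=0.7,
    vllm_enable_prefix_caching=False,  # flag added by this PR; turn off on V100
)
```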

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

cc @qgallouedec

@qgallouedec (Member) left a comment:

Thanks for the contribution. Can you please just update the doc? Then we're good to merge.

@ji-huazhong (Contributor, Author) replied:
@qgallouedec Made the suggested change. :)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec (Member) left a comment:

Thanks!

@qgallouedec changed the title from "parameterize enable_prefix_caching" to "⏪ Parameterize enable_prefix_caching" on Feb 24, 2025
@qgallouedec merged commit 69ad852 into huggingface:main on Feb 24, 2025
kashif pushed a commit that referenced this pull request Feb 27, 2025
* parameterize enable_prefix_caching

* apply review suggestion

---------

Co-authored-by: Quentin Gallouédec <[email protected]>
jhinpan pushed a commit to jhinpan/trl-jin that referenced this pull request on Mar 12, 2025 (same commit message as above).
yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request on Apr 20, 2025 (same commit message as above).