[sglang] feat: Add SGLang async multi-turn rollout with tool support #1037
Merged
eric-haibin-lin merged 120 commits into volcengine:main from SwordFaith:feat/add_async_sglang_multi_turn_support on Apr 29, 2025
Commits
d7a70a5 Add new designed data model for tool and async rollout (SwordFaith)
dc578c1 add AsyncRolloutRequest and message-ids convert tools (zyzshishui)
0edc910 Add unittest to async sgl engine and related ci (SwordFaith)
59df7ab Fix test case and add verl engine with async (SwordFaith)
63caefe Fix AsyncVerlEngine naming (SwordFaith)
2eea830 Dump test async sgl rollout checkpoint (SwordFaith)
4199cf8 Fix async rollout with memory saver issue (SwordFaith)
ed46763 Tmp dump for pass messages by raw_prompt key in non_tensor_batch (SwordFaith)
bc4fe76 Add example gsm8k tool and fix schema issue (SwordFaith)
d2522f0 Tmp dump for add prompt to async rollout request and async rollout st… (SwordFaith)
e0b3bbb Wait to add more test for async gen with tools and verify training se… (SwordFaith)
4532d33 Add support for tool_calls formating and parse error (SwordFaith)
c415ad4 support loss_mask and loading tool from config (zyzshishui)
6a91862 fix review (zyzshishui)
c3831f2 fix review (zyzshishui)
7c063a9 Fix get_tool_call_parser_type bot/eot check logic (SwordFaith)
3cbf99b Convert tool method and fn call to all async (SwordFaith)
c730d48 merge and add rst file for multiturn (zyzshishui)
983c28e fixed tool initialization (zyzshishui)
2cb290c Add unittest with sharding manager and rollout, and fix async generat… (SwordFaith)
8d18e43 wait to support batch rollout (zyzshishui)
a88ddda add max_turn (zyzshishui)
bdfd998 add e2e test for tool calling (zyzshishui)
26ade88 fix batch rollout (zyzshishui)
c437093 Merge branch 'feat/add_async_sglang_multi_turn_support' into feat/los… (SwordFaith)
8e7b00b Merge pull request #2 from zyzshishui/feat/loss_mask_and_tool_config (SwordFaith)
fd3b44c Dump first runable version (SwordFaith)
1ff236c Fix config missing in dump (SwordFaith)
77f0429 Try use mb cluster training (SwordFaith)
c0d710d Fix script pip install bug (SwordFaith)
0e627d7 Disable debug data batch keys log (SwordFaith)
1a803ea Refined to correctly run grpo in a stable way (SwordFaith)
b215f02 Fix ckpt path and wandb dir (SwordFaith)
9c11a1e Add update torch-memory-saver and wandb (SwordFaith)
6062a8c Add new scripts for verify (SwordFaith)
bc6043b Fix config bugs (SwordFaith)
9abf201 Try fix import sgl when using vllm only error (SwordFaith)
28d6268 Try increase n=16 (SwordFaith)
a556981 Add n=64 setting to increasing rate of convergence (SwordFaith)
6a479cc Add temperature 1.0 support (SwordFaith)
a29fac3 Update config (SwordFaith)
df51df2 Dump debug code (SwordFaith)
9b50d00 Add verification for single turn generation (SwordFaith)
f20aed7 Fix bug (SwordFaith)
67b15e6 Add 3k max resp script (SwordFaith)
15ff9c5 Use patch with aligned cli args (SwordFaith)
5a372b0 Fix arg (SwordFaith)
439d816 Add new version data script (SwordFaith)
8ce6689 Try if it's padding issue (SwordFaith)
f60ac69 Add pad test script (SwordFaith)
679c324 Fix dtype in pad_sequence_to_length issue (SwordFaith)
07e25e6 Add new training script (SwordFaith)
ddb2337 Add pad print (SwordFaith)
25e7814 Add debug log (SwordFaith)
6f6f94e Try refine debug logs (SwordFaith)
185cf74 Add n16 train script (SwordFaith)
c097cb1 Add loss mask test script (SwordFaith)
e839465 Add debug info to pad and unpad (SwordFaith)
560e2ca Add schema update and format, need_tool_kwargs fix (SwordFaith)
b939760 try to change to relative path (WANG-GH)
0719d1f fix abs path (WANG-GH)
9f17ace Remove verl engine with async and add modelbest inc as co-org (SwordFaith)
6ecb8d4 fix conlficts in RLHF dataset (zhaochenyang20)
80e3c2c Merge branch 'main' of github.com:SwordFaith/verl (zhaochenyang20)
0729b07 Merge branch 'main' into feat/renewed_multi_turn_pr_branch (zhaochenyang20)
caf1a08 resolve not our code (WANG-GH)
a7ffe12 fix format v2 (WANG-GH)
20f5696 Fix example run issue (SwordFaith)
6cfce61 Add environ SGL_DISABLE_TP_MEMORY_INBALANCE_CHECK=True (SwordFaith)
6e7017d fix lint v1 (WANG-GH)
4ea91e3 fix lint v2 (WANG-GH)
80be127 Fix dapo ci (SwordFaith)
67f218f Fix sanity check ci (SwordFaith)
092b7b9 Add back reinforce plus plus base line (SwordFaith)
2066bae Fix file naming and dir structure issue (SwordFaith)
f2e8a6b Fix license str (SwordFaith)
9efb2be Fix req_list append missing (SwordFaith)
a195328 Sync generate sequences with sglang_rollout (SwordFaith)
897edf4 stop before enter fsdp_async_manager (WANG-GH)
10dff84 Add config validation assertion (SwordFaith)
e19fcaa Refactor async sglang sharding manager (SwordFaith)
377bf06 repeat tool_kwargs & upgrade ppo_megatron_trainer (ocss884)
b6fa35b Fix e2e test too long (SwordFaith)
9805be8 refact test v1 (WANG-GH)
3aebb8c unfinish (WANG-GH)
712c2ef refact test v3 (WANG-GH)
a7236b8 refact test v4 (WANG-GH)
5569eaf Fix import path error and update ci config (SwordFaith)
48ad489 Merge branch 'main' into feat/add_async_sglang_multi_turn_support (tongyx361)
9e66aec fix: pre-commit (tongyx361)
b078914 fix: license (tongyx361)
508d382 fix: tools_kwargs (tongyx361)
eb4f1cb fix: .mypy_cache (tongyx361)
5886337 Merge pull request #10 from tongyx361/tyx/fix/ci (tongyx361)
f79d706 update async test (ocss884)
a8eb256 pre-commit (ocss884)
7020582 fix (ocss884)
3468b4c Fix lint and add mutiturn and sglang_async to e2e ppo trainer (SwordFaith)
70b7169 Fix broadcast_pyobj args (SwordFaith)
3501c46 fix sgl.yml (ocss884)
828c05b feat: sgl CI manual trigger (tongyx361)
a8d6142 Fix nproc issue in sgl.yml (SwordFaith)
6b8552f fix sgl.yml (ocss884)
029d397 more (ocss884)
db781c0 fix wandb (eric-haibin-lin)
2153056 Fix cpu backend missing bug (SwordFaith)
83a735d Add sharding manager to unit test (SwordFaith)
270b7b0 Seperate e2e ppo trainer sglang task (SwordFaith)
040237e Avoid validation and lower test time (SwordFaith)
3e21b08 feat: secrets.HF_ENDPOINT (tongyx361)
1105c3e fix: secrets.HF_ENDPOINT (tongyx361)
766a082 feat: https://hf-mirror.com (tongyx361)
9912290 temp: only test ppo_trainer for sglang (tongyx361)
6ee96e1 Revert "temp: only test ppo_trainer for sglang" (tongyx361)
ef7d527 temp: SGLang label (tongyx361)
52a7ee6 Revert "temp: SGLang label" (tongyx361)
1d4d648 temp: SGLang label (tongyx361)
05413e3 feat: no_proxy for hf-mirror.com (tongyx361)
6f0be26 fix: ppo_micro_batch_size_per_gpu=16 (tongyx361)
095ab9b revert: SGLang label (tongyx361)
    @@ -0,0 +1,65 @@
    name: sgl

    on:
      workflow_dispatch: # Manual
      # Trigger the workflow on push or pull request,
      # but only for the main branch
      push:
        branches:
          - main
          - v0.2.x
        paths:
          - "**/*.py"
          - .github/workflows/vllm.yml
      pull_request:
        branches:
          - main
          - v0.2.x
        paths:
          - "**/*.py"
          - "verl/trainer/config/*.yaml"
          - .github/workflows/sgl.yml

    # Cancel jobs on the same ref if a new one is triggered
    concurrency:
      group: ${{ github.workflow }}-${{ github.ref }}
      cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}

    # Declare permissions just read content.
    permissions:
      contents: read

    jobs:
      sgl:
        runs-on: [self-hosted, l20-0]
        timeout-minutes: 20 # Increase this timeout value as needed
        env:
          HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
          HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
          NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
          HF_ENDPOINT: "https://hf-mirror.com"
          HF_HUB_ENABLE_HF_TRANSFER: 1
          SGL_DISABLE_TP_MEMORY_INBALANCE_CHECK: "True"
        container:
          image: ocss884/verl-sglang:ngc-th2.6.0-cu126-sglang0.4.5.post3
          options: --gpus all --shm-size=10g
        steps:
          - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
            with:
              fetch-depth: 0
          - name: Install the current repository
            run: |
              pip3 install hf_transfer
              pip3 install -e .[test,gpu,sglang] --no-deps
          - name: Test the latest SGLang
            run: |
              cd tests/rollout
              torchrun --nnodes=1 --nproc_per_node=4 $(which pytest) -s test_sglang_spmd.py
          - name: Test the latest SGLang async
            run: |
              cd tests/rollout
              torchrun --nnodes=1 --nproc_per_node=2 $(which pytest) -s test_sglang_async_spmd.py
          - name: Test the latest SGLang Rollout async with tool
            run: |
              cd tests/rollout
              torchrun --nnodes=1 --nproc_per_node=2 $(which pytest) -s test_sglang_async_rollout_w_tools.py
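The `concurrency` block in the workflow above groups runs by workflow name and ref, and cancels an in-progress run only when the ref is not `refs/heads/main`. A minimal Python sketch of how those two expressions evaluate; the function name and the sample refs are illustrative assumptions, not part of the PR or of GitHub Actions:

```python
def concurrency_settings(workflow: str, ref: str) -> tuple[str, bool]:
    """Mirror the two expressions in the workflow's `concurrency` block.

    Note: GitHub compares strings case-insensitively in expressions;
    this sketch uses a plain case-sensitive comparison.
    """
    # ${{ github.workflow }}-${{ github.ref }}
    group = f"{workflow}-{ref}"
    # ${{ github.ref != 'refs/heads/main' }}
    cancel_in_progress = ref != "refs/heads/main"
    return group, cancel_in_progress


if __name__ == "__main__":
    # A push to main versus a pull-request merge ref (sample values).
    for ref in ("refs/heads/main", "refs/pull/1037/merge"):
        print(concurrency_settings("sgl", ref))
```

Under these settings a new push to a pull-request ref replaces the run already in progress for that ref, while runs on `main` are left to finish.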
    @@ -0,0 +1,40 @@
    Multi-turn Rollout Support
    ==========================

    Basic Configuration
    ~~~~~~~~~~~~~~~~~~~

    To enable multi-turn rollout, set the following fields in your rollout configuration:

    .. code-block:: yaml

        actor_rollout_ref:
            rollout:
                multi_turn: True
                name: "sglang_async"

    This configuration activates the sglang_async engine for multi-turn interaction during rollout.

    Custom Tool Configuration
    ~~~~~~~~~~~~~~~~~~~~~~~~~

    For custom environment interaction tools, you can specify your tool configurations in a YAML file.
    To do so, use the following format in your rollout config:

    .. code-block:: yaml

        actor_rollout_ref:
            rollout:
                tool_kwargs:
                    tools_config_file: <path_to_tool_yaml_file>

    This allows integration of customized tool behaviors during actor rollout steps. You may refer to the GSM8KTool_example_configuration_ for guidance.

    GSM8K Multi-turn Training Performance
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    See the training performance of multi-turn rollout on the GSM8K task HERE_.

    .. _HERE: https://wandb.ai/zhaochenyang20/gsm8k_async_rl/runs/1ro1r7om?nw=nwuserzhaochenyang20

    .. _GSM8KTool_example_configuration: ../../examples/sglang_multiturn/config/tool_config/gsm8k_tool_config.yaml
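The two YAML fragments in the documentation diff above can be combined into a single rollout config. A minimal sketch, assuming the config has already been loaded into a plain Python dict; `check_multi_turn_rollout` is a hypothetical helper written for illustration, not a verl function, and the tool config path is a placeholder:

```python
def check_multi_turn_rollout(config: dict) -> list[str]:
    """Return problems with the multi-turn fields named in the doc (empty if none)."""
    rollout = config.get("actor_rollout_ref", {}).get("rollout", {})
    problems = []
    if rollout.get("multi_turn") is not True:
        problems.append("rollout.multi_turn must be True")
    if rollout.get("name") != "sglang_async":
        problems.append('rollout.name must be "sglang_async"')
    return problems


# Both fragments from the doc merged under one `rollout` key.
config = {
    "actor_rollout_ref": {
        "rollout": {
            "multi_turn": True,
            "name": "sglang_async",
            # Placeholder path; the doc leaves this as <path_to_tool_yaml_file>.
            "tool_kwargs": {"tools_config_file": "path/to/tool_config.yaml"},
        }
    }
}

print(check_multi_turn_rollout(config))
```

The check only mirrors the two fields the doc asks for; whether and how verl itself validates them is not shown in this diff.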