Merged
Commits
120 commits
d7a70a5
Add new designed data model for tool and async rollout
SwordFaith Apr 2, 2025
dc578c1
add AsyncRolloutRequest and message-ids convert tools
zyzshishui Apr 4, 2025
0edc910
Add unittest to async sgl engine and related ci
SwordFaith Apr 4, 2025
59df7ab
Fix test case and add verl engine with async
SwordFaith Apr 7, 2025
63caefe
Fix AsyncVerlEngine naming
SwordFaith Apr 7, 2025
2eea830
Dump test async sgl rollout checkpoint
SwordFaith Apr 7, 2025
4199cf8
Fix async rollout with memory saver issue
SwordFaith Apr 8, 2025
ed46763
Tmp dump for pass messages by raw_prompt key in non_tensor_batch
SwordFaith Apr 8, 2025
bc4fe76
Add example gsm8k tool and fix schema issue
SwordFaith Apr 8, 2025
d2522f0
Tmp dump for add prompt to async rollout request and async rollout st…
SwordFaith Apr 9, 2025
e0b3bbb
Wait to add more test for async gen with tools and verify training se…
SwordFaith Apr 10, 2025
4532d33
Add support for tool_calls formating and parse error
SwordFaith Apr 10, 2025
c415ad4
support loss_mask and loading tool from config
zyzshishui Apr 12, 2025
6a91862
fix review
zyzshishui Apr 12, 2025
c3831f2
fix review
zyzshishui Apr 12, 2025
7c063a9
Fix get_tool_call_parser_type bot/eot check logic
SwordFaith Apr 13, 2025
3cbf99b
Convert tool method and fn call to all async
SwordFaith Apr 13, 2025
c730d48
merge and add rst file for multiturn
zyzshishui Apr 13, 2025
983c28e
fixed tool initialization
zyzshishui Apr 13, 2025
2cb290c
Add unittest with sharding manager and rollout, and fix async generat…
SwordFaith Apr 13, 2025
8d18e43
wait to support batch rollout
zyzshishui Apr 13, 2025
a88ddda
add max_turn
zyzshishui Apr 14, 2025
bdfd998
add e2e test for tool calling
zyzshishui Apr 14, 2025
26ade88
fix batch rollout
zyzshishui Apr 14, 2025
c437093
Merge branch 'feat/add_async_sglang_multi_turn_support' into feat/los…
SwordFaith Apr 14, 2025
8e7b00b
Merge pull request #2 from zyzshishui/feat/loss_mask_and_tool_config
SwordFaith Apr 14, 2025
fd3b44c
Dump first runable version
SwordFaith Apr 14, 2025
1ff236c
Fix config missing in dump
SwordFaith Apr 14, 2025
77f0429
Try use mb cluster training
SwordFaith Apr 15, 2025
c0d710d
Fix script pip install bug
SwordFaith Apr 15, 2025
0e627d7
Disable debug data batch keys log
SwordFaith Apr 15, 2025
1a803ea
Refined to correctly run grpo in a stable way
SwordFaith Apr 15, 2025
b215f02
Fix ckpt path and wandb dir
SwordFaith Apr 15, 2025
9c11a1e
Add update torch-memory-saver and wandb
SwordFaith Apr 16, 2025
6062a8c
Add new scripts for verify
SwordFaith Apr 16, 2025
bc6043b
Fix config bugs
SwordFaith Apr 16, 2025
9abf201
Try fix import sgl when using vllm only error
SwordFaith Apr 16, 2025
28d6268
Try increase n=16
SwordFaith Apr 16, 2025
a556981
Add n=64 setting to increasing rate of convergence
SwordFaith Apr 16, 2025
6a479cc
Add temperature 1.0 support
SwordFaith Apr 16, 2025
a29fac3
Update config
SwordFaith Apr 23, 2025
df51df2
Dump debug code
SwordFaith Apr 23, 2025
9b50d00
Add verification for single turn generation
SwordFaith Apr 23, 2025
f20aed7
Fix bug
SwordFaith Apr 23, 2025
67b15e6
Add 3k max resp script
SwordFaith Apr 24, 2025
15ff9c5
Use patch with aligned cli args
SwordFaith Apr 24, 2025
5a372b0
Fix arg
SwordFaith Apr 24, 2025
439d816
Add new version data script
SwordFaith Apr 24, 2025
8ce6689
Try if it's padding issue
SwordFaith Apr 25, 2025
f60ac69
Add pad test script
SwordFaith Apr 25, 2025
679c324
Fix dtype in pad_sequence_to_length issue
SwordFaith Apr 25, 2025
07e25e6
Add new training script
SwordFaith Apr 25, 2025
ddb2337
Add pad print
SwordFaith Apr 25, 2025
25e7814
Add debug log
SwordFaith Apr 25, 2025
6f6f94e
Try refine debug logs
SwordFaith Apr 25, 2025
185cf74
Add n16 train script
SwordFaith Apr 25, 2025
c097cb1
Add loss mask test script
SwordFaith Apr 25, 2025
e839465
Add debug info to pad and unpad
SwordFaith Apr 26, 2025
560e2ca
Add schema update and format, need_tool_kwargs fix
SwordFaith Apr 26, 2025
b939760
try to change to relative path
WANG-GH Apr 26, 2025
0719d1f
fix abs path
WANG-GH Apr 26, 2025
9f17ace
Remove verl engine with async and add modelbest inc as co-org
SwordFaith Apr 26, 2025
6ecb8d4
fix conlficts in RLHF dataset
zhaochenyang20 Apr 27, 2025
80e3c2c
Merge branch 'main' of github.com:SwordFaith/verl
zhaochenyang20 Apr 27, 2025
0729b07
Merge branch 'main' into feat/renewed_multi_turn_pr_branch
zhaochenyang20 Apr 27, 2025
caf1a08
resolve not our code
WANG-GH Apr 27, 2025
a7ffe12
fix format v2
WANG-GH Apr 27, 2025
20f5696
Fix example run issue
SwordFaith Apr 27, 2025
6cfce61
Add environ SGL_DISABLE_TP_MEMORY_INBALANCE_CHECK=True
SwordFaith Apr 27, 2025
6e7017d
fix lint v1
WANG-GH Apr 27, 2025
4ea91e3
fix lint v2
WANG-GH Apr 27, 2025
80be127
Fix dapo ci
SwordFaith Apr 27, 2025
67f218f
Fix sanity check ci
SwordFaith Apr 27, 2025
092b7b9
Add back reinforce plus plus base line
SwordFaith Apr 27, 2025
2066bae
Fix file naming and dir structure issue
SwordFaith Apr 27, 2025
f2e8a6b
Fix license str
SwordFaith Apr 27, 2025
9efb2be
Fix req_list append missing
SwordFaith Apr 27, 2025
a195328
Sync generate sequences with sglang_rollout
SwordFaith Apr 27, 2025
897edf4
stop before enter fsdp_async_manager
WANG-GH Apr 28, 2025
10dff84
Add config validation assertion
SwordFaith Apr 28, 2025
e19fcaa
Refactor async sglang sharding manager
SwordFaith Apr 28, 2025
377bf06
repeat tool_kwargs & upgrade ppo_megatron_trainer
ocss884 Apr 28, 2025
b6fa35b
Fix e2e test too long
SwordFaith Apr 28, 2025
9805be8
refact test v1
WANG-GH Apr 28, 2025
3aebb8c
unfinish
WANG-GH Apr 28, 2025
712c2ef
refact test v3
WANG-GH Apr 28, 2025
a7236b8
refact test v4
WANG-GH Apr 28, 2025
5569eaf
Fix import path error and update ci config
SwordFaith Apr 28, 2025
48ad489
Merge branch 'main' into feat/add_async_sglang_multi_turn_support
tongyx361 Apr 28, 2025
9e66aec
fix: pre-commit
tongyx361 Apr 28, 2025
b078914
fix: license
tongyx361 Apr 28, 2025
508d382
fix: tools_kwargs
tongyx361 Apr 28, 2025
eb4f1cb
fix: .mypy_cache
tongyx361 Apr 28, 2025
5886337
Merge pull request #10 from tongyx361/tyx/fix/ci
tongyx361 Apr 28, 2025
f79d706
update async test
ocss884 Apr 28, 2025
a8eb256
pre-commit
ocss884 Apr 28, 2025
7020582
fix
ocss884 Apr 28, 2025
3468b4c
Fix lint and add mutiturn and sglang_async to e2e ppo trainer
SwordFaith Apr 28, 2025
70b7169
Fix broadcast_pyobj args
SwordFaith Apr 28, 2025
3501c46
fix sgl.yml
ocss884 Apr 28, 2025
828c05b
feat: sgl CI manual trigger
tongyx361 Apr 28, 2025
a8d6142
Fix nproc issue in sgl.yml
SwordFaith Apr 28, 2025
6b8552f
fix sgl.yml
ocss884 Apr 28, 2025
029d397
more
ocss884 Apr 28, 2025
db781c0
fix wandb
eric-haibin-lin Apr 28, 2025
2153056
Fix cpu backend missing bug
SwordFaith Apr 29, 2025
83a735d
Add sharding manager to unit test
SwordFaith Apr 29, 2025
270b7b0
Seperate e2e ppo trainer sglang task
SwordFaith Apr 29, 2025
040237e
Avoid validation and lower test time
SwordFaith Apr 29, 2025
3e21b08
feat: secrets.HF_ENDPOINT
tongyx361 Apr 29, 2025
1105c3e
fix: secrets.HF_ENDPOINT
tongyx361 Apr 29, 2025
766a082
feat: https://hf-mirror.com
tongyx361 Apr 29, 2025
9912290
temp: only test ppo_trainer for sglang
tongyx361 Apr 29, 2025
6ee96e1
Revert "temp: only test ppo_trainer for sglang"
tongyx361 Apr 29, 2025
ef7d527
temp: SGLang label
tongyx361 Apr 29, 2025
52a7ee6
Revert "temp: SGLang label"
tongyx361 Apr 29, 2025
1d4d648
temp: SGLang label
tongyx361 Apr 29, 2025
05413e3
feat: no_proxy for hf-mirror.com
tongyx361 Apr 29, 2025
6f0be26
fix: ppo_micro_batch_size_per_gpu=16
tongyx361 Apr 29, 2025
095ab9b
revert: SGLang label
tongyx361 Apr 29, 2025
3 changes: 2 additions & 1 deletion .github/workflows/dataset.yml
Expand Up @@ -33,7 +33,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
3 changes: 2 additions & 1 deletion .github/workflows/e2e_dapo.yml
Expand Up @@ -36,7 +36,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
3 changes: 2 additions & 1 deletion .github/workflows/e2e_eval_aime24.yml
Expand Up @@ -37,7 +37,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
70 changes: 66 additions & 4 deletions .github/workflows/e2e_ppo_trainer.yml
Expand Up @@ -61,7 +61,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
Expand Down Expand Up @@ -139,7 +140,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: hiyouga/verl:ngc-th2.6.0-cu126-vllm0.8.3-flashinfer0.2.2-cxx11abi0
Expand Down Expand Up @@ -172,7 +174,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: ocss884/verl-sglang:ngc-th2.6.0-cu126-sglang0.4.5.post3
Expand All @@ -193,14 +196,73 @@ jobs:
ray stop --force
ENGINE=sglang bash tests/e2e/ppo_trainer/run_function_reward.sh

  e2e_ppo_trainer_sglang_async:
    runs-on: [L20x8]
    needs: pre_commit_for_ppo
    timeout-minutes: 40 # Increase this timeout value as needed
    env:
      HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
      HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
      NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
      HF_ENDPOINT: "https://hf-mirror.com"
      HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
    container:
      image: ocss884/verl-sglang:ngc-th2.6.0-cu126-sglang0.4.5.post3
      options: --gpus all --shm-size=10g
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
        with:
          fetch-depth: 0
      - name: Install the current repository
        run: |
          pip3 install -e .[test,gpu,sglang] --no-deps
      - name: Prepare gsm8k dataset
        run: |
          ray stop --force
          python3 examples/data_preprocess/gsm8k.py
      - name: Running GSM8K E2E training tests on 8 L20 GPUs with rmpad using function rm and save ckpt with sglang async
        run: |
          ray stop --force
          ENGINE=sglang_async bash tests/e2e/ppo_trainer/run_function_reward.sh

  e2e_ppo_trainer_sglang_async_with_tool:
    runs-on: [L20x8]
    needs: pre_commit_for_ppo
    timeout-minutes: 40 # Increase this timeout value as needed
    env:
      HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
      HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
      NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
      HF_ENDPOINT: "https://hf-mirror.com"
      HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
    container:
      image: ocss884/verl-sglang:ngc-th2.6.0-cu126-sglang0.4.5.post3
      options: --gpus all --shm-size=10g
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
        with:
          fetch-depth: 0
      - name: Install the current repository
        run: |
          pip3 install -e .[test,gpu,sglang] --no-deps
      - name: Prepare gsm8k dataset with tool
        run: |
          ray stop --force
          python3 examples/data_preprocess/gsm8k_multiturn_w_tool.py --local_dir $HOME/data/gsm8k_verl_sgl_multi_turn_preprocessed
      - name: Running GSM8K with tool E2E training tests on 8 L20 GPUs with rmpad using function rm and save ckpt with sglang async
        run: |
          ray stop --force
          bash tests/e2e/run_gsm8k_fsdp_sgl_multiturn_w_tool.sh

e2e_ppo_trainer_sglang_vlm:
runs-on: [L20x8]
needs: pre_commit_for_ppo
timeout-minutes: 40 # Increase this timeout value as needed
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: ocss884/verl-sglang:ngc-th2.6.0-cu126-sglang0.4.5.post3
6 changes: 4 additions & 2 deletions .github/workflows/e2e_ppo_trainer_megatron.yml
Expand Up @@ -44,7 +44,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
Expand Down Expand Up @@ -82,7 +83,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
3 changes: 2 additions & 1 deletion .github/workflows/e2e_prime.yml
Expand Up @@ -36,7 +36,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
3 changes: 2 additions & 1 deletion .github/workflows/e2e_sft.yml
Expand Up @@ -42,7 +42,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
3 changes: 2 additions & 1 deletion .github/workflows/model.yml
Expand Up @@ -29,7 +29,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
3 changes: 2 additions & 1 deletion .github/workflows/ray_test.yml
Expand Up @@ -32,7 +32,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
3 changes: 2 additions & 1 deletion .github/workflows/sandbox.yml
Expand Up @@ -31,7 +31,8 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:ngc-cu124-vllm0.8.3-sglang0.4.5-mcore0.12.0-te2.2
65 changes: 65 additions & 0 deletions .github/workflows/sgl.yml
@@ -0,0 +1,65 @@
name: sgl

on:
  workflow_dispatch: # Manual
  # Trigger the workflow on push or pull request,
  # but only for the main branch
  push:
    branches:
      - main
      - v0.2.x
    paths:
      - "**/*.py"
      - .github/workflows/vllm.yml
  pull_request:
    branches:
      - main
      - v0.2.x
    paths:
      - "**/*.py"
      - "verl/trainer/config/*.yaml"
      - .github/workflows/sgl.yml

# Cancel jobs on the same ref if a new one is triggered
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}

# Declare permissions just read content.
permissions:
  contents: read

jobs:
  sgl:
    runs-on: [self-hosted, l20-0]
    timeout-minutes: 20 # Increase this timeout value as needed
    env:
      HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
      HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
      NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
      HF_ENDPOINT: "https://hf-mirror.com"
      HF_HUB_ENABLE_HF_TRANSFER: 1
      SGL_DISABLE_TP_MEMORY_INBALANCE_CHECK: "True"
    container:
      image: ocss884/verl-sglang:ngc-th2.6.0-cu126-sglang0.4.5.post3
      options: --gpus all --shm-size=10g
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
        with:
          fetch-depth: 0
      - name: Install the current repository
        run: |
          pip3 install hf_transfer
          pip3 install -e .[test,gpu,sglang] --no-deps
      - name: Test the latest SGLang
        run: |
          cd tests/rollout
          torchrun --nnodes=1 --nproc_per_node=4 $(which pytest) -s test_sglang_spmd.py
      - name: Test the latest SGLang async
        run: |
          cd tests/rollout
          torchrun --nnodes=1 --nproc_per_node=2 $(which pytest) -s test_sglang_async_spmd.py
      - name: Test the latest SGLang Rollout async with tool
        run: |
          cd tests/rollout
          torchrun --nnodes=1 --nproc_per_node=2 $(which pytest) -s test_sglang_async_rollout_w_tools.py
4 changes: 2 additions & 2 deletions .github/workflows/vllm.yml
Expand Up @@ -46,9 +46,9 @@ jobs:
env:
HTTP_PROXY: ${{ secrets.PROXY_HTTP }}
HTTPS_PROXY: ${{ secrets.PROXY_HTTPS }}
NO_PROXY: "localhost,127.0.0.1"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
NO_PROXY: "localhost,127.0.0.1,hf-mirror.com"
HF_ENDPOINT: "https://hf-mirror.com"
HF_HUB_ENABLE_HF_TRANSFER: "0" # This is more stable
container:
image: whatcanyousee/verl:vemlp-th2.4.0-cu124-vllm0.6.3-ray2.10-te2.0-megatron0.11.0-v0.0.6
options: --gpus all --shm-size=10g
40 changes: 40 additions & 0 deletions docs/sglang_multiturn/multiturn.rst
@@ -0,0 +1,40 @@
Multi-turn Rollout Support
==========================

Basic Configuration
~~~~~~~~~~~~~~~~~~~

To enable multi-turn rollout, make sure to configure the following fields in your rollout configuration:

.. code-block:: yaml

    actor_rollout_ref:
        rollout:
            multi_turn: True
            name: "sglang_async"

This configuration activates the sglang_async engine for multi-turn interaction during rollout.
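
As a quick sanity check, the two fields can be verified before launching a run. The sketch below is a minimal illustration and assumes the rollout configuration has already been loaded into a plain nested ``dict``; the helper name ``multi_turn_enabled`` is hypothetical and is not part of verl.

```python
# Hypothetical helper (not part of verl): checks the two fields described above.
def multi_turn_enabled(config: dict) -> bool:
    """Return True only if multi-turn rollout is switched on with sglang_async."""
    rollout = config.get("actor_rollout_ref", {}).get("rollout", {})
    return rollout.get("multi_turn") is True and rollout.get("name") == "sglang_async"


# Same structure as the YAML snippet, written as a Python dict.
config = {
    "actor_rollout_ref": {
        "rollout": {"multi_turn": True, "name": "sglang_async"},
    }
}
print(multi_turn_enabled(config))
```

A check of this kind fails fast when only one of the two fields has been set.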

Custom Tool Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~

For custom environment interaction tools, you can specify your tool configurations in a YAML file.
To do so, use the following format in your rollout config:

.. code-block:: yaml

    actor_rollout_ref:
        rollout:
            tool_kwargs:
                tools_config_file: <path_to_tool_yaml_file>

This allows integration of customized tool behaviors during actor rollout steps. You may refer to the GSM8KTool_example_configuration_ for guidance.
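
The lookup of the tool configuration path can be sketched as follows. This is an illustration only: it assumes the rollout configuration is available as a nested ``dict``, the helper name ``tools_config_path`` is hypothetical (not a verl API), and the path shown is a placeholder.

```python
from typing import Optional


# Hypothetical helper (not part of verl): reads the value configured under
# actor_rollout_ref.rollout.tool_kwargs.tools_config_file, if present.
def tools_config_path(config: dict) -> Optional[str]:
    rollout = config.get("actor_rollout_ref", {}).get("rollout", {})
    tool_kwargs = rollout.get("tool_kwargs") or {}
    return tool_kwargs.get("tools_config_file")


config = {
    "actor_rollout_ref": {
        "rollout": {
            # Placeholder path; point this at your own tool YAML file.
            "tool_kwargs": {"tools_config_file": "path/to/tool_config.yaml"},
        }
    }
}
print(tools_config_path(config))
```

Returning ``None`` when the key is absent lets the caller decide whether running without tools is acceptable.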

GSM8K Multi-turn Training Performance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See the training performance of multi-turn rollout on the GSM8K task HERE_.

.. _HERE: https://wandb.ai/zhaochenyang20/gsm8k_async_rl/runs/1ro1r7om?nw=nwuserzhaochenyang20

.. _GSM8KTool_example_configuration: ../../examples/sglang_multiturn/config/tool_config/gsm8k_tool_config.yaml