Conversation

@dazhuangzhuang1024
Contributor

@dazhuangzhuang1024 dazhuangzhuang1024 commented Sep 17, 2025

Overview

vLLM and SGLang are both well-known inference engines, and NPUs are also widely used. We hope that AReaL supports vLLM and NPU.

The main features of this PR include:

  • Provide an NPU runtime image containing vllm and vllm-ascend
  • Add a new platform: NPU
  • Support vLLM via the vllm_worker_extension mechanism, decoupling AReaL from vLLM internals
  • Support an interruptible process pool to correctly handle reward-computation timeouts

The user instructions are placed in docs/lite/gms8k_ppo_vllm_npu.md and docs/lite/boba_ppo_vllm_npu.md.
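The interruptible process-pool idea can be sketched roughly as follows. This is a minimal illustration of the technique, not AReaL's actual implementation; all names (`compute_reward`, `compute_rewards_with_timeout`) are hypothetical. The point is that a hung reward computation gets a fallback value and its worker process is killed, instead of blocking the training loop:

```python
import multiprocessing as mp
import time


def compute_reward(sample):
    # Stand-in reward function; a "slow" sample simulates a hung computation.
    if sample == "slow":
        time.sleep(30)
    return float(len(sample))


def compute_rewards_with_timeout(samples, timeout=1.0, fallback=0.0):
    """Compute rewards in worker processes; on timeout, substitute a
    fallback reward and terminate stuck workers instead of hanging."""
    pool = mp.Pool(processes=2)
    pending = [pool.apply_async(compute_reward, (s,)) for s in samples]
    rewards = []
    for result in pending:
        try:
            rewards.append(result.get(timeout=timeout))
        except mp.TimeoutError:
            rewards.append(fallback)
    pool.terminate()  # interrupt any worker still stuck on a slow sample
    pool.join()
    return rewards
```

A plain `concurrent.futures.ProcessPoolExecutor` cannot cancel an already-running task, which is why a terminate-capable pool (or equivalent) is needed for this use case.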

Reward Curve

  • Training DeepSeek-R1-Distill-Qwen-1.5B on the gsm8k dataset with batch_size 256
  [reward curve image]
  • Training DeepSeek-R1-Distill-Qwen-1.5B on the boba dataset with batch_size 512
  [reward curve image]

RID_CACHE_SIZE = 128


class RemotevLLMEngine(InferenceEngine):


Please consider adding comments to the major classes/APIs this feature introduces, for ease of understanding and maintenance.

Contributor


Added.

@zhshgmail

Please consider adding unit test cases to cover the majority of the modified/added logic.

@dazhuangzhuang1024 dazhuangzhuang1024 force-pushed the feat/vllm-and-npu branch 2 times, most recently from afa387d to 56bad8b Compare September 19, 2025 07:46
@dazhuangzhuang1024
Copy link
Contributor Author

For the convenience of reviewers, we have split this large PR so that only the NPU- and vLLM-related features are retained; the training-related changes will be submitted separately.

Collaborator

@nuzant nuzant left a comment


Thanks for your great contribution! In general this PR looks good, but there are still some small issues, such as code cleanup and file structure, to be fixed.

Moreover, could you format the code with pre-commit and pass the formatting check? See https://inclusionai.github.io/AReaL/contrib.html#code-formatting for details.

This is still a large PR, could you help me double check this if you have time? @garrett4wade

@@ -0,0 +1,85 @@
# Running PPO with vLLM on BOBA dataset on NPU
Collaborator

@nuzant nuzant Sep 19, 2025


Same as above.

Collaborator


One page for NPU is sufficient.

Contributor


Modified.

return sglang_addrs


def wait_vllm_server_addrs(
Collaborator


Can it be merged with wait_for_sglang_addrs?

Contributor


OK, the merge has been done.
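A backend-agnostic wait helper along the lines discussed might look roughly like the sketch below. This is a hypothetical illustration, not the merged code; the function name `wait_server_addrs` and the injected `fetch_addrs` callable are assumptions. The idea is that SGLang and vLLM differ only in where the addresses are registered, so the polling loop can be shared:

```python
import time


def wait_server_addrs(fetch_addrs, n_servers, timeout=300.0, interval=1.0):
    """Poll fetch_addrs() until n_servers addresses are registered.
    Backend-agnostic: works for SGLang and vLLM alike, since only the
    address source (the fetch_addrs callable) differs per backend."""
    deadline = time.monotonic() + timeout
    addrs = []
    while time.monotonic() < deadline:
        addrs = fetch_addrs()
        if len(addrs) >= n_servers:
            return addrs[:n_servers]
        time.sleep(interval)
    raise TimeoutError(f"only {len(addrs)}/{n_servers} servers registered")
```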

message += f"TP rank: {rank} failed. reason: {msg}\n"
return to_json_response(success, message)

@router.post("/areal_update_weights")
Collaborator


If we rename these endpoints with the same name as sglang, does it mean that we can share the HTTP client code in SGLangRemoteEngine and vLLMRemoteEngine?
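One way the suggested sharing could work is sketched below. This is a hypothetical illustration, not AReaL's actual classes; `RemoteEngineClient`, its endpoint names, and the injected `transport` callable are all assumptions. Once both servers expose identically named endpoints, one client class can serve both backends, parameterized only by base URL:

```python
class RemoteEngineClient:
    """Sketch of a backend-agnostic HTTP client: if SGLang and vLLM
    servers expose identically named endpoints, the request-building
    logic need not be duplicated per backend."""

    ENDPOINTS = {
        "update_weights": "/update_weights",
        "generate": "/generate",
    }

    def __init__(self, base_url, transport):
        self.base_url = base_url.rstrip("/")
        self.transport = transport  # e.g. a thin wrapper over an HTTP POST

    def call(self, name, payload):
        # Same code path regardless of which engine sits behind base_url.
        return self.transport(self.base_url + self.ENDPOINTS[name], payload)
```

Injecting the transport keeps the class testable without a live server.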

@garrett4wade
Collaborator

@dazhuangzhuang1024 Hi, thank you for the great contribution. Could you please format the files with pre-commit? And please run the unit tests if possible.

Formatting:

pip install pre-commit
pre-commit install
git commit -a -m 'my commit'

Testing:

pip install -e .
pytest -sv areal/tests/

@garrett4wade garrett4wade mentioned this pull request Sep 20, 2025
@shun001 shun001 force-pushed the feat/vllm-and-npu branch 2 times, most recently from b72b9bc to 2c97ea2 Compare September 22, 2025 14:52
@dazhuangzhuang1024
Contributor Author

dazhuangzhuang1024 commented Sep 22, 2025

> @dazhuangzhuang1024 Hi, thank you for the great contribution. Could you please format the files with pre-commit? And please run the unit tests if possible.
>
> Formatting:
>
> pip install pre-commit
> pre-commit install
> git commit -a -m 'my commit'
>
> Testing:
>
> pip install -e .
> pytest -sv areal/tests/

OK, thanks for the reminder.

@shun001
Contributor

shun001 commented Sep 22, 2025 via email

@garrett4wade
Collaborator

The formatting still seems incorrect. Try formatting all files with pre-commit run --all-files.

Collaborator

@garrett4wade garrett4wade left a comment


LGTM. cc @nuzant

Ronbogo and others added 17 commits September 26, 2025 14:29
Co-authored-by: shun001 <[email protected]>
Co-authored-by: flemingpau <[email protected]>
Co-authored-by: Moocharr <[email protected]>
Co-authored-by: HUZZZW <[email protected]>
Co-authored-by: ChengQianqian <[email protected]>
Co-authored-by: zx506 <[email protected]>
Co-authored-by: Shengzhou Lyu <[email protected]>
Co-authored-by: flemingpau <[email protected]>
Co-authored-by: casparcwang <[email protected]>
Co-authored-by: dazhuangzhuang1024 <[email protected]>
Co-authored-by: HShan886 <[email protected]>
1. Add comments to RemotevLLMEngine
2. Rename boba_ppo.py to boba_grpo.py and move it to examples/math
3. Remove redundant comments
2. Fix `if is_npu_available` formatting error
2. Update WeightUpdateMeta from_fsdp_nccl to from_fsdp_xccl
1. Add enable-metrics parameter to vLLMConfig
2. Use current_platform to eliminate the if-else branches introduced by is_npu_available
1. The `get_input_ids_fn` function should have an `enable_thinking` parameter
2. Fix NPU README.md
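The `current_platform` refactor mentioned in the commit messages can be sketched roughly as follows. This is an illustrative pattern, not AReaL's actual classes; the names `Platform`, `NPUPlatform`, and `get_current_platform` are assumptions. Device-specific knobs (device string, collective-communication backend) live on a platform object, so call sites no longer need scattered `if is_npu_available()` branches, and helpers like `from_fsdp_xccl` can pick nccl/hccl from the platform:

```python
class Platform:
    """CUDA default; subclasses override device-specific attributes so
    call sites avoid per-device if-else branches."""
    device_type = "cuda"
    dist_backend = "nccl"

    def device_str(self, index):
        # e.g. "cuda:0" or "npu:0", chosen by the active platform
        return f"{self.device_type}:{index}"


class NPUPlatform(Platform):
    device_type = "npu"
    dist_backend = "hccl"


def get_current_platform(npu_available=False):
    # Real code would probe the runtime (e.g. for torch_npu);
    # here a simple flag stands in for that detection.
    return NPUPlatform() if npu_available else Platform()
```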
Collaborator

@nuzant nuzant left a comment


LGTM as well.

@nuzant nuzant merged commit a327a9f into inclusionAI:main Sep 26, 2025
1 of 4 checks passed