Conversation

@dazhuangzhuang1024
Contributor

@dazhuangzhuang1024 dazhuangzhuang1024 commented Sep 17, 2025

Overview

vLLM and SGLang are both well-known inference engines, and NPUs are also widely used. We hope that AReaL supports vLLM and NPU.

The main features of this PR include:

  • Provide an NPU runtime image containing vllm and vllm-ascend
  • Add a new platform: NPU
  • Support vLLM via the vllm_worker_extension mechanism, decoupling AReaL from vLLM internals
  • Support an interruptible process pool to correctly handle reward-computation timeouts

The user instructions are placed in docs/lite/gms8k_ppo_vllm_npu.md and docs/lite/boba_ppo_vllm_npu.md.
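The interruptible process-pool idea can be sketched roughly as follows. This is a minimal illustration of the technique, not AReaL's actual implementation; all names (`compute_reward`, `compute_rewards_with_timeout`) are hypothetical. The point is that a hung reward computation gets a fallback value and its worker process is killed, instead of blocking the training loop:

```python
import multiprocessing as mp
import time


def compute_reward(sample):
    # Stand-in reward function; a "slow" sample simulates a hung computation.
    if sample == "slow":
        time.sleep(30)
    return float(len(sample))


def compute_rewards_with_timeout(samples, timeout=1.0, fallback=0.0):
    """Compute rewards in worker processes; on timeout, substitute a
    fallback reward and terminate stuck workers instead of hanging."""
    pool = mp.Pool(processes=2)
    pending = [pool.apply_async(compute_reward, (s,)) for s in samples]
    rewards = []
    for result in pending:
        try:
            rewards.append(result.get(timeout=timeout))
        except mp.TimeoutError:
            rewards.append(fallback)
    pool.terminate()  # interrupt any worker still stuck on a slow sample
    pool.join()
    return rewards
```

A plain `concurrent.futures.ProcessPoolExecutor` cannot cancel an already-running task, which is why a terminate-capable pool (or equivalent) is needed for this use case.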

Reward Curve

  • Training DeepSeek-R1-Distill-Qwen-1.5B on the gsm8k dataset with batch_size 256
  [reward curve image]
  • Training DeepSeek-R1-Distill-Qwen-1.5B on the boba dataset with batch_size 512
  [reward curve image]

RID_CACHE_SIZE = 128


class RemotevLLMEngine(InferenceEngine):


Please consider adding comments to the major classes/APIs this feature introduces, for ease of understanding and maintenance.

Contributor


Added.

@zhshgmail

Please consider adding unit test cases to cover the majority of the modified/added logic.

@dazhuangzhuang1024 dazhuangzhuang1024 force-pushed the feat/vllm-and-npu branch 2 times, most recently from afa387d to 56bad8b Compare September 19, 2025 07:46
@dazhuangzhuang1024
Copy link
Contributor Author

For the convenience of reviewers, we have split this large PR so that only the NPU- and vLLM-related features are retained; the training-related changes will be submitted separately.

Collaborator

@nuzant nuzant left a comment


Thanks for your great contribution! In general this PR looks good, but there are still some small issues, such as code cleanup and file structure, to be fixed.

Moreover, could you format the code with pre-commit and pass the formatting check? See https://inclusionai.github.io/AReaL/contrib.html#code-formatting for details.

This is still a large PR, could you help me double check this if you have time? @garrett4wade

@@ -0,0 +1,85 @@
# Running PPO with vLLM on BOBA dataset on NPU
Collaborator

@nuzant nuzant Sep 19, 2025


Same as above.

Collaborator


One page for NPU is sufficient.

Contributor


Modified.

return sglang_addrs


def wait_vllm_server_addrs(
Collaborator


Can it be merged with wait_for_sglang_addrs?

Contributor


OK, the merge has been done.
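A backend-agnostic wait helper along the lines discussed might look roughly like the sketch below. This is a hypothetical illustration, not the merged code; the function name `wait_server_addrs` and the injected `fetch_addrs` callable are assumptions. The idea is that SGLang and vLLM differ only in where the addresses are registered, so the polling loop can be shared:

```python
import time


def wait_server_addrs(fetch_addrs, n_servers, timeout=300.0, interval=1.0):
    """Poll fetch_addrs() until n_servers addresses are registered.
    Backend-agnostic: works for SGLang and vLLM alike, since only the
    address source (the fetch_addrs callable) differs per backend."""
    deadline = time.monotonic() + timeout
    addrs = []
    while time.monotonic() < deadline:
        addrs = fetch_addrs()
        if len(addrs) >= n_servers:
            return addrs[:n_servers]
        time.sleep(interval)
    raise TimeoutError(f"only {len(addrs)}/{n_servers} servers registered")
```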

message += f"TP rank: {rank} failed. reason: {msg}\n"
return to_json_response(success, message)

@router.post("/areal_update_weights")
Collaborator


If we rename these endpoints with the same name as sglang, does it mean that we can share the HTTP client code in SGLangRemoteEngine and vLLMRemoteEngine?
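One way the suggested sharing could work is sketched below. This is a hypothetical illustration, not AReaL's actual classes; `RemoteEngineClient`, its endpoint names, and the injected `transport` callable are all assumptions. Once both servers expose identically named endpoints, one client class can serve both backends, parameterized only by base URL:

```python
class RemoteEngineClient:
    """Sketch of a backend-agnostic HTTP client: if SGLang and vLLM
    servers expose identically named endpoints, the request-building
    logic need not be duplicated per backend."""

    ENDPOINTS = {
        "update_weights": "/update_weights",
        "generate": "/generate",
    }

    def __init__(self, base_url, transport):
        self.base_url = base_url.rstrip("/")
        self.transport = transport  # e.g. a thin wrapper over an HTTP POST

    def call(self, name, payload):
        # Same code path regardless of which engine sits behind base_url.
        return self.transport(self.base_url + self.ENDPOINTS[name], payload)
```

Injecting the transport keeps the class testable without a live server.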

@garrett4wade
Collaborator

@dazhuangzhuang1024 Hi, thank you for the great contribution. Could you please format the files with pre-commit? And please run the unit tests if possible.

Formatting:

pip install pre-commit
pre-commit install
git commit -a -m 'my commit'

Testing:

pip install -e .
pytest -sv areal/tests/

@garrett4wade garrett4wade mentioned this pull request Sep 20, 2025
@shun001 shun001 force-pushed the feat/vllm-and-npu branch 2 times, most recently from b72b9bc to 2c97ea2 Compare September 22, 2025 14:52
@dazhuangzhuang1024
Contributor Author

dazhuangzhuang1024 commented Sep 22, 2025

> @dazhuangzhuang1024 Hi, thank you for the great contribution. Could you please format the files with pre-commit? And please run the unit tests if possible.
>
> Formatting:
>
> pip install pre-commit
> pre-commit install
> git commit -a -m 'my commit'
>
> Testing:
>
> pip install -e .
> pytest -sv areal/tests/

OK, thanks for the reminder.

@shun001
Contributor

shun001 commented Sep 22, 2025 via email

@garrett4wade
Collaborator

The formatting still seems incorrect. Try formatting all files with pre-commit run --all-files.

Collaborator

@garrett4wade garrett4wade left a comment


LGTM. cc @nuzant

Ronbogo and others added 17 commits September 26, 2025 14:29
Co-authored-by: shun001 <[email protected]>
Co-authored-by: flemingpau <[email protected]>
Co-authored-by: Moocharr <[email protected]>
Co-authored-by: HUZZZW <[email protected]>
Co-authored-by: ChengQianqian <[email protected]>
Co-authored-by: zx506 <[email protected]>
Co-authored-by: Shengzhou Lyu <[email protected]>
Co-authored-by: flemingpau <[email protected]>
Co-authored-by: casparcwang <[email protected]>
Co-authored-by: dazhuangzhuang1024 <[email protected]>
Co-authored-by: HShan886 <[email protected]>
1. Add comments to RemotevLLMEngine
2. Rename boba_ppo.py to boba_grpo.py and move it to examples/math
3. Remove redundant comments
2. Fix `if is_npu_available` formatting error
2. Update WeightUpdateMeta from_fsdp_nccl to from_fsdp_xccl
1. Add enable-metrics parameter to vLLMConfig
2. Use current_platform to eliminate the if-else branches introduced by is_npu_available
1. The `get_input_ids_fn` function should have an `enable_thinking` parameter
2. Fix NPU README.md
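The `current_platform` refactor mentioned in the commit messages can be sketched roughly as follows. This is an illustrative pattern, not AReaL's actual classes; the names `Platform`, `NPUPlatform`, and `get_current_platform` are assumptions. Device-specific knobs (device string, collective-communication backend) live on a platform object, so call sites no longer need scattered `if is_npu_available()` branches, and helpers like `from_fsdp_xccl` can pick nccl/hccl from the platform:

```python
class Platform:
    """CUDA default; subclasses override device-specific attributes so
    call sites avoid per-device if-else branches."""
    device_type = "cuda"
    dist_backend = "nccl"

    def device_str(self, index):
        # e.g. "cuda:0" or "npu:0", chosen by the active platform
        return f"{self.device_type}:{index}"


class NPUPlatform(Platform):
    device_type = "npu"
    dist_backend = "hccl"


def get_current_platform(npu_available=False):
    # Real code would probe the runtime (e.g. for torch_npu);
    # here a simple flag stands in for that detection.
    return NPUPlatform() if npu_available else Platform()
```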
Collaborator

@nuzant nuzant left a comment


LGTM as well.

@nuzant nuzant merged commit a327a9f into inclusionAI:main Sep 26, 2025
1 of 4 checks passed