feat: support NPU and vLLM #351
Conversation
```python
RID_CACHE_SIZE = 128


class RemotevLLMEngine(InferenceEngine):
```
Please consider adding comments to the major classes/APIs this feature introduces, for ease of understanding and maintenance.
added
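For illustration, a class comment of the requested kind might look like the sketch below; the base-class stub and the listed responsibilities are assumptions inferred from this thread, not the PR's actual docstring.

```python
class InferenceEngine:  # stand-in stub for AReaL's actual base class
    pass


class RemotevLLMEngine(InferenceEngine):
    """Inference engine backed by a remote vLLM HTTP server.

    Responsibilities (inferred from this PR's diff, not authoritative):
      * submit generation requests to the remote vLLM server
      * track recent request ids in a bounded cache (RID_CACHE_SIZE)
      * push updated model weights via the weight-update endpoint
    """
```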
Please consider adding unit test cases to cover the majority of the modified/added logic.
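As a hedged illustration of such tests, the sketch below exercises a stub of the `/areal_update_weights` endpoint this PR adds; only the endpoint name comes from the diff, while the handler body and response shape are assumptions.

```python
from fastapi import FastAPI
from fastapi.testclient import TestClient

app = FastAPI()


@app.post("/areal_update_weights")
def areal_update_weights():
    # Stub standing in for the real handler.
    return {"success": True, "message": ""}


def test_update_weights_endpoint_returns_success():
    client = TestClient(app)
    resp = client.post("/areal_update_weights")
    assert resp.status_code == 200
    assert resp.json()["success"] is True
```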
For the convenience of reviewers, we have split this large PR so that only the NPU- and vLLM-related features are retained; the training-related changes will be submitted separately.
Thanks for your great contribution! In general this PR looks good, but there are still some small issues to fix, such as code cleanup and file structure.
Moreover, could you format the code with pre-commit and pass the formatting check? See https://inclusionai.github.io/AReaL/contrib.html#code-formatting for details.
This is still a large PR; could you help me double-check it if you have time? @garrett4wade
docs/lite/boba_ppo_vllm_npu.md (outdated)

```diff
@@ -0,0 +1,85 @@
+# Running PPO with vLLM on BOBA dataset on NPU
```
Same as above.
One page for NPU is sufficient.
modified
areal/utils/launcher.py (outdated)

```python
    return sglang_addrs


def wait_vllm_server_addrs(
```
Can it be merged with wait_for_sglang_addrs?
OK, the merge has been done.
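A minimal sketch of what the merged helper could look like, assuming both backends expose an HTTP health check; the function name, health path, and timeout handling are illustrative, not AReaL's actual implementation.

```python
import time

import requests


def wait_for_server_addrs(addrs, health_path="/health", timeout=300.0):
    """Block until every server (SGLang or vLLM alike) answers its health
    endpoint, or raise TimeoutError with the addresses still pending."""
    deadline = time.monotonic() + timeout
    pending = set(addrs)
    while pending:
        if time.monotonic() > deadline:
            raise TimeoutError(f"servers not ready: {sorted(pending)}")
        for addr in list(pending):
            try:
                r = requests.get(f"http://{addr}{health_path}", timeout=5)
                if r.status_code == 200:
                    pending.discard(addr)
            except requests.RequestException:
                pass  # server not up yet; retry on the next sweep
        if pending:
            time.sleep(1.0)
    return list(addrs)
```

Parameterizing one polling loop this way avoids maintaining two near-identical functions per backend.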
```python
        message += f"TP rank: {rank} failed. reason: {msg}\n"
    return to_json_response(success, message)


@router.post("/areal_update_weights")
```
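For readers, a hedged reconstruction of the handler around the two visible lines: it appears to aggregate per-tensor-parallel-rank results into one JSON response. `collect_tp_results` and both helpers below are hypothetical stand-ins, not the PR's code.

```python
from fastapi import APIRouter

router = APIRouter()


def collect_tp_results():
    # Hypothetical: yields (ok, error_message) for each TP rank.
    return [(True, ""), (True, "")]


def to_json_response(success: bool, message: str) -> dict:
    # Stub matching the call in the diff; the real helper presumably
    # builds a richer response object.
    return {"success": success, "message": message}


@router.post("/areal_update_weights")
def areal_update_weights():
    success, message = True, ""
    for rank, (ok, msg) in enumerate(collect_tp_results()):
        if not ok:
            success = False
            message += f"TP rank: {rank} failed. reason: {msg}\n"
    return to_json_response(success, message)
```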
If we rename these endpoints to use the same names as sglang, does that mean we can share the HTTP client code between SGLangRemoteEngine and vLLMRemoteEngine?
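A sketch of the idea behind this question: once both backends expose identically named endpoints, the HTTP plumbing can live in one base class. All class names below are illustrative assumptions; only the endpoint name comes from this PR.

```python
import requests


class RemoteHTTPEngine:
    """Shared HTTP plumbing; backends differ only in their base URL."""

    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")

    def _post(self, endpoint: str, payload: dict) -> dict:
        resp = requests.post(f"{self.base_url}/{endpoint}", json=payload)
        resp.raise_for_status()
        return resp.json()

    def update_weights(self, meta: dict) -> dict:
        # Works for both backends once the endpoint name is shared.
        return self._post("areal_update_weights", meta)


class SGLangRemoteEngine(RemoteHTTPEngine):
    pass  # override only where backend behavior truly differs


class VLLMRemoteEngine(RemoteHTTPEngine):
    pass
```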
@dazhuangzhuang1024 Hi, thank you for the great contribution. Could you please format the files with pre-commit? And please run the unit tests if possible.

Formatting:

```bash
pip install pre-commit
pre-commit install
git commit -a -m 'my commit'
```

Testing:

```bash
pip install -e .
pytest -sv areal/tests/
```
OK, thanks for reminding.
The formatting still seems to be incorrect. Try formatting all files with …
LGTM. cc @nuzant
Co-authored-by: shun001 <[email protected]>
Co-authored-by: flemingpau <[email protected]>
Co-authored-by: Moocharr <[email protected]>
Co-authored-by: HUZZZW <[email protected]>
Co-authored-by: ChengQianqian <[email protected]>
Co-authored-by: zx506 <[email protected]>
Co-authored-by: Shengzhou Lyu <[email protected]>
Co-authored-by: flemingpau <[email protected]>
Co-authored-by: casparcwang <[email protected]>
Co-authored-by: dazhuangzhuang1024 <[email protected]>
Co-authored-by: HShan886 <[email protected]>
1. Add comments to RemotevLLMEngine. 2. Rename boba_ppo.py to boba_grpo.py and move it to examples/math. 3. Remove redundant comments.
…ts from the latest RemoteSGLangEngine
2. Fix `if is_npu_available` formatting error.
2. Update WeightUpdateMeta from_fsdp_nccl to from_fsdp_xccl.
1. Add enable-metrics parameter to vLLMConfig. 2. Use current_platform to eliminate the if-else branches introduced by is_npu_available (sketched below).
1. The `get_input_ids_fn` function should have an `enable_thinking` parameter. 2. Fix NPU README.md.
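The current_platform commit above suggests a platform-object pattern (vLLM itself ships `vllm.platforms.current_platform`). A sketch of the idea, with illustrative names and an assumed torch_npu-based detection:

```python
import importlib.util


class CudaPlatform:
    device_type = "cuda"
    communication_backend = "nccl"


class NpuPlatform:
    device_type = "npu"
    communication_backend = "hccl"  # Ascend NPU collective backend


def _detect_platform():
    # Assumption: NPU availability is signalled by the torch_npu package.
    if importlib.util.find_spec("torch_npu") is not None:
        return NpuPlatform()
    return CudaPlatform()


current_platform = _detect_platform()

# Call sites then read e.g. `current_platform.communication_backend`
# instead of branching on is_npu_available() everywhere.
```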
LGTM as well.
Overview
vLLM and SGLang are both well-known inference engines, and NPUs are also widely used. We hope that AReaL can support vLLM and NPU.
The main features of this PR include:
* vLLM support through a new RemotevLLMEngine inference engine and weight-update endpoints.
* NPU support, including xccl-based weight updates (WeightUpdateMeta.from_fsdp_xccl) and platform-aware device handling via current_platform.

The user instructions are placed in docs/lite/gms8k_ppo_vllm_npu.md and docs/lite/boba_ppo_vllm_npu.md.

Reward Curve