- 
                Notifications
    You must be signed in to change notification settings 
- Fork 210
feat: support NPU and vLLM #351
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support NPU and vLLM #351
Conversation
cf8ee90    to
    b105ded      
    Compare
  
    | RID_CACHE_SIZE = 128 | ||
|  | ||
|  | ||
| class RemotevLLMEngine(InferenceEngine): | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider adding comments to major classes/API this feature introduced for the convenience of understanding and maintenance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
| Please consider adding unit test cases to cover the majority modified/added logics. | 
afa387d    to
    56bad8b      
    Compare
  
    | For the convenience of reviewers, we split this large PR so that only NPU and vLLM related features are retained. And the training related ones will be submitted separately. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your great contribution! In general this PR looks good, but there are still some small issues such as code cleaning and file structures to be fixed.
Moreover, could you format the code with pre-commit and pass the formatting check? See https://inclusionai.github.io/AReaL/contrib.html#code-formatting for details.
This is still a large PR, could you help me double check this if you have time? @garrett4wade
        
          
                docs/lite/boba_ppo_vllm_npu.md
              
                Outdated
          
        
      | @@ -0,0 +1,85 @@ | |||
| # Running PPO with vLLM on BOBA dataset on NPU | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One page for NPU is sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified
ce27e08    to
    1713c4e      
    Compare
  
            
          
                areal/utils/launcher.py
              
                Outdated
          
        
      | return sglang_addrs | ||
|  | ||
|  | ||
| def wait_vllm_server_addrs( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can it be merged with wait_for_sglang_addrs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, the merge has been done.
| message += f"TP rank: {rank} failed. reason: {msg}\n" | ||
| return to_json_response(success, message) | ||
|  | ||
| @router.post("/areal_update_weights") | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we rename these endpoints with the same name as sglang, does it mean that we can share the HTTP client code in SGLangRemoteEngine and vLLMRemoteEngine?
| @dazhuangzhuang1024 Hi, thank you for the great contribution. Could you please format the files with pre-commit? And please run the unit tests if possible. Formatting: pip install pre-commit
pre-commit install
git commit -a -m 'my commit'Testing: pip install -e .
pytest -sv areal/tests/ | 
3f147e7    to
    ab3d287      
    Compare
  
    b72b9bc    to
    2c97ea2      
    Compare
  
    | 
 OK, thanks for reminding. | 
| 这是来自QQ邮箱的自动回复邮件。您的来信已收到,我将及时处理。本邮件自动回复。                                                            ——lzs | 
12b526a    to
    7a28d1d      
    Compare
  
    c722438    to
    f2ac7bd      
    Compare
  
    | The formatting seems to be still incorrect. Try formatting all files with  | 
faba3e5    to
    3f0f486      
    Compare
  
    ea658ae    to
    b870064      
    Compare
  
    a8ef532    to
    9ffa725      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. cc @nuzant
Co-authored-by: shun001 <[email protected]> Co-authored-by: flemingpau <[email protected]> Co-authored-by: Moocharr <[email protected]> Co-authored-by: HUZZZW <[email protected]> Co-authored-by: ChengQianqian <[email protected]> Co-authored-by: zx506 <[email protected]> Co-authored-by: Shengzhou Lyu <[email protected]> Co-authored-by: flemingpau <[email protected]> Co-authored-by: casparcwang <[email protected]> Co-authored-by: dazhuangzhuang1024 <[email protected]> Co-authored-by: HShan886 <[email protected]>
1.Add comments to RemotevLLMEngine 2.Rename boba_ppo.py to boba_grpo.py and move to examples/math 3.Remove redundant comments.
…ts from the latest RemoteSGLangEngine
2.fix if is_npu_available formatting error
2.Update WeightUpdateMeta from_fsdp_nccl to from_fsdp_xccl
1.vLLMConfig add enable-metrics parameter 2.Use current_platform to eliminate the if-else branches introduced by is_npu_available
1.The `get_input_ids_fn` function should have an `enable_thinking` parameter 2.Fix npu README.md
599de5a    to
    6f7d661      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as well.
Overview
vLLM and SGLang are both well-known inference engines and NPU is also widely used. We hope that AReaL support vLLM and NPU.
The main features of this PR include:
The user instructions are placed in
docs/lite/gms8k_ppo_vllm_npu.mdanddocs/lite/boba_ppo_vllm_npu.md.Reward Curve