CB: Hetero pipeline parallel support #2227
Conversation
Hi @Wovchena, I checked the failed test case in CI. Can you help to check whether the CI works well? Thanks!
CI is broken. I don't know which component to blame yet.
Hi @Wovchena, I re-ran the failed item in the merge queue, but it seems it cannot be merged again after failing.
Pull Request Overview
This PR extends continuous batching to support multi-GPU execution by updating device assertions and block sizing logic.
- Relax the assertion to allow single CPU, single GPU, or multiple GPUs.
- Introduce `all_gpu_device` to drive block size and context initialization.
- Replace the per-GPU flag with `all_gpu_device` checks in the cache manager (see the sketch below).
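To make the overview concrete, here is a minimal sketch of how an all-GPU check could drive block size selection. The `get_block_size` helper and the block size constants are illustrative assumptions, not the PR's actual cache-manager code; only the `all_gpu_device` and `execution_devices` names come from the diff.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative constants: the real per-device KV-cache block sizes are
// defined in the cache manager, not here.
constexpr std::size_t kCpuBlockSize = 32;
constexpr std::size_t kGpuBlockSize = 16;

// Hypothetical helper: choose a block size depending on whether every
// execution device is a GPU (e.g. {"GPU.0", "GPU.1"}), mirroring the
// all_gpu_device flag introduced by the PR.
std::size_t get_block_size(const std::vector<std::string>& execution_devices) {
    const bool all_gpu_device =
        !execution_devices.empty() &&
        std::all_of(execution_devices.begin(), execution_devices.end(),
                    [](const std::string& device) {
                        return device.find("GPU") != std::string::npos;
                    });
    return all_gpu_device ? kGpuBlockSize : kCpuBlockSize;
}
```

Note the explicit non-empty check in the sketch, which also anticipates the review comments further down.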
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/cpp/src/continuous_batching/pipeline_impl.cpp | Relax device assertion for heterogeneous pipelines and detect all-GPU deployments. |
| src/cpp/src/continuous_batching/cache_manager.hpp | Use `all_gpu_device` for block size selection and context setup, replacing the `is_gpu` logic. |
Comments suppressed due to low confidence (1)
src/cpp/src/continuous_batching/pipeline_impl.cpp:107
- [nitpick] Consider renaming `all_gpu_device` to `all_gpu_devices` to better reflect that it checks a collection of devices.
const bool all_gpu_device =
    std::all_of(execution_devices.begin(), execution_devices.end(), [&](const std::string& device) {
        return device.find("GPU") != std::string::npos;
    });
OPENVINO_ASSERT(all_gpu_device || execution_devices.size() == 1,
The assertion allows an empty `execution_devices` (since `all_of` on an empty vector is true). Add a check to ensure `execution_devices` is non-empty before accessing index 0.
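A sketch of one way to address this, assuming the surrounding code later indexes `execution_devices[0]`; the assertion message texts are illustrative assumptions, not the PR's actual wording.

```cpp
// Guard against an empty device list before relying on std::all_of or index 0.
OPENVINO_ASSERT(!execution_devices.empty(), "execution_devices must not be empty");
const bool all_gpu_device =
    std::all_of(execution_devices.begin(), execution_devices.end(),
                [](const std::string& device) {
                    return device.find("GPU") != std::string::npos;
                });
OPENVINO_ASSERT(all_gpu_device || execution_devices.size() == 1,
                "Continuous batching expects a single device or an all-GPU device set");
```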
    std::all_of(execution_devices.begin(), execution_devices.end(), [&](const std::string& device) {
        return device.find("GPU") != std::string::npos;
    });
OPENVINO_ASSERT(all_gpu_device || execution_devices.size() == 1,
As above, `all_gpu_device` will be true for an empty vector. Ensure `execution_devices` is not empty before using element 0, or combine this check into the assertion.
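Alternatively, a sketch of folding the non-empty check directly into the existing assertion, as the comment suggests; the message text is again an assumption.

```cpp
OPENVINO_ASSERT(!execution_devices.empty() &&
                    (all_gpu_device || execution_devices.size() == 1),
                "execution_devices must contain a single device or only GPU devices");
```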
Depends on: openvinotoolkit/openvino#30371
Tickets: CVS-164805