[CI/Build] fix cpu_extension for apple silicon #2
Purpose
The purpose of this PR is to fix CPU installation from source on Apple silicon. Recently, basic serving capabilities and examples stopped working in my local setup (M3 Pro). I bisected the breakage to PR vllm-project#14129, which introduced int8 quantization for ARM CPUs. The problem is that although Apple silicon shares some code paths with other ARM CPUs in `cpu_extension.cmake`, it also has some specific configuration that ended up breaking the build once int8 support was introduced. More specifically, the Apple silicon path did not enable ASIMD_FOUND and therefore did not include the `quant.cpp` source (`cpu_extension.cmake:284`).
After a fresh install from source, the basic example failed.
To fix this, this PR enables ASIMD_FOUND for Apple silicon as well. For safety, it checks for support with the following command: `sysctl -n hw.optional.neon`. As far as I know, all Apple silicon from the M1 through the M4 generation supports this feature, but I still decided to gate ASIMD_FOUND on this check in case support is dropped in future generations.
After this fix, CPU installation from source started working again.
This PR also enables bf16 support for Apple silicon. This feature is gated on the `hw.optional.arm.FEAT_BF16` sysctl check. Based on this LLVM commit, CPU bf16 support was introduced with the M2 generation.
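For reviewers who want to see what the two checks report on their own machine, here is a minimal shell sketch. It is not the CMake logic from this PR: the helper name `sysctl_flag_enabled` and the `SYSCTL_BIN` override are hypothetical and exist only for illustration. It assumes a feature counts as supported only when the sysctl key prints exactly `1`.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): succeeds only when the given
# sysctl key prints exactly "1". SYSCTL_BIN is an assumed override hook so
# the check can be exercised on machines that are not running macOS.
sysctl_flag_enabled() {
    value=$("${SYSCTL_BIN:-sysctl}" -n "$1" 2>/dev/null) || return 1
    [ "$value" = "1" ]
}

# The two keys named in this description: NEON/ASIMD and bf16.
for key in hw.optional.neon hw.optional.arm.FEAT_BF16; do
    if sysctl_flag_enabled "$key"; then
        echo "$key: supported"
    else
        echo "$key: not reported"
    fi
done
```

On systems where these keys do not exist (for example Linux), the lookup fails and both lines read "not reported", which mirrors the intent of gating each feature on an explicit positive answer.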
Test Plan
Unfortunately, I'm not familiar enough with the CI build runs, so I would appreciate some guidance on this point.