-
Notifications
You must be signed in to change notification settings - Fork 237
Only enumerate ROCm-capable AMD GPUs #1500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
497e20c
to
7bf6748
Compare
Discover AMD graphics devices using AMDKFD topology instead of enumerating the PCIe bus. This interface exposes a lot more information about potential devices, allowing RamaLama to filter out unsupported devices. Currently, devices older than GFX9 are filtered, as they are no longer supported by ROCm. Signed-off-by: Leorize <[email protected]>
7bf6748
to
fab8765
Compare
At the moment, RamaLama will fallback to CPU inference for unsupported AMD GPUs since |
Ideally this should be tested on AMD APUs systems. I have only verified this implementation with my dGPU. |
Not a big fan of inner functions, it encourages indenting for little benefit, can we just make them normal functions? |
Thanks for completing this, the logic seems to make sense, not that I tested |
As regards "I'm not sure how to detect if a Vulkan-capable accelerator is available to force passing:" We recently integrated this change: Maybe we want to default to We could parse this just before we execute llama-server also potentially:
|
Signed-off-by: Leorize <[email protected]>
Signed-off-by: Leorize <[email protected]>
Signed-off-by: Leorize <[email protected]>
I actually tested that, and it tries to do repacks for CPU. I'm not sure if that's a good or a bad thing. But either way I think that change should be done separately.
|
Yeah... There's a patch from llama.cpp merged hours ago that might change this behaviour though. Did you try with that patch?
It's installed in all our container images with vulkan, so if it's done just before where llama-server is execute it should be fine in the containerized case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR replaces PCIe-based enumeration with AMDKFD topology to detect ROCm-capable AMD GPUs, filters out unsupported older architectures, and accounts for VRAM across memory banks.
- Add
ramalama.amdkfd
module for parsing KFD properties and listing GPU nodes - Update
check_rocm_amd()
to use KFD topology, filter gfx versions <90000, and sum VRAM from public/private heaps - Introduce
MIN_VRAM_BYTES
constant (1 GiB) for minimum VRAM threshold
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
ramalama/common.py | Import new module, add VRAM threshold constant, overhaul GPU scan |
ramalama/amdkfd.py | New utilities for reading KFD sysfs properties and listing GPUs |
Comments suppressed due to low confidence (3)
ramalama/common.py:453
- [nitpick] The variable name
np
is ambiguous and commonly associated with NumPy; consider renaming tonode_path
or similar for clarity.
for i, (np, props) in enumerate(amdkfd.gpus()):
ramalama/common.py:470
- This assignment is outside the
if mem_bytes > ...
block, sogpu_num
will be set on every iteration rather than only when a larger VRAM GPU is found. It should be indented inside theif
.
gpu_num = i
ramalama/amdkfd.py:12
- The new
gpus()
function andparse_props()
logic lack tests; consider adding unit tests or mocks for sysfs reads to verify correct parsing and filtering.
def gpus():
mem_banks_count = int(props['mem_banks_count']) | ||
mem_bytes = 0 | ||
for bank in range(mem_banks_count): | ||
bank_props = amdkfd.parse_props(np + f'/mem_banks/{bank}/properties') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Path construction via string concatenation can be fragile; prefer os.path.join(np, 'mem_banks', str(bank), 'properties')
.
bank_props = amdkfd.parse_props(np + f'/mem_banks/{bank}/properties') | |
bank_props = amdkfd.parse_props(os.path.join(np, 'mem_banks', str(bank), 'properties')) |
Copilot uses AI. Check for mistakes.
# See /usr/include/linux/kfd_sysfs.h for possible heap types | ||
# | ||
# Count public and private framebuffer memory as VRAM | ||
if bank_props['heap_type'] in [1, 2]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Magic numbers 1
and 2
for heap types are unclear; consider defining named constants or an enum for readability and future maintenance.
if bank_props['heap_type'] in [1, 2]: | |
if bank_props['heap_type'] in [HEAP_TYPE_PUBLIC, HEAP_TYPE_PRIVATE]: |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is correct we should try and take this is another PR @alaviss . I have no idea what 1 and 2 mean.
Discover AMD graphics devices using AMDKFD topology instead of enumerating the PCIe bus. This interface exposes a lot more information about potential devices, allowing RamaLama to filter out unsupported devices.
Currently, devices older than GFX9 are filtered, as they are no longer supported by ROCm.
Ref: #1482
Summary by Sourcery
Use the AMDKFD sysfs topology interface to detect and select ROCm-capable AMD GPUs by architecture and VRAM capacity.
New Features: