Add `device_name` classmethod in `Accelerator`. #21112

GdoongMathew · 2025-08-24T08:30:17Z

What does this PR do?

1 GPU

GPU available: NVIDIA GeForce RTX 3050 4GB Laptop GPU, using: 1 devices.
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs

2 GPUs (Kaggle)

>>> t = Trainer(devices=2)
INFO: GPU available: Tesla T4, using: 2 devices.
INFO: TPU available: False, using: 0 TPU cores
INFO: HPU available: False, using: 0 HPUs

No CUDA GPU available

GPU available: False, using: 0 devices.
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs

MPS (Need Help)

XLA (Kaggle)

GPU available: False, using: 0 devices.
TPU available: v3-8, using: 8 TPU cores
HPU available: False, using: 0 HPUs
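
The implementation itself is not shown in this description. As a rough sketch of the idea only: each accelerator could expose a `device_name` classmethod that returns an empty string when the accelerator is unavailable (as one of the later commits describes), and the info line could fall back to `False` in that case. The classes `ToyAccelerator`, `ToyCUDA`, `ToyTPU` and the helper `availability_line` below are hypothetical stand-ins, not Lightning's real classes or signatures.

```python
class ToyAccelerator:
    """Hypothetical stand-in for an accelerator base class (not Lightning's)."""

    label = "Device"  # prefix used in the info line
    unit = "devices"  # unit used in the info line

    @classmethod
    def device_name(cls) -> str:
        # Default: nothing is known about the hardware, so return an empty string.
        return ""


class ToyCUDA(ToyAccelerator):
    label = "GPU"
    unit = "devices"

    @classmethod
    def device_name(cls) -> str:
        # A real implementation would query the driver; hard-coded for illustration.
        return "Tesla T4"


class ToyTPU(ToyAccelerator):
    label = "TPU"
    unit = "TPU cores"
    # Inherits device_name() -> "", which means "not available" in this sketch.


def availability_line(accelerator: type, num_used: int) -> str:
    # An empty name is falsy, so `or False` yields the "available: False" wording.
    name = accelerator.device_name() or False
    return f"{accelerator.label} available: {name}, using: {num_used} {accelerator.unit}"
```

Calling `availability_line(ToyCUDA, 2)` and `availability_line(ToyTPU, 0)` gives lines in the same shape as the logs above; the real logging code may format them differently.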

Before submitting

Was this discussed/agreed via a GitHub issue? (not for typos and docs)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

Reviewer checklist

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified

📚 Documentation preview 📚: https://pytorch-lightning--21112.org.readthedocs.build/en/21112/

src/lightning/pytorch/accelerators/cuda.py

GdoongMathew · 2025-08-27T16:41:49Z

Results in Kaggle:

justusschock

How would you handle heterogeneous gpu types? E.g. say I have one 3090 and one 4090 in my workstation. This doesn't handle it at all.

src/lightning/pytorch/accelerators/accelerator.py

src/lightning/pytorch/accelerators/mps.py

src/lightning/pytorch/accelerators/xla.py

src/lightning/pytorch/accelerators/cuda.py

GdoongMathew · 2025-08-28T10:07:08Z

> How would you handle heterogeneous gpu types? E.g. say I have one 3090 and one 4090 in my workstation. This doesn't handle it at all.

Hi @justusschock, based on the change, the output should look something like:

GPU available: RTX 3090, RTX 4090, using: 2 devices.

That being said, if the setup is more complex - for example 2 x 3090 and 1 x 4090 - the current output might not fully reflect that.
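
One way the mixed-GPU case could be summarised is sketched below. This is not the PR's code: `summarize_device_names` and `gpu_info_line` are hypothetical helpers, and the `2 x NAME` count format is an assumption added here to show how duplicates might be reported.

```python
from __future__ import annotations

from collections import Counter


def summarize_device_names(names: list[str]) -> str:
    """Collapse per-device names into a short summary, keeping first-seen order."""
    counts = Counter(names)  # Counter is a dict subclass, so insertion order is kept
    parts = [name if n == 1 else f"{n} x {name}" for name, n in counts.items()]
    return ", ".join(parts)


def gpu_info_line(names: list[str]) -> str:
    """Build an info line for the devices in use (hypothetical format)."""
    available = summarize_device_names(names) if names else "False"
    return f"GPU available: {available}, using: {len(names)} devices."
```

With names `["RTX 3090", "RTX 4090"]` this yields the line shown above. With two 3090s and one 4090 it would report `2 x RTX 3090, RTX 4090`, so every available GPU type still appears.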

GdoongMathew · 2025-08-28T12:36:51Z

On second thought, I may have oversimplified the question. It looks like the current implementation could cause issues with the DDP training strategy, since it tries to access a non-existent device ID on rank zero.

Thanks for your review @Borda & @justusschock, I’ll mark this PR as WIP for now. Any further suggestions are welcome~

justusschock · 2025-08-28T14:53:03Z

> That being said, if the setup is more complex - for example 2 x 3090 and 1 x 4090 - the current output might not fully reflect that.

@GdoongMathew I think that's fine. It's just important to reflect all available GPU types, as that might impact memory etc.

GdoongMathew · 2025-08-28T16:25:39Z

> On second thought, I may have oversimplified the question. It looks like the current implementation could cause issues with the DDP training strategy, since it tries to access a non-existent device ID on rank zero.
>
> Thanks for your review @Borda & @justusschock, I’ll mark this PR as WIP for now. Any further suggestions are welcome~

To follow up on my own concern: it seems the `device_ids` property refers to the devices on the current node, not to all devices across every node. So the current implementation probably won’t cause any issues, aside from not being able to list device types from other nodes.

feat: add device_name classmethod in Accelerator.

037a24b

GdoongMathew requested review from lantiga, Borda, tchaton, justusschock and ethanwharris as code owners August 24, 2025 08:30

github-actions bot added the `pl` label (Generic label for PyTorch Lightning package) Aug 24, 2025

GdoongMathew added 9 commits August 24, 2025 16:53

feat: change to original device logic in setup.

c1746b2

revert: revert changes in trainer.

a383d79

fix: fix type annotation.

7db8793

fix: fix type annotation.

3a3949a

add override decorator.

8cb809f

fix tests.

944ad69

fix tests.

1d9bd0d

fix mypy and device string format.

92b1d69

fix tests.

0a5725b

GdoongMathew changed the title ~~[WIP] Add device_name classmethod in Accelerator.~~ Add device_name classmethod in Accelerator. Aug 26, 2025

mps override decorator.

5b11bdb

Borda reviewed Aug 27, 2025

Borda and others added 4 commits August 27, 2025 13:59

Merge branch 'master' into feat/device_name

4abcd13

empty str

63a0f70

return empty string if accelerator is not available.

8e91f8f

fix: fix unittests.

f124d55

justusschock reviewed Aug 28, 2025

GdoongMathew marked this pull request as draft August 28, 2025 12:47

Merge branch 'master' into feat/device_name

aa16731

GdoongMathew marked this pull request as ready for review August 28, 2025 16:19

GdoongMathew requested review from Borda and justusschock August 28, 2025 16:26

Merge branch 'master' into feat/device_name

eddc009
