
Conversation

zheliuyu
Contributor

@zheliuyu zheliuyu commented Dec 9, 2024

What does this PR do?

Add feature: Ascend NPU support for SDPA.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Notes

Ascend NPU requires torch>=2.1.0 to use SDPA in Transformers.
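
For context, a minimal sketch of the kind of version gate this requirement implies (`is_torch_npu_available` is an existing helper in transformers.utils; the gating function itself is illustrative, not the exact code added by this PR):

from packaging import version

import torch
from transformers.utils import is_torch_npu_available


def npu_sdpa_supported() -> bool:
    # Illustrative check: SDPA on Ascend NPU needs torch>=2.1.0
    # (with torch_npu installed so the "npu" device is registered).
    if not is_torch_npu_available():
        return False
    return version.parse(torch.__version__) >= version.parse("2.1.0")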

@zheliuyu zheliuyu changed the title Ascend NPU support SDPA NPU support SDPA Dec 14, 2024
@zheliuyu zheliuyu marked this pull request as draft December 21, 2024 07:16
Member

@SunMarc SunMarc left a comment


Thanks for the PR! Is there a link that shows that NPU is compatible with SDPA from torch 2.1.0? Also, let us know when this is ready to be reviewed!

@zheliuyu
Contributor Author

zheliuyu commented Dec 24, 2024

Thanks for the PR! Is there a link that shows that NPU is compatible with SDPA from torch 2.1.0? Also, let us know when this is ready to be reviewed!

@SunMarc Thanks for your reply; this PR is ready to be reviewed. Some explanations and tests are as follows.

NPU supports SDPA in torch>=2.1.0

To use SDPA on NPU, simply import torch_npu.
For GPU

import torch
import torch.nn.functional as F


query = torch.ones(1, 2, dtype=torch.float16, device="cuda")
key = torch.ones(1, 2, dtype=torch.float16, device="cuda")
value = torch.ones(1, 2, dtype=torch.float16, device="cuda")

output = F.scaled_dot_product_attention(query, key, value)
print("torch version: ", torch.__version__)
print("result: ", output)

Output:
torch version:  2.1.0+cu121
result:  tensor([[1., 1.]], device='cuda:0', dtype=torch.float16)

For NPU

import torch
import torch_npu
import torch.nn.functional as F


query = torch.ones(1, 2, dtype=torch.float16, device="npu:0")
key = torch.ones(1, 2, dtype=torch.float16, device="npu:0")
value = torch.ones(1, 2, dtype=torch.float16, device="npu:0")

output = F.scaled_dot_product_attention(query, key, value)
print("torch version: ", torch.__version__)
print("torch_npu version: ", torch_npu.__version__)
print("result: ", output)

Output:
torch version:  2.1.0
torch_npu version:  2.1.0
result:  tensor([[1., 1.]], device='npu:0', dtype=torch.float16)
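
With this support in place, SDPA can be requested through the usual Transformers API on NPU as well. A minimal sketch (the gpt2 checkpoint is only an example; any SDPA-capable model should behave the same way):

import torch
import torch_npu  # registers the "npu" device with PyTorch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # example checkpoint, substitute any SDPA-capable model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    attn_implementation="sdpa",
).to("npu:0")

inputs = tokenizer("Hello from Ascend NPU!", return_tensors="pt").to("npu:0")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))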

NPU handles non-contiguous inputs correctly in torch>=2.1.0

According to issue 112577, Transformers requires torch>=2.1.1 to avoid a numerical issue in SDPA with non-contiguous inputs.

Running the same test code shows that NPU avoids this issue from torch>=2.1.0; a simplified sketch of the comparison follows the log below.

query_sdpa torch.Size([1, 1, 2048])
key_sdpa torch.Size([1, 16, 128])
value_sdpa torch.Size([1, 16, 128])
attention_mask_sdpa torch.Size([1, 1, 1, 16])
attention_mask_sdpa tensor([[[[True, True, True, True, True, True, True, True, True, True, True,
           True, True, True, True, True]]]])
---- non_contig_cpu_math
query contiguous True
key contiguous False
value contiguous False
---- contig_cpu_math
query contiguous True
key contiguous True
value contiguous True
---- non_contig_npu_math
query contiguous True
key contiguous False
value contiguous False
---- contig_npu_math
query contiguous True
key contiguous True
value contiguous True
---- non_contig_npu_memeff
query contiguous True
key contiguous False
value contiguous False
---- contig_npu_memeff
query contiguous True
key contiguous True
value contiguous True


cpu non-contig/contig: mean abs-diff tensor(0.)
cpu non-contig/contig: mean rel-diff tensor(0.)
npu non-contig/contig math: mean abs-diff tensor(0., device='npu:0')
npu non-contig/contig math: mean rel-diff tensor(0., device='npu:0')
npu non-contig/contig memeff: mean abs-diff tensor(0., device='npu:0')
npu non-contig/contig memeff: mean rel-diff tensor(0., device='npu:0')

Allclose CPU non-contig/contig: True
Allclose NPU math non-contig/contig: True
Allclose NPU memeff non-contig/contig: True
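
For reference, a simplified sketch of the kind of non-contiguous vs. contiguous comparison behind the log above (the shapes and the use of transposed views are illustrative; the original script from issue 112577 also exercises attention masks and the memory-efficient backend):

import torch
import torch_npu
import torch.nn.functional as F

torch.manual_seed(0)
query = torch.randn(1, 16, 16, 128, dtype=torch.float16, device="npu:0")
# Build non-contiguous key/value as transposed views of differently laid-out tensors.
key_base = torch.randn(1, 16, 128, 16, dtype=torch.float16, device="npu:0")
value_base = torch.randn(1, 16, 128, 16, dtype=torch.float16, device="npu:0")
key_noncontig = key_base.transpose(-1, -2)      # non-contiguous view
value_noncontig = value_base.transpose(-1, -2)
key_contig = key_noncontig.contiguous()         # same values, contiguous layout
value_contig = value_noncontig.contiguous()

print("key contiguous:", key_noncontig.is_contiguous())  # False
out_noncontig = F.scaled_dot_product_attention(query, key_noncontig, value_noncontig)
out_contig = F.scaled_dot_product_attention(query, key_contig, value_contig)
print("Allclose NPU non-contig/contig:",
      torch.allclose(out_noncontig, out_contig, atol=1e-3, rtol=1e-3))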

@zheliuyu zheliuyu closed this Dec 24, 2024
@zheliuyu zheliuyu reopened this Dec 24, 2024
@zheliuyu zheliuyu marked this pull request as ready for review December 24, 2024 08:06
Member

@SunMarc SunMarc left a comment


Nice, thanks for the explanation!

@SunMarc SunMarc requested a review from ArthurZucker December 24, 2024 11:33
@zheliuyu
Contributor Author

zheliuyu commented Jan 5, 2025

Nice, thanks for the explanation!

@SunMarc @ArthurZucker
Hi, is this PR okay to be merged? Is there anything I can help with? ^^

@SunMarc
Member

SunMarc commented Jan 6, 2025

Okay to be merged for me! cc @ArthurZucker

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@ArthurZucker ArthurZucker left a comment


Sounds good

@ArthurZucker ArthurZucker merged commit ed73ae2 into huggingface:main Jan 7, 2025
25 checks passed
AlanPonnachan pushed a commit to AlanPonnachan/transformers that referenced this pull request Jan 7, 2025
