
@micmelesse micmelesse commented Sep 11, 2025

This PR enables FA V3. It adds support for fp8, and we also add support for Paged Attention.

Build Flash attention v3

cd hopper
FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE" python setup.py install

You can use fp8 like this after installing flash attention v3. Make sure that the environment variable FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE" is set

from flash_attn_interface import flash_attn_func

out = flash_attn_func(
    q,
    k,
    v,
    q_descale=q_descale,
    k_descale=k_descale,
    v_descale=v_descale
)
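The descale arguments above are per-tensor scalars that undo the scaling applied when q, k, and v were quantized to fp8. The snippet below is a minimal, illustrative sketch (not the flash-attn implementation) of how such scale/descale pairs are typically derived for the e4m3 format; the function names are hypothetical.

```python
# Illustrative sketch: deriving per-tensor scale/descale factors for
# fp8 (e4m3) quantization. Not part of the flash-attn API.
FP8_E4M3_MAX = 448.0  # largest finite value representable in e4m3

def compute_scale_and_descale(values):
    """Scale maps the tensor into fp8 range; descale undoes it."""
    amax = max(abs(v) for v in values)
    scale = FP8_E4M3_MAX / amax if amax > 0 else 1.0
    descale = 1.0 / scale
    return scale, descale

def fake_quantize(values, scale, descale):
    """Round-trip through the scaled domain (fp8 rounding omitted)."""
    return [(v * scale) * descale for v in values]

q_vals = [0.5, -2.0, 3.5]
scale, descale = compute_scale_and_descale(q_vals)
print(descale)  # 0.0078125 (= 3.5 / 448)
```

Since `scale * descale == 1`, the kernel can compute attention on the fp8-scaled tensors and multiply by the descale factors to recover results in the original range.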

If you want to use paged attention with fp8, you can use the flash_attn_with_kvcache API

from flash_attn_interface import flash_attn_with_kvcache

out = flash_attn_with_kvcache(
    q,
    k_cache,
    v_cache,
    q_descale=q_descale,
    k_descale=k_descale,
    v_descale=v_descale,
    page_table=page_table,
)
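Here `page_table` maps each sequence's logical pages to physical pages in a KV cache laid out as `[num_pages, page_size, ...]`. A minimal pure-Python sketch of that lookup (illustrative only; page size and function name are hypothetical, real kernels use much larger pages):

```python
# Illustrative sketch: mapping a token's logical position to its slot
# in a paged KV cache. Not the flash-attn implementation.
PAGE_SIZE = 4  # tokens per page (real caches typically use e.g. 256)

def lookup(page_table, batch_idx, token_idx):
    """Return (physical_page, offset) for a logical token position."""
    logical_page, offset = divmod(token_idx, PAGE_SIZE)
    return page_table[batch_idx][logical_page], offset

# Batch 0's sequence lives in physical pages 7, then 2.
page_table = [[7, 2]]
print(lookup(page_table, 0, 1))  # (7, 1): token 1 is on page 7, offset 1
print(lookup(page_table, 0, 5))  # (2, 1): token 5 is on page 2, offset 1
```

This indirection is what lets the cache grow in fixed-size blocks instead of one contiguous buffer per sequence.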

narrow pa test

ref works on most cases

inplace ref with new_kv

inplace paged attention

add pa ref

save pa

basic paged works

save

fix swa + causal in pa. Also new_kv only on pa path

passing

build fa v3

import interface from fa v3

copy fa tests

use v3 api

clean up

rename to match old test

support different head sizes

remove fp8

basic passing v3 cases

test_flash_attn_varlen_output v3 working

isolate bad case for kvcache

case passing

save

use decode if seqused / cache_seqlens is given

use decode if not varlen

basic kvcache v3 working

kvcache enable more cases

detect kvcache case if seqused_q is None and seqused_k is not None

skip failing test

find fp8 failing case

mha fp8 works

fix fp8 MQA/GQA bug

clean up

more clean up

clean up more

don't need fp8 dead code

remove train code with fp8 stuff

fp8 working in kvcache

paged + fp8 seems to be working

new_kv allowed
@micmelesse changed the title from "Enable FA V3 with Paged Attention and FP8" to "Enable FA V3" on Sep 11, 2025
@micmelesse (Collaborator, Author) commented:

fa v2 tests, increased to 100k tests

[screenshot: fa v2 test results]

fa v2 bench

[screenshot: fa v2 benchmark results]

fa v3 tests.

[screenshot: fa v3 test results]

@micmelesse micmelesse marked this pull request as ready for review September 18, 2025 14:26
@micmelesse micmelesse merged commit 465cb97 into main_perf Sep 18, 2025
@micmelesse micmelesse deleted the micmelesse/paged_attn branch September 26, 2025 19:52
