
Conversation

@Xreki (Contributor) commented on Feb 22, 2023

PR types

Performance optimization

PR changes

OPs

Describe

Restructure the implementation of the fused_gemm_epilogue operator by extracting the functions ComputeFusedGemmEpilogueForward and ComputeFusedGemmEpilogueBackward, and apply them in the fused_gated_attention operator to further fuse the matmul+bias computation of Gate Linear and Output Linear.
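
To make the refactoring concrete, here is a minimal sketch of the call pattern, with simplified stand-in types and signatures (the real helpers operate on phi::GPUContext and phi::DenseTensor and take additional parameters; nothing below is Paddle's actual code):

// Stand-in types; placeholders for phi::GPUContext and phi::DenseTensor.
struct GpuContext {};
template <typename T>
struct Tensor {
  T* data;
  int rows;
  int cols;
};

// Stand-in for the extracted helper: a single cuBLASLt fused-epilogue
// call computing out = x * w + bias, replacing a matmul kernel plus a
// separate bias-add kernel.
template <typename T>
void ComputeFusedGemmEpilogueForward(const GpuContext& ctx,
                                     const Tensor<T>& x,
                                     const Tensor<T>& w,
                                     const Tensor<T>& bias,
                                     Tensor<T>* out);

// With the helper extracted, fused_gate_attention can reuse the same
// fused path for both of its linear projections.
template <typename T>
void GateLinear(const GpuContext& ctx,
                const Tensor<T>& x,
                const Tensor<T>& gate_w,
                const Tensor<T>& gate_b,
                Tensor<T>* gate_out) {
  ComputeFusedGemmEpilogueForward(ctx, x, gate_w, gate_b, gate_out);
}

template <typename T>
void OutputLinear(const GpuContext& ctx,
                  const Tensor<T>& fmha_out,
                  const Tensor<T>& out_w,
                  const Tensor<T>& out_b,
                  Tensor<T>* out) {
  ComputeFusedGemmEpilogueForward(ctx, fmha_out, out_w, out_b, out);
}

Per the description, the backward kernel pairs ComputeFusedGemmEpilogueBackward with these projections in the same way.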

@Xreki force-pushed the op/fused_gate_attention_opt branch from 4ba359b to 0ae9d63 on February 22, 2023 04:08
@Xreki force-pushed the op/fused_gate_attention_opt branch from 0ae9d63 to 44761b4 on February 22, 2023 04:11
@Xreki force-pushed the op/fused_gate_attention_opt branch from 7565381 to 37451fd on February 24, 2023 06:12
@Xreki force-pushed the op/fused_gate_attention_opt branch from 37451fd to f51c61b on February 24, 2023 08:04
@Xreki requested review from JamesLim-sy and sneaxiy on February 26, 2023 01:24
} else if (std::is_same<T, phi::dtype::bfloat16>::value) {
return CUDA_R_16BF;
#endif
}
Collaborator commented:
How about throwing an error here?

@Xreki (Contributor, Author) replied:
Sounds good, I will add it in the next PR.
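
For reference, a minimal sketch of the fix the reviewer is asking for, reconstructed around the quoted fragment (the function name GetCudaDataType and the guarded branch layout are assumptions; PADDLE_THROW and phi::errors are Paddle's standard error utilities):

template <typename T>
cudaDataType_t GetCudaDataType() {  // hypothetical name, for illustration
  if (std::is_same<T, float>::value) {
    return CUDA_R_32F;
  } else if (std::is_same<T, phi::dtype::float16>::value) {
    return CUDA_R_16F;
#if CUDA_VERSION >= 11000
  } else if (std::is_same<T, phi::dtype::bfloat16>::value) {
    return CUDA_R_16BF;
#endif
  }
  // Previously the unsupported case fell through silently; throwing
  // makes the failure explicit, as suggested above.
  PADDLE_THROW(phi::errors::InvalidArgument(
      "Unsupported data type for the cuBLASLt fused GEMM epilogue."));
}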

@Xreki merged commit 57f6a46 into PaddlePaddle:develop on Feb 26, 2023
@Xreki deleted the op/fused_gate_attention_opt branch on February 26, 2023 01:40