Conversation

@A-nnonymous (Contributor) commented Feb 21, 2025

PR Category

CINN

PR Types

Performance

Description

Fix a performance issue with GEMM fusion for the float32 data type, caused by fused_gemm_epilogue_pass.
On SwinTransformer_base_patch4_window7_224 with bs=64 and float32, this change improves throughput by more than 1x (138 -> 281 ips).

pcard-76996
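The commits below suggest the fix restricts the gemm-epilogue fusion by dtype (the original fp32 unittest was switched to fp16 so the pass could still be exercised). As a minimal, hypothetical sketch of such a dtype gate, not the actual Paddle pass code, and with `should_fuse_gemm_epilogue` being an invented name:

```python
# Hypothetical sketch: gate the fused gemm+epilogue rewrite on operand dtype,
# on the assumption that the fused path pays off for half-precision GEMMs but
# was slower than the unfused kernels for float32 on this workload.

HALF_PRECISION_DTYPES = {"float16", "bfloat16"}

def should_fuse_gemm_epilogue(dtype: str) -> bool:
    """Apply the fused gemm+epilogue rewrite only for half-precision types."""
    return dtype in HALF_PRECISION_DTYPES

# float32 matmuls are left unfused, avoiding the regression described above.
print(should_fuse_gemm_epilogue("float16"))  # True
print(should_fuse_gemm_epilogue("float32"))  # False
```

A pass structured this way leaves all other fusion decisions untouched; only the dtype check decides whether the epilogue rewrite fires.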

paddle-bot commented Feb 21, 2025

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

zyfncg previously approved these changes Feb 21, 2025
@zyfncg zyfncg merged commit 4907ed5 into PaddlePaddle:develop Feb 23, 2025
33 checks passed
Enigmatisms pushed a commit to Enigmatisms/Paddle that referenced this pull request Mar 6, 2025
…ed to fused_gemm_epilogue_pass (PaddlePaddle#71226)

* [CINN] Fix performance issue on gemm fusion in float32 due to fuse_gemm_epilogue_pass.

* Modified original fp32 unittest to fp16, in order to perform check.

* polish code

* Modified python pass unittest to perform proper checks.
YqGe585 pushed a commit to YqGe585/Paddle that referenced this pull request May 7, 2025
…ed to fused_gemm_epilogue_pass (PaddlePaddle#71226)