Conversation

@lshpku
Contributor

@lshpku lshpku commented Dec 30, 2024

PR Category

CINN

PR Types

Performance

Description

Refactored the ComputeAtReductionTactic workflow and added support for Elementwise.

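The core function of this workflow is FindCandidateBlocks(block). It searches the computation graph for candidate blocks on which ComputeAt with the current block is both legal and beneficial for performance. It works in four steps:

  1. Use GetDependencyHarzardFreeBlocks to find all blocks that have no dependency hazard with the current block.
  2. Use ControlFlowAllEqual to keep only those blocks whose control flow is also equal to that of the current block.
  3. Use HasCommonLoad to further keep only those blocks that share at least one input with the current block.
  4. Pick the single most suitable candidate block according to priority.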
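
The four steps above can be sketched as a simple filter pipeline. This is a minimal Python illustration, not the CINN implementation: the `Block` model, the pairwise hazard check that ignores program order, and the priority rule (most shared inputs, ties going to the earliest block) are all assumptions made for the example. Only the function names echo the ones in the description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for a schedule block (not the CINN type)."""
    name: str
    loads: frozenset       # names of tensors the block reads
    stores: frozenset      # names of tensors the block writes
    control_flow: tuple    # assumed encoding: the enclosing loop extents


def has_hazard(a: Block, b: Block) -> bool:
    # Assumption: a pairwise RAW / WAR / WAW check that ignores program order.
    return bool(a.stores & b.loads or a.loads & b.stores or a.stores & b.stores)


def get_dependency_hazard_free_blocks(block: Block, blocks: List[Block]) -> List[Block]:
    # Step 1: blocks that have no dependency hazard with the current block.
    return [b for b in blocks if b is not block and not has_hazard(block, b)]


def control_flow_all_equal(a: Block, b: Block) -> bool:
    # Step 2: both blocks sit under identical control flow.
    return a.control_flow == b.control_flow


def has_common_load(a: Block, b: Block) -> bool:
    # Step 3: the blocks read at least one common input.
    return bool(a.loads & b.loads)


def find_candidate_block(block: Block, blocks: List[Block]) -> Optional[Block]:
    candidates = get_dependency_hazard_free_blocks(block, blocks)
    candidates = [b for b in candidates if control_flow_all_equal(block, b)]
    candidates = [b for b in candidates if has_common_load(block, b)]
    if not candidates:
        return None
    # Step 4 (assumed priority): most shared inputs; max() keeps the earliest on ties.
    return max(candidates, key=lambda b: len(block.loads & b.loads))


# A BatchNorm-like toy graph: two reductions over x, then a normalization.
sum_x = Block("sum_x", frozenset({"x"}), frozenset({"sum_x"}), (128, 256))
sum_x2 = Block("sum_x2", frozenset({"x"}), frozenset({"sum_x2"}), (128, 256))
norm = Block("norm", frozenset({"x", "sum_x", "sum_x2"}), frozenset({"out"}), (128, 256))
scale = Block("scale", frozenset({"x", "w"}), frozenset({"y"}), (128, 512))
blocks = [sum_x, sum_x2, norm, scale]

print(find_candidate_block(sum_x2, blocks).name)  # sum_x
print(find_candidate_block(norm, blocks))         # None
```

In this toy graph, `norm` is rejected for `sum_x2` in step 1 because it reads `sum_x2`, and `scale` is rejected in step 2 because its loop extents differ, which leaves `sum_x` as the only candidate.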

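Performance tests

| Subgraph / Model | Before | After | Improvement |
| --- | --- | --- | --- |
| BatchNorm NCHW [128, 256, -112, -112] | 57,978 us | 47,506 us | 22.0% |
| yolov3_darknet53_270e_coco_bs8_fp16 | 43.01 ips | 53.52 ips | 24.4% |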
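
The reported percentages appear to be speedup ratios rather than relative time reductions. The short check below reproduces both figures under that assumption; the helper name `improvement_percent` and its `lower_is_better` flag are made up for this illustration.

```python
def improvement_percent(before: float, after: float, lower_is_better: bool) -> float:
    """Speedup of `after` over `before`, as a percentage rounded to one decimal."""
    # Latency (us) improves when it shrinks; throughput (ips) improves when it grows.
    ratio = before / after if lower_is_better else after / before
    return round((ratio - 1.0) * 100.0, 1)


print(improvement_percent(57978, 47506, lower_is_better=True))   # 22.0
print(improvement_percent(43.01, 53.52, lower_is_better=False))  # 24.4
```

Measured as a reduction in kernel time instead, the BatchNorm case would come out near 18%, so the speedup reading is the one that matches the numbers given.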

Pcard-85711

@paddle-bot
paddle-bot bot commented Dec 30, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@lshpku lshpku force-pushed the compute-at-support-elementwise branch 2 times, most recently from d62325a to 883cf3b Compare December 31, 2024 06:53
@lshpku lshpku changed the title [CINN] ComputeAtReductionTactic supports elementwise-to-reduce fusion [CINN] ComputeAt supports elementwise-to-reduce fusion Dec 31, 2024
@lshpku lshpku changed the title [CINN] ComputeAt supports elementwise-to-reduce fusion [CINN] ComputeAt supports loop fusion for Elementwise Dec 31, 2024
@lshpku lshpku force-pushed the compute-at-support-elementwise branch 3 times, most recently from ddae072 to 2e84aee Compare January 2, 2025 06:28
@lshpku lshpku closed this Jan 2, 2025
@lshpku lshpku reopened this Jan 2, 2025
@lshpku lshpku force-pushed the compute-at-support-elementwise branch from 2e84aee to 8057c9a Compare January 2, 2025 10:45
@lshpku lshpku force-pushed the compute-at-support-elementwise branch from 8057c9a to a33a9af Compare January 3, 2025 02:40
@lshpku lshpku merged commit 1a0b50a into PaddlePaddle:develop Jan 6, 2025
31 checks passed
2 participants