Conversation

@s5u13b (Contributor) commented Sep 9, 2022

PR types

Performance optimization

PR changes

OPs

Describe

  • Environment:
    • V100-32G, CUDA 11.2, cuDNN 8
  • Feature:
    • Replace the div and mod operations with a fast_divmod operation.
    • Replace the 1-D GPU launch with a 3-D GPU launch.
    • Optimize the input-grad accumulation logic. Before this change, the GPU launch config was based on the input data, and each input gradient was accumulated by traversing the corresponding indices of the output mask data, which incurred substantial output-index computation overhead. After this change, the launch config is based on the output data, and each output gradient is accumulated directly into the input position given by its saved max index; this removes the output-index computation at the cost of an atomic add.
    • (Config 0 is not optimized yet because Paddle calls the cuDNN kernel for config 0 of the max_pool3d benchmark.)
  • Performance (OP Benchmark):
| Paddle Kernel | Config ID | Perf Before | Perf After | Improvement | Perf of PyTorch |
| --- | --- | --- | --- | --- | --- |
| cudnn::pooling_bw_5d_kernel_max | 0 | 1779.7us | - | - | 725.72us |
| KernelMaxPool3DWithIdxGrad | 1 | 6128.1us | 677.62us | 804.3% | 725.83us |

@paddle-bot-old paddle-bot-old bot added the contributor External developers label Sep 9, 2022
@JamesLim-sy (Contributor) left a comment

LGTM. Please also add performance data for the other ops that go through the same kernel.

@JamesLim-sy JamesLim-sy merged commit 0e563da into PaddlePaddle:develop Sep 20, 2022