
@thunder95
Contributor

Submit the design document for argmin_argmax OP performance optimization

@paddle-bot

paddle-bot bot commented Sep 12, 2022

Your PR has been submitted. Thanks for your contribution!
Please check its format and content. For this, you can refer to Template and Demo.


## 1.1 Current state in PaddlePaddle

Current performance is shown in the table below (based on the PaddlePaddle develop branch):

The extra space in the middle of "PaddlePaddle" here needs to be removed.

Contributor Author

Removed.


## 1.3 Comparative analysis

Currently, the API designs of Paddle and PyTorch are nearly identical, and both are implemented on top of the CUB library.

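Can this explain why the performance gap is so large even though PyTorch also uses CUB underneath? If the op is rewritten with reduce, then even after the expected 4.5x speedup there would still be a fairly large gap compared with PyTorch.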

Contributor Author

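@ZzSean I found that, besides CUB, the Paddle and PyTorch implementations differ in some other details, and I have added these to the RFC. If I have missed anything, I would appreciate your guidance.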

Contributor Author

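@ZzSean In my tests, fastdivmod gave no noticeable performance improvement. Tuning the block configuration gave a fairly clear improvement, but the gap to PyTorch is still fairly large. After re-reading the torch code, I found that newer versions of torch use reduce underneath.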
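For readers unfamiliar with the idea, here is a minimal Python sketch of what "argmax via reduce" means: the input is turned into (value, index) pairs and folded with an associative combine operator, first within chunks and then across the per-chunk partial results. This is only an illustration of the technique, not Paddle's or PyTorch's kernel. The names `combine` and `argmax_by_reduce`, the `chunk` parameter, and the tie-breaking rule (smallest index wins) are assumptions made for this example, and NaN handling is not covered.

```python
from functools import reduce


def combine(a, b):
    """Combine two (value, index) pairs: keep the larger value.

    On equal values, keep the pair with the smaller index (an assumed
    tie-breaking rule for this sketch).
    """
    if b[0] > a[0] or (b[0] == a[0] and b[1] < a[1]):
        return b
    return a


def argmax_by_reduce(values, chunk=4):
    """Return the index of the maximum using a two-stage reduction.

    `chunk` is a hypothetical stand-in for the amount of data one GPU
    block would process; here it only controls how the list is split.
    """
    if not values:
        raise ValueError("argmax of an empty sequence")
    pairs = [(v, i) for i, v in enumerate(values)]
    # Stage 1: reduce each chunk independently to one partial result.
    partials = [
        reduce(combine, pairs[start:start + chunk])
        for start in range(0, len(pairs), chunk)
    ]
    # Stage 2: reduce the partial results to the final (value, index) pair.
    return reduce(combine, partials)[1]


if __name__ == "__main__":
    print(argmax_by_reduce([3, 9, 2, 9, 1]))
```

Because `combine` is associative, the chunks can be processed in any grouping, which is what lets a GPU implementation run the first stage in parallel across blocks.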

thunder95 requested a review from ZzSean on September 17, 2022, 12:51
[1]. [OP Benchmark User Guide](https://github.com/PaddlePaddle/benchmark/blob/master/api/README.md)


PPYDDDD111 No newline at end of file

Shouldn't this be deleted?

Contributor Author

That was written by mistake; it has been deleted. @ZzSean


@ZzSean ZzSean left a comment

LGTM
