@lj970926 lj970926 commented Aug 5, 2024

PR Category

Performance Optimization

PR Types

Performance

Description

  1. XPU does not currently support computation-communication overlap for sharding V2, so we use a single large communication buffer to make better use of BKCL bandwidth.
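
To illustrate the reasoning, here is a minimal sketch of how the buffer size affects the number of collective calls. It is not the code in this PR: `plan_buckets` and `count_collectives` are hypothetical helper names, and the greedy grouping is an assumed, simplified bucketing scheme. The point is only that when there is no overlap to exploit, splitting gradients into several buffers adds collective launches without hiding any latency, whereas one buffer covering everything needs a single call.

```python
from typing import List, Sequence


def plan_buckets(sizes: Sequence[int], buffer_size: int) -> List[List[int]]:
    """Greedily group tensor indices into buckets of at most buffer_size elements.

    Hypothetical helper for illustration; a tensor larger than buffer_size
    simply ends up in a bucket of its own.
    """
    buckets: List[List[int]] = []
    current: List[int] = []
    used = 0
    for idx, size in enumerate(sizes):
        # Close the current bucket if this tensor would not fit in it.
        if current and used + size > buffer_size:
            buckets.append(current)
            current, used = [], 0
        current.append(idx)
        used += size
    if current:
        buckets.append(current)
    return buckets


def count_collectives(sizes: Sequence[int], buffer_size: int, single_buffer: bool) -> int:
    """Return how many collective calls are needed (one per bucket)."""
    if single_buffer:
        # Assumed behaviour: one buffer large enough to hold every gradient.
        buffer_size = sum(sizes)
    return len(plan_buckets(sizes, buffer_size))


if __name__ == "__main__":
    grad_sizes = [4, 4, 4, 4]  # made-up element counts
    print(count_collectives(grad_sizes, buffer_size=8, single_buffer=False))
    print(count_collectives(grad_sizes, buffer_size=8, single_buffer=True))
```

With overlap available, smaller buckets let communication start before the backward pass finishes; without it, fewer and larger transfers are generally the better trade-off.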

paddle-bot bot commented Aug 5, 2024

Your PR has been submitted successfully. Thank you for contributing to this open-source project!
Please wait for the automated CI test results. See the Paddle CI Manual for details.

@RuohengMa RuohengMa left a comment

LGTM

@tizhou86 tizhou86 left a comment

LGTM

@QingshuChen QingshuChen merged commit 35b6775 into PaddlePaddle:develop Aug 6, 2024
@houj04 houj04 added the XPU label Sep 4, 2024
cqulilujia pushed a commit to cqulilujia/Paddle that referenced this pull request Sep 10, 2024
* [XPU] sharding V2 uses single comm buffer

* format code

* format code

* fix bugs

* add env control