
Conversation

@SylarTiaNII
Contributor

PR types

Performance optimization

PR changes

Others

Description

Adapt Sequence Parallel to LoRA. Experiments show a 14% performance improvement on LLaMA2-13B.
Convergence comparison attached: sequence_parallel enabled (gpu_lora_sp) versus pure tensor parallel (gpu_lora_mp).
[Image: convergence curves, gpu_lora_sp vs. gpu_lora_mp]
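
For context, here is a minimal NumPy sketch of the idea behind the change, not the PaddleNLP implementation; `tp_degree`, the shapes, and the use of `np.split`/`np.concatenate` to stand in for the scatter/all-gather collectives are all illustrative assumptions. With sequence parallelism, each rank in the tensor-parallel group holds a shard of the sequence dimension, so the LoRA matmuls (`x @ A @ B`) touch `seq_len / tp_degree` tokens per rank while an all-gather recovers the full-sequence result:

```python
# Minimal sketch of sequence parallelism around a LoRA projection.
# NOT the PaddleNLP implementation; names and shapes are illustrative,
# and np.split/np.concatenate stand in for the scatter/all-gather
# collectives used across a tensor-parallel group.
import numpy as np

tp_degree = 2                     # tensor-parallel group size
seq_len, hidden, lora_rank = 8, 16, 4

rng = np.random.default_rng(0)
x = rng.standard_normal((seq_len, hidden))            # full activations
lora_A = rng.standard_normal((hidden, lora_rank)) * 0.01
# Real LoRA zero-initializes B; random here so the check below is non-trivial.
lora_B = rng.standard_normal((lora_rank, hidden)) * 0.01

# Pure tensor parallel (gpu_lora_mp): every rank processes the whole sequence.
full = x @ lora_A @ lora_B                            # (seq_len, hidden)

# Sequence parallel (gpu_lora_sp): scatter along the sequence axis,
# run the LoRA matmuls on the local shard, then all-gather.
shards = np.split(x, tp_degree, axis=0)               # seq_len / tp_degree tokens each
local = [s @ lora_A @ lora_B for s in shards]
gathered = np.concatenate(local, axis=0)

# Same result, with 1/tp_degree of the per-rank LoRA compute and activations.
assert np.allclose(full, gathered)
```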

@paddle-bot

paddle-bot bot commented Apr 8, 2024

Thanks for your contribution!

Contributor

@lugimzzz lugimzzz left a comment


lgtm

@gongel gongel merged commit 06c2fdc into PaddlePaddle:release/2.7 Apr 10, 2024
SylarTiaNII added a commit to SylarTiaNII/PaddleNLP that referenced this pull request Apr 22, 2024
wawltor pushed a commit that referenced this pull request Apr 23, 2024

* [Distributed] adapt sequence parallel on LoRA (#8235)

* [Distributed] [CustomDevices] adapt lora sp && polish MC2 APIs
JunnYu pushed a commit that referenced this pull request Apr 24, 2024

* [Distributed] adapt sequence parallel on LoRA (#8235)

* [Distributed] [CustomDevices] adapt lora sp && polish MC2 APIs


3 participants