@FeixLiu FeixLiu commented Jun 2, 2021

PR types

New features

PR changes

OPs

Describe

Split from PR: #32291
Feature description:

  1. fp16 support for the LARS optimizer
  2. Added the corresponding unit tests
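
To illustrate what fp16 support with a master weight typically involves, here is a minimal NumPy sketch of one LARS momentum step. It is not the Paddle kernel from this PR. The function name `lars_momentum_step`, its arguments, and the default hyperparameters are hypothetical, and the update rule is the commonly documented LARS momentum formula, assumed here rather than taken from the diff. The idea is that the fp16 gradient is cast up, the update is applied to an fp32 master copy, and the fp16 parameter is rewritten from that copy.

```python
import numpy as np


def lars_momentum_step(grad_fp16, velocity, master_param,
                       lr=0.1, mu=0.9, lars_coeff=0.001,
                       lars_weight_decay=0.0005, epsilon=0.0):
    """One LARS momentum step using an fp32 master weight (illustrative only)."""
    # Cast the fp16 gradient up so all arithmetic happens in fp32.
    g = grad_fp16.astype(np.float32)
    p_norm = np.linalg.norm(master_param)
    g_norm = np.linalg.norm(g)

    # Layer-wise learning rate; fall back to the plain rate if a norm is zero.
    if p_norm > 0 and g_norm > 0:
        local_lr = lr * lars_coeff * p_norm / (
            g_norm + lars_weight_decay * p_norm + epsilon)
    else:
        local_lr = lr

    velocity_out = (mu * velocity
                    + local_lr * (g + lars_weight_decay * master_param)
                    ).astype(np.float32)
    master_out = (master_param - velocity_out).astype(np.float32)
    # The fp16 parameter is always derived from the fp32 master copy.
    param_out = master_out.astype(np.float16)
    return param_out, velocity_out, master_out


if __name__ == "__main__":
    master = np.ones(1, dtype=np.float32)
    grad = np.ones(1, dtype=np.float16)
    velocity = np.zeros(1, dtype=np.float32)
    param, velocity, master = lars_momentum_step(grad, velocity, master)
    print(param.dtype, master.dtype, param, master)
```

With these toy inputs the step is small enough that the fp16 parameter rounds back to its old value while the fp32 master copy retains the change, which is the usual motivation for keeping a master weight.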

The following data were measured on the ResNet50 model:
Convergence (V100):

| Global batch size | Acc | Epoch to acc |
| --- | --- | --- |
| 512 * 8 = 4096 | 75.9 | 38 |
| 128 * 256 = 32768 | 75.9 | 55 |
| 256 * 256 = 65536 | 75.9 | 87 |

Throughput (XMAN3.5 A100):

| Global batch size | Pure fp16 | Throughput | Performance gain |
| --- | --- | --- | --- |
| 208 * 1 = 208 | F | 2503 imgs/sec | NA |
| 208 * 1 = 208 | T | 2552 imgs/sec | 2.0% |
| 256 * 1 = 408 | F | 2598 imgs/sec | NA |
| 256 * 1 = 408 | T | 2635 imgs/sec | 2.1% |

add unit test for new lars op

update momentum op to add dim for master_out_p

add fp16 unit test for lars optimizer

paddle-bot-old bot commented Jun 2, 2021

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@wangxicoding wangxicoding left a comment

LGTM

@wangxicoding wangxicoding merged commit 4d805e6 into PaddlePaddle:develop Jun 3, 2021
@FeixLiu FeixLiu deleted the update_lars branch June 4, 2021 08:19