
@zhaoyang-star zhaoyang-star commented Jun 28, 2021

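[What this PR does]
In _specific cases_, conv2d_1x1 is converted to an FC computation. When input_channel is large, a single thread otherwise has to loop over input_channel multiply-accumulate operations, so this PR also uses 4x as many threads: input_channel is split into 4 parts, each thread computes one part, and the 4 threads then add their partial multiply-accumulate results together through local memory.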
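
The split described above can be sketched in plain Python. This is only an illustration, not the OpenCL kernel from this PR: the function names are hypothetical, the `local` list merely stands in for OpenCL local memory, and the sketch ignores that the real kernel handles 4 output channels per work-item. It treats one pixel of a 1x1 convolution as a matrix-vector product, which is the FC view of the operation.

```python
def conv1x1_as_fc(x, w):
    """Reference: one pixel of a 1x1 conv as a matrix-vector product.

    x: list of input_channel values for one pixel.
    w: weights laid out as [output_channel][input_channel].
    """
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def conv1x1_as_fc_split(x, w, parts=4):
    """Same result, but each output is accumulated by `parts` workers."""
    c_in = len(x)
    chunk = (c_in + parts - 1) // parts  # ceil(c_in / parts)
    out = []
    for row in w:
        # Stand-in for local memory: one partial sum per worker.
        local = [0.0] * parts
        for p in range(parts):
            lo, hi = p * chunk, min((p + 1) * chunk, c_in)
            for c in range(lo, hi):
                local[p] += row[c] * x[c]
        # Final reduction of the partial sums.
        out.append(sum(local))
    return out
```

Each worker now loops over roughly input_channel / 4 elements instead of input_channel, which is why the benefit is expected to grow with input_channel.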

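Key differences from the previous approach:

  • gws is enlarged from (output_channel / 4, 1, 1) to (output_channel / 4, 4, 1). More threads are launched, and the larger input_channel is, the greater the speedup;
  • weights are stored as half16 instead of image2d/half4. Each thread reads 16 contiguous half values linearly, which is expected (not yet verified) to improve memory access speed.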

| conv2d_1x1 | global work size | local work size |
| --- | --- | --- |
| original | (output_channel / 4, 1, 1) | best value found by tuning |
| new | (output_channel / 4, 4, 1) | fixed (32, 4, 1) |
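
As a rough sketch of how the new work sizes relate, the hypothetical helper below computes the global work size from output_channel and pads it to a multiple of the fixed local work size. The padding step is an assumption: OpenCL 1.x requires each global dimension to be divisible by the corresponding local dimension, but this PR does not state how the kernel launch handles that.

```python
def round_up(value, multiple):
    """Round value up to the nearest multiple."""
    return (value + multiple - 1) // multiple * multiple


def fc_work_sizes(output_channel, lws=(32, 4, 1)):
    """Return (gws, lws) for the FC-style conv1x1 launch (illustrative only)."""
    # Each work-item along dimension 0 produces 4 output channels;
    # dimension 1 holds the 4 threads that split input_channel.
    gws = ((output_channel + 3) // 4, 4, 1)
    # Assumption: pad every dimension to a multiple of the local size.
    padded = tuple(round_up(g, l) for g, l in zip(gws, lws))
    return padded, lws
```

For example, `fc_work_sizes(576)` gives a global size of (160, 4, 1): 576 / 4 = 144 work-items along the first dimension, padded up to the next multiple of 32.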

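[Results]
For the MobileNetV3_small_x1_0_infer model, 19 of the conv1x1 layers can be replaced by FC. The overall model speedup and the kernel speedup are as follows:
[image]

MobileNetV3_small_x1_0_infer kernel profiling on armv7 on 845
[image]

For the MobileNetV3_large_x1_0_infer model, 17 of the conv1x1 layers can be replaced by FC. The overall model speedup is as follows:
[image]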

@paddle-bot-old
Thanks for your contribution!

@zhaoyang-star zhaoyang-star requested a review from daming5432 June 29, 2021 02:46
daming5432 previously approved these changes Jun 29, 2021

@daming5432 daming5432 left a comment


LGTM


@zhaoyang-star zhaoyang-star merged commit eea1e39 into PaddlePaddle:develop Jul 1, 2021
@zhaoyang-star zhaoyang-star deleted the use_fc branch July 1, 2021 01:00
@zhaoyang-star zhaoyang-star changed the title from "Use FC replace conv1x1" to "[OpenCL][Kernel] Use FC replace conv1x1" Jul 1, 2021
zhaoyang-star added a commit to zhaoyang-star/Paddle-Lite that referenced this pull request Jul 1, 2021
daming5432 pushed a commit that referenced this pull request Jul 1, 2021
* [OpenCL][Kernel] Use FC replace conv1x1 (#6365)

* test=develop
@zhaoyang-star zhaoyang-star mentioned this pull request Jul 28, 2021