@Deleter-D
Contributor

PR Category

Custom Device

PR Types

Bug fixes

Description

card-85146

  1. paddle/phi/kernels/gpu/check_numerics_kernel.cu: HIP can now use the original CUDA kernel, so a separate HIP kernel is no longer needed.
  2. paddle/phi/kernels/strings/gpu/strings_lower_upper_kernel.cu: The number of threads must be reduced to 256 on the ROCm platform.
  3. test/cpp/fluid/elementwise/test_elementwise_op_grad_grad.h: The CUDA and HIP code paths used different standards, so they are now unified.
  4. paddle/phi/kernels/gpu/group_norm_kernel.cu: When the block size is smaller than the warp size, WarpReduce returns all zeros. This appears to be an internal hipcub problem on DCU.
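
Items 2 and 4 both come down to choosing launch parameters per platform. The following is a minimal host-side sketch of that selection logic, not the code in this PR: `Platform`, `PickThreadsPerBlock`, `CanUseWarpReduce`, and `SerialSum` are hypothetical names, and the block size requested on the CUDA path is an assumed input rather than a value taken from the source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names throughout; this is an illustration, not Paddle's code.
enum class Platform { kCuda, kRocm };

// Item 2: cap the block size at 256 threads on ROCm.
// `requested` stands in for whatever block size the CUDA path uses (assumed).
inline int PickThreadsPerBlock(Platform platform, int requested) {
  assert(requested > 0);
  const int kRocmMaxThreads = 256;  // the limit stated in the PR description
  if (platform == Platform::kRocm && requested > kRocmMaxThreads) {
    return kRocmMaxThreads;
  }
  return requested;
}

// Item 4: take the warp-level reduction path only when the block covers at
// least one full warp. The warp size is passed in because it differs by
// device (for example 32 on NVIDIA GPUs, 64 on many AMD GPUs).
inline bool CanUseWarpReduce(int block_size, int warp_size) {
  return block_size >= warp_size;
}

// Assumed fallback for the case where the warp path is not safe:
// a plain sequential sum over the values.
inline float SerialSum(const std::vector<float>& values) {
  float sum = 0.0f;
  for (float v : values) {
    sum += v;
  }
  return sum;
}
```

A caller would check `CanUseWarpReduce(block_size, warp_size)` before launching and route small blocks to a different reduction. How the PR itself works around the hipcub behaviour is not stated in the description.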

@paddle-bot

paddle-bot bot commented Jul 4, 2024

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

Contributor

@qili93 qili93 left a comment

LGTM

@qili93 qili93 merged commit 38afd03 into PaddlePaddle:develop Jul 5, 2024
@Deleter-D Deleter-D deleted the fix_dcu_ut branch July 5, 2024 02:09