tanh avx512 mask optimization #6096

lfalive · 2025-05-30T08:28:34Z

#6061

tencent-adm · 2025-05-30T08:28:48Z

All committers have signed the CLA.

codecov-commenter · 2025-05-30T08:41:40Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.70%. Comparing base (73d8500) to head (f5dcba6).
Report is 1 commit behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6096      +/-   ##
==========================================
+ Coverage   95.59%   95.70%   +0.10%     
==========================================
  Files         827      827              
  Lines      270116   270122       +6     
==========================================
+ Hits       258226   258527     +301     
+ Misses      11890    11595     -295

github-actions · 2025-05-30T08:49:35Z

The binary size change of libncnn.so (bytes)

architecture	base size	pr size
x86_64	16511232	16511232
armhf	7369820	7369820
aarch64	10775560	10775560

Copilot

Pull Request Overview

This PR optimizes the TanH function for x86 by introducing an AVX512 mask-based remainder handling block to process elements that do not fill a complete 16-element vector. Key changes include:

Using AVX512 mask load/store instructions for the remainder elements.
Removing the previously nested SSE2/AVX preprocessor directives for this block.
Maintaining backwards compatibility with a fallback to SSE2/AVX code when AVX512F is not available.
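
For illustration, the remainder handling described above might look roughly like the following. This is a minimal sketch, not the code from this PR: `tanh_inplace_sketch` and `tanh16_sketch` are hypothetical names, and `tanh16_sketch` is only a stand-in that calls `std::tanh` per lane where real SIMD code would use a vectorized approximation. The AVX512 path is compiled only when `__AVX512F__` is defined; otherwise the scalar loop does all the work.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

#if defined(__AVX512F__)
// Stand-in for a vectorized tanh (assumption): spills the 16 lanes to memory
// and calls std::tanh on each one.
static inline __m512 tanh16_sketch(__m512 v)
{
    alignas(64) float tmp[16];
    _mm512_store_ps(tmp, v);
    for (int k = 0; k < 16; k++)
        tmp[k] = std::tanh(tmp[k]);
    return _mm512_load_ps(tmp);
}
#endif

// Hypothetical helper: applies tanh in place to `size` floats.
static void tanh_inplace_sketch(float* ptr, int size)
{
    assert(size >= 0);
    int i = 0;
#if defined(__AVX512F__)
    // Main loop: full 16-float vectors.
    for (; i + 15 < size; i += 16)
    {
        __m512 v = _mm512_loadu_ps(ptr + i);
        _mm512_storeu_ps(ptr + i, tanh16_sketch(v));
    }
    // Remainder: 1 to 15 leftover floats handled by one masked load/store.
    if (i < size)
    {
        const unsigned int remain = size - i;
        // Low `remain` bits set: lane k is active only if k < remain.
        const __mmask16 mask = (__mmask16)((1u << remain) - 1);
        // Masked-off lanes are zeroed on load and left untouched on store.
        __m512 v = _mm512_maskz_loadu_ps(mask, ptr + i);
        _mm512_mask_storeu_ps(ptr + i, mask, tanh16_sketch(v));
        i = size;
    }
#endif
    // Scalar fallback; handles everything when AVX512F is not enabled.
    for (; i < size; i++)
        ptr[i] = std::tanh(ptr[i]);
}
```

With `size = 37`, for example, the AVX512 path would run two full 16-lane iterations and then a single masked iteration covering the last 5 floats, without reading or writing past the end of the buffer.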

Comments suppressed due to low confidence (1)

src/layer/x86/tanh_x86.cpp:53

[nitpick] Consider renaming 'remain' to 'remaining_elements' to enhance clarity.

const unsigned int remain = size - i;

src/layer/x86/tanh_x86.cpp

nihui · 2025-05-30T09:22:33Z

Thanks for your contribution!

tanh avx512 mask optimization

f5dcba6

github-actions bot added the x86 label May 30, 2025

nihui requested a review from Copilot May 30, 2025 08:53

Copilot AI reviewed May 30, 2025

src/layer/x86/tanh_x86.cpp (resolved)

nihui approved these changes May 30, 2025

nihui merged commit 7fd167f into Tencent:master May 30, 2025
79 of 81 checks passed

BrewTestBot mentioned this pull request Sep 16, 2025

ncnn 20250916 Homebrew/homebrew-core#243467

Merged
