Merged
290 changes: 118 additions & 172 deletions src/BenchmarkDotNet/Detectors/Cpu/HardwareIntrinsics.cs
@@ -11,38 +11,40 @@

namespace BenchmarkDotNet.Detectors.Cpu
{
// based on https://github.com/dotnet/runtime/tree/v10.0.0-rc.1.25451.107/src/coreclr/tools/Common/JitInterface/ThunkGenerator/InstructionSetDesc.txt
internal static class HardwareIntrinsics
{
internal static string GetVectorSize() => Vector.IsHardwareAccelerated ? $"VectorSize={Vector<byte>.Count * 8}" : string.Empty;

internal static string GetShortInfo()
{
if (IsX86Avx512FSupported)
return GetShortAvx512Representation();
if (IsX86Avx2Supported)
return "AVX2";
else if (IsX86AvxSupported)
return "AVX";
else if (IsX86Sse42Supported)
return "SSE4.2";
else if (IsX86Sse41Supported)
return "SSE4.1";
else if (IsX86Ssse3Supported)
return "SSSE3";
else if (IsX86Sse3Supported)
return "SSE3";
else if (IsX86Sse2Supported)
return "SSE2";
else if (IsX86SseSupported)
return "SSE";
else if (IsX86BaseSupported)
return "X86Base";
else if (IsArmAdvSimdSupported)
return "AdvSIMD";
if (IsX86BaseSupported)
{
if (IsX86Avx512Supported)
{
return "x86-64-v4";
Member Author

These are the official short names for the ISA groupings.

Member

While I am very happy about the fix and other improvements, I find myself hesitant about the new "display name":

  • most engineers know what AVX is, I am not sure about the ISA groupings (at least I was not familiar with those names so far)
  • the ISA groupings contain information that is already displayed on the screen: the architecture
BenchmarkDotNet v0.15.3-develop (2025-09-11), Windows 11 (10.0.26100.6584/24H2/2024Update/HudsonValley)
AMD Ryzen Threadripper PRO 3945WX 12-Cores 3.99GHz, 1 CPU, 24 logical and 12 physical cores
.NET SDK 10.0.100-preview.6.25358.103
-  [Host] : .NET 8.0.20 (8.0.20, 8.0.2025.41914), X64 RyuJIT AVX2
+  [Host] : .NET 8.0.20 (8.0.20, 8.0.2025.41914), X64 RyuJIT x86-64-v3

@AndreyAkinshin thoughts?

Member

I have the same concern: AVX2 is much more convenient for users than x86-64-v3. Since we have enough space on this line, I suggest printing both options.

Member Author

I'm in a bit of a disagreement here.

The names were formalized several years ago by Intel, AMD, LLVM, and others explicitly because the old approach was becoming too unwieldy, both for listing all the options and because of the rapidly exploding number of combinations that could be specified. The manufacturers largely do not want developers thinking in terms of individual ISAs anymore; they want them to target "profiles" instead, for simplicity.

This also follows the general model that Arm64 and other manufacturers use. You declare your base profile (e.g. armv8.0-a) and then optionally list any key extensions on top (e.g. armv8.0-a + lse). The intent is that you squash the complex profiles together (e.g. x86-64-v4) and then list key extensions that aren't part of a defined profile on top (e.g. x86-64-v4 + APX).

This is likewise how we've changed the JIT to largely support the feature sets, how the JIT Dumps and JIT Disasm now print the information, and how the configuration knobs for manual disablement work.

We explicitly want developers to stop thinking in terms of "AVX2" or "AVX512" in the common cases, but they still have the ability to print the full ISA list if desired (via GetFullInfo).
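
The "base profile plus extensions" model described in this comment can be sketched roughly as follows. This is a minimal illustration, not code from the PR: the `ProfileLabel` class, its `Format` method, and its parameters are hypothetical names, and the feature flags are passed in as plain booleans so the example does not depend on the host CPU.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of BenchmarkDotNet.
public static class ProfileLabel
{
    // Picks the highest x86-64 level whose feature group is reported as
    // present, then appends extensions that sit outside any defined level.
    public static string Format(bool hasAvx512, bool hasAvx2, bool hasSse42, IEnumerable<string> extras)
    {
        string level = hasAvx512 ? "x86-64-v4"
                     : hasAvx2 ? "x86-64-v3"
                     : hasSse42 ? "x86-64-v2"
                     : "x86-64-v1";

        // Extensions are joined with " + ", matching the notation used in the comment.
        string suffix = string.Join(" + ", extras);
        return suffix.Length == 0 ? level : level + " + " + suffix;
    }
}
```

Under these assumptions, `ProfileLabel.Format(true, true, true, new[] { "APX" })` would produce the `x86-64-v4 + APX` form mentioned above, while an empty extension list yields just the level name.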

Member

I have no objections to using the new notation. However, I want us to maintain backward compatibility with user habits.

> We explicitly want developers to stop thinking in terms of "AVX2" or "AVX512" in the common cases

BenchmarkDotNet does not have this goal. In this project, we want to provide a convenient user experience so that people can quickly gain insights from the information presented based on their existing knowledge. If we decide that we want to promote the new notation in addition to a convenient user experience, it makes even more sense to display both versions. This approach will naturally help people create a mental map without needing to search for details when they encounter unfamiliar labels. We can use the new notation in the first place, but then explicitly specify instruction sets inside.
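
For reference, the "display both" idea proposed in this comment might look something like the sketch below. It is only an illustration of the suggestion: `DualLabel` and `Format` are hypothetical names, and the group strings are copied from the `GetFullInfo` lines elsewhere in this diff.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of BenchmarkDotNet.
public static class DualLabel
{
    // Maps each profile name to the instruction-set group string that
    // GetFullInfo reports for it in this diff.
    private static readonly Dictionary<string, string> Groups = new Dictionary<string, string>
    {
        ["x86-64-v4"] = "AVX512 F+BW+CD+DQ+VL",
        ["x86-64-v3"] = "AVX2+BMI1+BMI2+F16C+FMA+LZCNT+MOVBE",
        ["x86-64-v2"] = "SSE3+SSSE3+SSE4.1+SSE4.2+POPCNT",
        ["x86-64-v1"] = "X86Base+SSE+SSE2",
    };

    // New notation first, explicit instruction sets in parentheses after it.
    // Unknown profiles are returned unchanged.
    public static string Format(string profile) =>
        Groups.TryGetValue(profile, out string isas) ? $"{profile} ({isas})" : profile;
}
```

With this sketch, `DualLabel.Format("x86-64-v3")` would give `x86-64-v3 (AVX2+BMI1+BMI2+F16C+FMA+LZCNT+MOVBE)`, which shows how much longer the host line becomes when both notations are printed.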

Member Author

Isn’t it fairly counterintuitive to go against what the hardware manufacturers, compiler developers, and underlying platform have all agreed is best for long-term UX?

This wasn’t some arbitrary decision, but the result of extensive discussion based around real-world pain points and confusion caused by the rapidly expanding complexity of ISAs.

---

Part of the consideration is that typical users don’t understand what things like AVX2 are, and those who roughly know often mistake it for something else. Only the people who work directly with intrinsics actually understand it.

This is very similar in concept to people understanding the brand name “Windows 8” or “Windows 11” but not understanding how that maps to version numbers (6.2 and 10.0.22000).

The profiles are meant to be explicitly clear and to give a point that shows something as being meaningfully greater without getting into technical nuance that is irrelevant to the typical case, i.e. listing things that won’t actually impact codegen or make a meaningful difference. That is an actual “problem” with how BDN lists ISAs today: it surfaces info that likely isn’t causing profiling differences by default for typical devs.

Particularly with how .NET actually works, this also applies here and simplifies what devs should be considering without removing pertinent info. They see the different baselines we support and do codegen differentiation against, nothing more unless they ask for details.

---

I strongly expect that the distaste here is more an initial visceral reaction to the “cheese being moved” from devs who do have more technical knowledge of the space. Such a reaction would likely be less strong if you had less knowledge of the space, or were more immersed in it and kept up to date with the best practices and recommendations for dealing with hardware profiles as they’ve evolved over the past 5-10 years (not just for x64 but also for Arm64 and other major CPU vendors, who are all unifying on such approaches).

Member

OK, you convinced me; let's try your way. =)
I still expect that it will cause confusion for some of our users, but it's probably not a big problem.

}
else if (IsX86Avx2Supported)
{
return "x86-64-v3";
}
else if (IsX86Sse42Supported)
{
return "x86-64-v2";
}
else
{
return "x86-64-v1";
}
}
else if (IsArmBaseSupported)
return "ArmBase";
{
return "armv8.0-a";
}
else
{
return GetVectorSize(); // Runtimes prior to .NET Core 3.0 (APIs did not exist so we print non-exact Vector info)
}
}

internal static string GetFullInfo(Platform platform)
@@ -55,32 +57,31 @@ static IEnumerable<string> GetCurrentProcessInstructionSets(Platform platform)
{
case Platform.X86:
case Platform.X64:

if (IsX86Avx512FSupported) yield return GetShortAvx512Representation();
else if (IsX86Avx2Supported) yield return "AVX2";
else if (IsX86AvxSupported) yield return "AVX";
else if (IsX86Sse42Supported) yield return "SSE4.2";
else if (IsX86Sse41Supported) yield return "SSE4.1";
else if (IsX86Ssse3Supported) yield return "SSSE3";
else if (IsX86Sse3Supported) yield return "SSE3";
else if (IsX86Sse2Supported) yield return "SSE2";
else if (IsX86SseSupported) yield return "SSE";
else if (IsX86BaseSupported) yield return "X86Base";

if (IsX86AesSupported) yield return "AES";
if (IsX86Bmi1Supported) yield return "BMI1";
if (IsX86Bmi2Supported) yield return "BMI2";
if (IsX86FmaSupported) yield return "FMA";
if (IsX86LzcntSupported) yield return "LZCNT";
if (IsX86PclmulqdqSupported) yield return "PCLMUL";
if (IsX86PopcntSupported) yield return "POPCNT";
{
if (IsX86Avx10v2Supported) yield return "AVX10v2";
Member Author

This prints the full set of supported ISAs, grouped the same way we group them in RyuJIT.

if (IsX86Avx10v1Supported)
{
yield return "AVX10v1";
yield return "AVX512 BF16+FP16";
}
if (IsX86Avx512v3Supported) yield return "AVX512 BITALG+VBMI2+VNNI+VPOPCNTDQ";
if (IsX86Avx512v2Supported) yield return "AVX512 IFMA+VBMI";
if (IsX86Avx512Supported) yield return "AVX512 F+BW+CD+DQ+VL";
if (IsX86Avx2Supported) yield return "AVX2+BMI1+BMI2+F16C+FMA+LZCNT+MOVBE";
if (IsX86AvxSupported) yield return "AVX";
if (IsX86Sse42Supported) yield return "SSE3+SSSE3+SSE4.1+SSE4.2+POPCNT";
if (IsX86BaseSupported) yield return "X86Base+SSE+SSE2";
if (IsX86AesSupported) yield return "AES+PCLMUL";
if (IsX86AvxVnniSupported) yield return "AvxVnni";
if (IsX86SerializeSupported) yield return "SERIALIZE";
// TODO: Add MOVBE when API is added.
break;
}
case Platform.Arm64:
if (IsArmAdvSimdSupported) yield return "AdvSIMD";
else if (IsArmBaseSupported) yield return "ArmBase";
{
if (IsArmBaseSupported)
{
yield return "ArmBase+AdvSimd";
}

if (IsArmAesSupported) yield return "AES";
if (IsArmCrc32Supported) yield return "CRC32";
@@ -89,71 +90,39 @@ static IEnumerable<string> GetCurrentProcessInstructionSets(Platform platform)
if (IsArmSha1Supported) yield return "SHA1";
if (IsArmSha256Supported) yield return "SHA256";
break;
}

default:
yield break;
}
}
}

private static string GetShortAvx512Representation()
{
StringBuilder avx512 = new("AVX-512F");
if (IsX86Avx512CDSupported) avx512.Append("+CD");
if (IsX86Avx512BWSupported) avx512.Append("+BW");
if (IsX86Avx512DQSupported) avx512.Append("+DQ");
if (IsX86Avx512FVLSupported) avx512.Append("+VL");
if (IsX86Avx512VbmiSupported) avx512.Append("+VBMI");

return avx512.ToString();
}

#pragma warning disable CA2252 // Some APIs require opting into preview features
internal static bool IsX86BaseSupported =>
Member Author

These are grouped based on the DOTNET_Enable* flags we expose so that we're only querying what can actually be toggled on/off.

#if NET6_0_OR_GREATER
X86Base.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.X86Base");
#endif

internal static bool IsX86SseSupported =>
#if NET6_0_OR_GREATER
Sse.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Sse");
#endif

internal static bool IsX86Sse2Supported =>
#if NET6_0_OR_GREATER
X86Base.IsSupported &&
Sse.IsSupported &&
Sse2.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.X86Base") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Sse") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Sse2");
#endif

internal static bool IsX86Sse3Supported =>
#if NET6_0_OR_GREATER
Sse3.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Sse3");
#endif

internal static bool IsX86Ssse3Supported =>
#if NET6_0_OR_GREATER
Ssse3.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Ssse3");
#endif

internal static bool IsX86Sse41Supported =>
#if NET6_0_OR_GREATER
Sse41.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Sse41");
#endif

internal static bool IsX86Sse42Supported =>
#if NET6_0_OR_GREATER
Sse42.IsSupported;
Sse3.IsSupported &&
Ssse3.IsSupported &&
Sse41.IsSupported &&
Sse42.IsSupported &&
Popcnt.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Sse42");
GetIsSupported("System.Runtime.Intrinsics.X86.Sse3") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Ssse3") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Sse41") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Sse42") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Popcnt");
#endif

internal static bool IsX86AvxSupported =>
@@ -165,107 +134,88 @@ private static string GetShortAvx512Representation()

internal static bool IsX86Avx2Supported =>
#if NET6_0_OR_GREATER
Avx2.IsSupported;
Avx2.IsSupported &&
Bmi1.IsSupported &&
Bmi2.IsSupported &&
Fma.IsSupported &&
Lzcnt.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Avx2");
#endif

internal static bool IsX86Avx512FSupported =>
#if NET8_0_OR_GREATER
Avx512F.IsSupported;
#else
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512F");
GetIsSupported("System.Runtime.Intrinsics.X86.Avx2") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Bmi1") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Bmi2") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Fma") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Lzcnt");
#endif

internal static bool IsX86Avx512FVLSupported =>
internal static bool IsX86Avx512Supported =>
#if NET8_0_OR_GREATER
Avx512F.VL.IsSupported;
Avx512F.IsSupported &&
Avx512F.VL.IsSupported &&
Avx512BW.IsSupported &&
Avx512BW.VL.IsSupported &&
Avx512CD.IsSupported &&
Avx512CD.VL.IsSupported &&
Avx512DQ.IsSupported &&
Avx512DQ.VL.IsSupported;
#else
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512F+VL");
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512F") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512F+VL") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512BW") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512BW+VL") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512CD") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512CD+VL") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512DQ") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512DQ+VL");
#endif

internal static bool IsX86Avx512BWSupported =>
internal static bool IsX86Avx512v2Supported =>
#if NET8_0_OR_GREATER
Avx512BW.IsSupported;
Avx512Vbmi.IsSupported &&
Avx512Vbmi.VL.IsSupported;
#else
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512BW");
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512Vbmi") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512Vbmi+VL");
#endif

internal static bool IsX86Avx512CDSupported =>
#if NET8_0_OR_GREATER
Avx512CD.IsSupported;
internal static bool IsX86Avx512v3Supported =>
#if NET10_0_OR_GREATER
Avx512Vbmi2.IsSupported &&
Avx512Vbmi2.VL.IsSupported;
#else
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512CD");
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512Vbmi2") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512Vbmi2+VL");
#endif

internal static bool IsX86Avx512DQSupported =>
#if NET8_0_OR_GREATER
Avx512DQ.IsSupported;
internal static bool IsX86Avx10v1Supported =>
#if NET9_0_OR_GREATER
Avx10v1.IsSupported &&
Avx10v1.V512.IsSupported;
#else
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512DQ");
GetIsSupported("System.Runtime.Intrinsics.X86.Avx10v1") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx10v1+V512");
#endif

internal static bool IsX86Avx512VbmiSupported =>
#if NET8_0_OR_GREATER
Avx512Vbmi.IsSupported;
internal static bool IsX86Avx10v2Supported =>
#if NET10_0_OR_GREATER
Avx10v2.IsSupported &&
Avx10v2.V512.IsSupported;
#else
GetIsSupported("System.Runtime.Intrinsics.X86.Avx512Vbmi");
GetIsSupported("System.Runtime.Intrinsics.X86.Avx10v2") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Avx10v2+V512");
#endif

internal static bool IsX86AesSupported =>
#if NET6_0_OR_GREATER
System.Runtime.Intrinsics.X86.Aes.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Aes");
#endif

internal static bool IsX86Bmi1Supported =>
#if NET6_0_OR_GREATER
Bmi1.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Bmi1");
#endif

internal static bool IsX86Bmi2Supported =>
#if NET6_0_OR_GREATER
Bmi2.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Bmi2");
#endif

internal static bool IsX86FmaSupported =>
#if NET6_0_OR_GREATER
Fma.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Fma");
#endif

internal static bool IsX86LzcntSupported =>
#if NET6_0_OR_GREATER
Lzcnt.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Lzcnt");
#endif

internal static bool IsX86PclmulqdqSupported =>
#if NET6_0_OR_GREATER
System.Runtime.Intrinsics.X86.Aes.IsSupported &&
Pclmulqdq.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Aes") &&
GetIsSupported("System.Runtime.Intrinsics.X86.Pclmulqdq");
#endif

internal static bool IsX86PopcntSupported =>
#if NET6_0_OR_GREATER
Popcnt.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.Popcnt");
#endif

internal static bool IsX86AvxVnniSupported =>
#if NET6_0_OR_GREATER
#pragma warning disable CA2252 // This API requires opting into preview features
AvxVnni.IsSupported;
#pragma warning restore CA2252 // This API requires opting into preview features
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.X86.AvxVnni");
#endif
@@ -279,15 +229,10 @@ private static string GetShortAvx512Representation()

internal static bool IsArmBaseSupported =>
#if NET6_0_OR_GREATER
ArmBase.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.Arm.ArmBase");
#endif

internal static bool IsArmAdvSimdSupported =>
#if NET6_0_OR_GREATER
ArmBase.IsSupported &&
AdvSimd.IsSupported;
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.Arm.ArmBase") &&
GetIsSupported("System.Runtime.Intrinsics.Arm.AdvSimd");
#endif

@@ -332,6 +277,7 @@ private static string GetShortAvx512Representation()
#elif NETSTANDARD
GetIsSupported("System.Runtime.Intrinsics.Arm.Sha256");
#endif
#pragma warning restore CA2252 // Some APIs require opting into preview features

private static bool GetIsSupported([DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicProperties)] string typeName)
{
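
The diff view cuts off before the body of `GetIsSupported`, so the actual implementation is not shown here. As a rough sketch of how a reflection-based probe of this kind could work, assuming the intrinsic types resolve through `Type.GetType` and expose a public static `IsSupported` property (the `IntrinsicProbe` class name is hypothetical):

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for illustration; the real method body is not visible in this diff.
public static class IntrinsicProbe
{
    // Looks the type up by name and reads its static IsSupported property.
    // Returns false when the type or property does not exist, which is the
    // expected outcome on runtimes that predate the intrinsic API.
    public static bool GetIsSupported(string typeName)
    {
        Type type = Type.GetType(typeName);
        if (type == null)
            return false;

        PropertyInfo property = type.GetProperty("IsSupported", BindingFlags.Public | BindingFlags.Static);
        return property != null && property.GetValue(null) is bool value && value;
    }
}
```

Nested classes are addressed with `+` in reflection type names, which is why the diff passes strings such as `System.Runtime.Intrinsics.X86.Avx512F+VL`.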