Move the implementation of Monitor to managed in CoreCLR #118371

jkoritzinsky · 2025-08-05T01:37:17Z

This allows us to share more code with NativeAOT, reduces a decent amount of complexity in the runtime, and fixes a blocking issue for #117788.

src/coreclr/vm/comsynchronizable.cpp

src/coreclr/System.Private.CoreLib/src/System/Threading/SyncTable.CoreCLR.cs

…er once to avoid racing. Furthermore, handle an uninitialized lock ID in ObjectHeader.CoreCLR.cs. There's a possibility that the first time the managed Lock type's thread id tracking is used is on releasing a lock that was upgraded to a fat lock.

…inlock for thin->fat lock upgrading

… point

jkoritzinsky · 2025-09-04T21:46:08Z

I've improved the uncontended thick-lock case so that it is only about 1.7 ns slower (16.88 ns vs. 15.17 ns in the run below), and I've exhausted all of my ideas.

Method	Job	Toolchain	Mean	Error	StdDev	Ratio	RatioSD
Uncontended	DefaultJob	Default	15.17 ns	0.328 ns	0.291 ns	1.00	0.03
Uncontended	Job-QMWUGV	CoreRun	16.88 ns	0.155 ns	0.137 ns	1.11	0.02

At this point it's as fast as the uncontended thin-lock case (within measurement error).
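As a quick sanity check on the table above, the absolute gap and the ratio can be recomputed from the two reported means. This is only a sketch: the variable and function names are illustrative, and BenchmarkDotNet computes its Ratio column from the underlying measurements rather than from the rounded means, so tiny differences from the printed column would not be surprising.

```python
# Means copied from the two "Uncontended" rows above, in nanoseconds.
baseline_mean_ns = 15.17  # Default toolchain
pr_mean_ns = 16.88        # CoreRun toolchain (this PR)


def compare(baseline: float, candidate: float) -> tuple[float, float]:
    """Return (absolute difference, candidate / baseline ratio)."""
    return candidate - baseline, candidate / baseline


delta_ns, ratio = compare(baseline_mean_ns, pr_mean_ns)
print(f"delta = {delta_ns:.2f} ns, ratio = {ratio:.2f}")
```

The recomputed ratio rounds to the same 1.11 shown in the Ratio column.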

Can I get another review pass?

jkoritzinsky · 2025-09-04T23:30:23Z

@MihuBot benchmark System.Collections.Concurrent

MihaZupan · 2025-09-05T01:04:49Z

System.Collections.Concurrent.AddRemoveFromDifferentThreads<Int32>.ConcurrentBag(Size: 2000000) got stuck spinning one core.

It does appear to be disabled for AOT already:
https://github.com/dotnet/performance/blob/84d81aab28f1f50b3ac90231411e03e923d94278/src/benchmarks/micro/libraries/System.Collections/Concurrent/AddRemoveFromDifferentThreads.cs#L19-L20

Here's a core dump if that helps: https://1drv.ms/f/c/17a1c1fca6517cd3/EhG6thpcjJFAsd09HR6rxHwBEGOhuj-rKC_j_WKUg5P0rg

jkotas · 2025-09-05T02:44:54Z

> System.Collections.Concurrent.AddRemoveFromDifferentThreads.ConcurrentBag(Size: 2000000) got stuck spinning one core.

Sounds like a bug that we are "porting" to coreclr now?

jkoritzinsky · 2025-09-05T19:55:48Z

I'll take a look at the failure. It's possible it's the same as the one in the PR checks here (which is new as of 2 days ago, so that's fun).

Otherwise, I'd bet that we need to introduce a Thread.Yield call somewhere in System.Threading.Lock where we used to yield in the AwareLock impl.
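For readers unfamiliar with the idea, here is a generic sketch of "spin a bounded number of times, then yield" on the acquire path. It is written in Python purely for illustration and is not the code in System.Threading.Lock or AwareLock: the class name, the `spin_limit` knob, and the use of `time.sleep(0)` as a stand-in for Thread.Yield are all assumptions.

```python
import threading
import time


class SpinThenYieldLock:
    """Toy lock: spin on a non-blocking acquire, yielding after a bounded spin."""

    def __init__(self, spin_limit: int = 100) -> None:
        self._inner = threading.Lock()  # used only as an atomic try-acquire flag
        self._spin_limit = spin_limit   # hypothetical tuning knob

    def acquire(self) -> None:
        spins = 0
        while not self._inner.acquire(blocking=False):
            spins += 1
            if spins >= self._spin_limit:
                # Give up the time slice so the current owner can make progress
                # instead of burning a core in a tight loop.
                time.sleep(0)
                spins = 0

    def release(self) -> None:
        self._inner.release()


def run_demo(threads: int = 4, iterations: int = 2000) -> int:
    """Increment a shared counter from several threads under the toy lock."""
    lock = SpinThenYieldLock()
    counter = 0

    def worker() -> None:
        nonlocal counter
        for _ in range(iterations):
            lock.acquire()
            try:
                counter += 1
            finally:
                lock.release()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter


total = run_demo()
print(total)
```

Without the yield, a waiter that never gives up its time slice can starve the owner on an oversubscribed machine, which is one way a benchmark can end up spinning a core without making progress.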

…ke how it's done in main.

jkoritzinsky · 2025-09-05T22:11:55Z

Looking at the linked issue, I think the NativeAOT failure was due to #67805 (linked from #66987). I also think that #73033 may have contributed to fixing it.

I think the failures from this PR's run of the benchmark were due to bugs in the lock-free algorithms I wrote (same cause as the PR failures).

…emented.

… false

jkotas · 2025-09-10T01:34:52Z

@MihuBot benchmark System.Collections.Concurrent

MihuBot · 2025-09-10T02:03:59Z

System.Collections.Concurrent.IsEmpty_String_

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-CXNRMC : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-OOWMKK : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Size	Mean	Error	Ratio	Allocated	Alloc Ratio
Dictionary	Main	0	60.3069 ns	0.3031 ns	1.00	-	NA
Dictionary	PR	0	67.6643 ns	0.0902 ns	1.12	-	NA

Queue	Main	0	1.7502 ns	0.0138 ns	1.00	-	NA
Queue	PR	0	2.1883 ns	0.0106 ns	1.25	-	NA

Stack	Main	0	0.0005 ns	0.0003 ns	?	-	?
Stack	PR	0	0.0007 ns	0.0011 ns	?	-	?

Bag	Main	0	6.4018 ns	0.0270 ns	1.00	-	NA
Bag	PR	0	7.2359 ns	0.0168 ns	1.13	-	NA

Dictionary	Main	512	2.9015 ns	0.0024 ns	1.00	-	NA
Dictionary	PR	512	2.9077 ns	0.0080 ns	1.00	-	NA

Queue	Main	512	1.3167 ns	0.0343 ns	1.00	-	NA
Queue	PR	512	1.2773 ns	0.0052 ns	0.97	-	NA

Stack	Main	512	0.0010 ns	0.0013 ns	?	-	?
Stack	PR	512	0.0006 ns	0.0004 ns	?	-	?

Bag	Main	512	5.9004 ns	0.0138 ns	1.00	-	NA
Bag	PR	512	6.4420 ns	0.0157 ns	1.09	-	NA
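When reading tables like the one above, one rough way to separate real differences from noise is to check whether the Main and PR intervals `mean ± error` overlap. The sketch below applies that heuristic to a few rows copied from the table. The `Row` type and the "noise?"/"differs" labels are illustrative, and interval overlap is a simplification rather than the statistical test a dedicated comparison tool would run.

```python
from dataclasses import dataclass


@dataclass
class Row:
    method: str
    size: int
    main_mean: float   # ns
    main_error: float  # ns
    pr_mean: float     # ns
    pr_error: float    # ns


# Values copied from the IsEmpty_String_ table above.
rows = [
    Row("Dictionary", 0, 60.3069, 0.3031, 67.6643, 0.0902),
    Row("Queue", 0, 1.7502, 0.0138, 2.1883, 0.0106),
    Row("Stack", 0, 0.0005, 0.0003, 0.0007, 0.0011),
    Row("Dictionary", 512, 2.9015, 0.0024, 2.9077, 0.0080),
]


def intervals_overlap(r: Row) -> bool:
    """True when [mean - error, mean + error] for Main and PR intersect."""
    main_lo, main_hi = r.main_mean - r.main_error, r.main_mean + r.main_error
    pr_lo, pr_hi = r.pr_mean - r.pr_error, r.pr_mean + r.pr_error
    return main_lo <= pr_hi and pr_lo <= main_hi


verdicts = {
    (r.method, r.size): ("noise?" if intervals_overlap(r) else "differs")
    for r in rows
}
for key, verdict in verdicts.items():
    print(key, verdict)
```

Under this heuristic the Size 0 Dictionary and Queue rows stand out as real regressions, while the Stack rows (sub-picosecond means with a "?" ratio) are effectively below what the benchmark can measure.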

System.Collections.Concurrent.IsEmpty_Int32_

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-CXNRMC : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-OOWMKK : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Size	Mean	Error	Ratio	Allocated	Alloc Ratio
Dictionary	Main	0	60.3510 ns	0.4007 ns	1.00	-	NA
Dictionary	PR	0	66.1922 ns	0.0347 ns	1.10	-	NA

Queue	Main	0	2.0083 ns	0.0117 ns	1.00	-	NA
Queue	PR	0	2.2453 ns	0.0133 ns	1.12	-	NA

Stack	Main	0	0.0017 ns	0.0003 ns	1.03	-	NA
Stack	PR	0	0.0001 ns	0.0001 ns	0.06	-	NA

Bag	Main	0	3.6648 ns	0.0194 ns	1.00	-	NA
Bag	PR	0	3.6604 ns	0.0138 ns	1.00	-	NA

Dictionary	Main	512	2.9330 ns	0.0019 ns	1.00	-	NA
Dictionary	PR	512	3.0051 ns	0.0205 ns	1.02	-	NA

Queue	Main	512	1.2885 ns	0.0058 ns	1.00	-	NA
Queue	PR	512	1.2807 ns	0.0039 ns	0.99	-	NA

Stack	Main	512	0.0012 ns	0.0013 ns	?	-	?
Stack	PR	512	0.0000 ns	0.0001 ns	?	-	?

Bag	Main	512	3.0776 ns	0.0147 ns	1.00	-	NA
Bag	PR	512	3.0538 ns	0.0091 ns	0.99	-	NA

System.Collections.Concurrent.Count_String_

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-CXNRMC : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-OOWMKK : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Size	Mean	Error	Ratio	Allocated	Alloc Ratio
Dictionary	Main	512	58.478 ns	0.1171 ns	1.00	-	NA
Dictionary	PR	512	64.779 ns	0.0591 ns	1.11	-	NA

Queue	Main	512	2.578 ns	0.0140 ns	1.00	-	NA
Queue	PR	512	2.563 ns	0.0132 ns	0.99	-	NA

Queue_EnqueueCountDequeue	Main	512	13.289 ns	0.0679 ns	1.00	-	NA
Queue_EnqueueCountDequeue	PR	512	13.150 ns	0.0398 ns	0.99	-	NA

Stack	Main	512	565.748 ns	0.1018 ns	1.00	-	NA
Stack	PR	512	565.683 ns	0.2087 ns	1.00	-	NA

Bag	Main	512	17.590 ns	0.0718 ns	1.00	-	NA
Bag	PR	512	17.101 ns	0.0325 ns	0.97	-	NA

System.Collections.Concurrent.Count_Int32_

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-CXNRMC : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-OOWMKK : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Size	Mean	Error	Ratio	Allocated	Alloc Ratio
Dictionary	Main	512	58.177 ns	0.1263 ns	1.00	-	NA
Dictionary	PR	512	66.273 ns	0.0329 ns	1.14	-	NA

Queue	Main	512	2.406 ns	0.0267 ns	1.00	-	NA
Queue	PR	512	2.362 ns	0.0115 ns	0.98	-	NA

Queue_EnqueueCountDequeue	Main	512	11.228 ns	0.0710 ns	1.00	-	NA
Queue_EnqueueCountDequeue	PR	512	11.439 ns	0.0572 ns	1.02	-	NA

Stack	Main	512	566.394 ns	0.1977 ns	1.00	-	NA
Stack	PR	512	565.759 ns	0.2387 ns	1.00	-	NA

Bag	Main	512	15.840 ns	0.0750 ns	1.00	-	NA
Bag	PR	512	17.014 ns	0.0142 ns	1.07	-	NA

System.Collections.Concurrent.AddRemoveFromSameThreads_String_

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-EKTIQB : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-TCXCDL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  InvocationCount=1
IterationTime=250ms  MaxIterationCount=20  MaxWarmupIterationCount=10
MemoryRandomization=Default  MinIterationCount=15  MinWarmupIterationCount=6
UnrollFactor=1  WarmupCount=-1

Method	Toolchain	Size	Mean	Error	Ratio	Allocated	Alloc Ratio
ConcurrentBag	Main	2000000	234.7 ms	8.60 ms	1.00	1.51 KB	1.00
ConcurrentBag	PR	2000000	234.2 ms	10.32 ms	1.00	1.46 KB	0.97

ConcurrentStack	Main	2000000	106.6 ms	7.86 ms	1.01	125000.68 KB	1.00
ConcurrentStack	PR	2000000	118.6 ms	3.58 ms	1.12	125000.63 KB	1.00

ConcurrentQueue	Main	2000000	362.3 ms	13.26 ms	1.00	32.77 KB	1.00
ConcurrentQueue	PR	2000000	348.0 ms	12.36 ms	0.96	16.51 KB	0.50

System.Collections.Concurrent.AddRemoveFromSameThreads_Int32_

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-EKTIQB : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-TCXCDL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  InvocationCount=1
IterationTime=250ms  MaxIterationCount=20  MaxWarmupIterationCount=10
MemoryRandomization=Default  MinIterationCount=15  MinWarmupIterationCount=6
UnrollFactor=1  WarmupCount=-1

Method	Toolchain	Size	Mean	Error	Ratio	Allocated	Alloc Ratio
ConcurrentBag	Main	2000000	162.77 ms	20.699 ms	1.02	1.23 KB	1.00
ConcurrentBag	PR	2000000	173.14 ms	11.390 ms	1.09	1.18 KB	0.96

ConcurrentStack	Main	2000000	74.98 ms	4.917 ms	1.01	125000.4 KB	1.00
ConcurrentStack	PR	2000000	75.81 ms	7.255 ms	1.02	125000.91 KB	1.00

ConcurrentQueue	Main	2000000	344.56 ms	9.666 ms	1.00	33.55 KB	1.00
ConcurrentQueue	PR	2000000	346.67 ms	6.745 ms	1.01	33.77 KB	1.01

System.Collections.Concurrent.AddRemoveFromDifferentThreads_String_

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-EKTIQB : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-TCXCDL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  InvocationCount=1
IterationTime=250ms  MaxIterationCount=20  MaxWarmupIterationCount=10
MemoryRandomization=Default  MinIterationCount=15  MinWarmupIterationCount=6
UnrollFactor=1  WarmupCount=-1

Method	Toolchain	Size	Mean	Error	Ratio	Allocated	Alloc Ratio
ConcurrentBag	Main	2000000	173.84 ms	23.256 ms	1.03	32 MB	1.00
ConcurrentBag	PR	2000000	186.80 ms	20.239 ms	1.10	32 MB	1.00

ConcurrentStack	Main	2000000	65.35 ms	8.207 ms	1.02	61.04 MB	1.00
ConcurrentStack	PR	2000000	62.36 ms	10.825 ms	0.97	61.04 MB	1.00

ConcurrentQueue	Main	2000000	66.13 ms	11.784 ms	1.06	32 MB	1.00
ConcurrentQueue	PR	2000000	46.68 ms	15.384 ms	0.75	8 MB	0.25

System.Collections.Concurrent.AddRemoveFromDifferentThreads_Int32_

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-EKTIQB : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-TCXCDL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  InvocationCount=1
IterationTime=250ms  MaxIterationCount=20  MaxWarmupIterationCount=10
MemoryRandomization=Default  MinIterationCount=15  MinWarmupIterationCount=6
UnrollFactor=1  WarmupCount=-1

Method	Toolchain	Size	Mean	Error	Ratio	Allocated	Alloc Ratio
ConcurrentBag	Main	2000000	172.54 ms	27.029 ms	1.04	16 MB	1.00
ConcurrentBag	PR	2000000	174.39 ms	28.161 ms	1.05	16 MB	1.00

ConcurrentStack	Main	2000000	61.98 ms	6.693 ms	1.02	61.04 MB	1.00
ConcurrentStack	PR	2000000	53.91 ms	7.852 ms	0.88	61.04 MB	1.00

ConcurrentQueue	Main	2000000	37.24 ms	13.043 ms	1.19	4 MB	1.00
ConcurrentQueue	PR	2000000	43.85 ms	12.724 ms	1.40	2 MB	0.50

jkotas · 2025-09-10T03:42:04Z

@MihuBot benchmark System.Threading

MihuBot · 2025-09-10T04:26:15Z

System.Threading.Tests.Perf_Volatile

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BOPHZX : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BLODRC : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
Write_double	Main	0.0006 ns	0.0003 ns	1.22	-	NA
Write_double	PR	0.0000 ns	0.0001 ns	0.06	-	NA

Read_double	Main	0.0000 ns	0.0000 ns	?	-	?
Read_double	PR	0.0000 ns	0.0000 ns	?	-	?

System.Threading.Tests.Perf_Timer

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-BOPHZX : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BLODRC : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
ShortScheduleAndDispose	Main	82.86 ns	0.914 ns	1.00	120 B	1.00
ShortScheduleAndDispose	PR	84.02 ns	0.626 ns	1.01	120 B	1.00

LongScheduleAndDispose	Main	85.39 ns	0.774 ns	1.00	120 B	1.00
LongScheduleAndDispose	PR	83.42 ns	0.884 ns	0.98	120 B	1.00

ScheduleManyThenDisposeMany	Main	248,429,539.45 ns	4,893,236.838 ns	1.00	144001192 B	1.00
ScheduleManyThenDisposeMany	PR	250,414,633.12 ns	4,806,916.668 ns	1.01	144000840 B	1.00

ShortScheduleAndDisposeWithFiringTimers	Main	94.40 ns	3.273 ns	1.00	144 B	1.00
ShortScheduleAndDisposeWithFiringTimers	PR	94.41 ns	3.250 ns	1.00	144 B	1.00

SynchronousContention	Main	1,419,683,585.36 ns	8,332,486.154 ns	1.00	1152001384 B	1.00
SynchronousContention	PR	1,473,812,010.27 ns	15,917,775.117 ns	1.04	1152001744 B	1.00

AsynchronousContention	Main	1,121,848,992.35 ns	29,153,217.234 ns	1.00	1152002568 B	1.00
AsynchronousContention	PR	1,084,313,660.75 ns	12,161,717.337 ns	0.97	1152002648 B	1.00

System.Threading.Tests.Perf_ThreadStatic

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
GetThreadStatic	Main	2.231 ns	0.1305 ns	1.00	-	NA
GetThreadStatic	PR	1.347 ns	0.0021 ns	0.61	-	NA

SetThreadStatic	Main	2.730 ns	0.0020 ns	1.00	-	NA
SetThreadStatic	PR	2.981 ns	0.0005 ns	1.09	-	NA

System.Threading.Tests.Perf_ThreadPool

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1  Gen0=38000.0000

Method	Toolchain	WorkItemsPerCore	Mean	Error	Ratio	Allocated	Alloc Ratio
QueueUserWorkItem_WaitCallback_Throughput	Main	20000000	2.164 s	0.0081 s	1.00	610.35 MB	1.00
QueueUserWorkItem_WaitCallback_Throughput	PR	20000000	2.177 s	0.0206 s	1.01	610.35 MB	1.00

System.Threading.Tests.Perf_Thread

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BOPHZX : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BLODRC : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
CurrentThread	Main	1.926 ns	0.0010 ns	1.00	-	NA
CurrentThread	PR	2.053 ns	0.1157 ns	1.07	-	NA

GetCurrentProcessorId	Main	1.638 ns	0.0050 ns	1.00	-	NA
GetCurrentProcessorId	PR	1.637 ns	0.0004 ns	1.00	-	NA

System.Threading.Tests.Perf_SpinLock

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
EnterExit	Main	2.9704 ns	0.0009 ns	1.00	-	NA
EnterExit	PR	2.9755 ns	0.0064 ns	1.00	-	NA

TryEnterExit	Main	2.9753 ns	0.0029 ns	1.00	-	NA
TryEnterExit	PR	2.9738 ns	0.0044 ns	1.00	-	NA

TryEnter_Fail	Main	0.9914 ns	0.0007 ns	1.00	-	NA
TryEnter_Fail	PR	0.9900 ns	0.0006 ns	1.00	-	NA

System.Threading.Tests.Perf_SemaphoreSlim

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
ReleaseWait	Main	21.47 ns	0.022 ns	1.00	-	NA
ReleaseWait	PR	23.81 ns	0.121 ns	1.11	-	NA

ReleaseWaitAsync	Main	21.60 ns	0.031 ns	1.00	-	NA
ReleaseWaitAsync	PR	23.34 ns	0.106 ns	1.08	-	NA

ReleaseWaitAsync_WithCancellationToken	Main	769.73 ns	7.134 ns	1.00	376 B	1.00
ReleaseWaitAsync_WithCancellationToken	PR	735.11 ns	11.638 ns	0.96	376 B	1.00

ReleaseWaitAsync_WithTimeout	Main	790.34 ns	14.895 ns	1.00	472 B	1.00
ReleaseWaitAsync_WithTimeout	PR	789.72 ns	10.004 ns	1.00	472 B	1.00

ReleaseWaitAsync_WithCancellationTokenAndTimeout	Main	847.20 ns	7.681 ns	1.00	472 B	1.00
ReleaseWaitAsync_WithCancellationTokenAndTimeout	PR	871.74 ns	13.513 ns	1.03	472 B	1.00

System.Threading.Tests.Perf_Monitor

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
EnterExit	Main	6.799 ns	0.0034 ns	1.00	-	NA
EnterExit	PR	6.621 ns	0.0081 ns	0.97	-	NA

TryEnterExit	Main	7.362 ns	0.0375 ns	1.00	-	NA
TryEnterExit	PR	6.636 ns	0.0161 ns	0.90	-	NA

System.Threading.Tests.Perf_Lock

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
ReaderWriterLockSlimPerf	Main	9.731 ns	0.0306 ns	1.00	-	NA
ReaderWriterLockSlimPerf	PR	10.614 ns	0.0248 ns	1.09	-	NA

System.Threading.Tests.Perf_Interlocked

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
Increment_int	Main	0.6323 ns	0.0022 ns	1.00	-	NA
Increment_int	PR	0.6321 ns	0.0035 ns	1.00	-	NA

Decrement_int	Main	0.6330 ns	0.0031 ns	1.00	-	NA
Decrement_int	PR	0.6344 ns	0.0016 ns	1.00	-	NA

Increment_long	Main	0.6380 ns	0.0025 ns	1.00	-	NA
Increment_long	PR	0.6380 ns	0.0020 ns	1.00	-	NA

Decrement_long	Main	0.6318 ns	0.0040 ns	1.00	-	NA
Decrement_long	PR	0.6342 ns	0.0030 ns	1.00	-	NA

Add_int	Main	0.6317 ns	0.0021 ns	1.00	-	NA
Add_int	PR	0.6318 ns	0.0033 ns	1.00	-	NA

Add_long	Main	0.6315 ns	0.0056 ns	1.00	-	NA
Add_long	PR	0.6331 ns	0.0028 ns	1.00	-	NA

Exchange_int	Main	0.7081 ns	0.0005 ns	1.00	-	NA
Exchange_int	PR	0.7082 ns	0.0009 ns	1.00	-	NA

Exchange_long	Main	0.7081 ns	0.0005 ns	1.00	-	NA
Exchange_long	PR	0.7077 ns	0.0003 ns	1.00	-	NA

CompareExchange_int	Main	0.9255 ns	0.0019 ns	1.00	-	NA
CompareExchange_int	PR	0.9247 ns	0.0006 ns	1.00	-	NA

CompareExchange_long	Main	0.9257 ns	0.0009 ns	1.00	-	NA
CompareExchange_long	PR	0.9247 ns	0.0019 ns	1.00	-	NA

CompareExchange_object_Match	Main	1.3158 ns	0.1043 ns	1.01	-	NA
CompareExchange_object_Match	PR	0.9464 ns	0.1192 ns	0.73	-	NA

CompareExchange_object_NoMatch	Main	1.1155 ns	0.0082 ns	1.00	-	NA
CompareExchange_object_NoMatch	PR	1.2995 ns	0.1546 ns	1.16	-	NA

System.Threading.Tests.Perf_EventWaitHandle

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1  Median=153.7 ns

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
Set_Reset	Main	153.7 ns	0.35 ns	1.00	-	NA
Set_Reset	PR	153.5 ns	0.41 ns	1.00	-	NA

System.Threading.Tests.Perf_CancellationToken

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BOPHZX : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BLODRC : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
RegisterAndUnregister_Serial	Main	24.722 ns	0.4269 ns	1.00	-	NA
RegisterAndUnregister_Serial	PR	21.982 ns	0.5715 ns	0.89	-	NA

Cancel	Main	45.318 ns	0.4535 ns	1.00	192 B	1.00
Cancel	PR	46.471 ns	0.3645 ns	1.03	192 B	1.00

CreateLinkedTokenSource1	Main	28.261 ns	0.5563 ns	1.00	64 B	1.00
CreateLinkedTokenSource1	PR	27.548 ns	0.6255 ns	0.98	64 B	1.00

CreateLinkedTokenSource2	Main	44.228 ns	0.8997 ns	1.00	80 B	1.00
CreateLinkedTokenSource2	PR	44.838 ns	0.9604 ns	1.01	80 B	1.00

CreateLinkedTokenSource3	Main	72.391 ns	0.5782 ns	1.00	128 B	1.00
CreateLinkedTokenSource3	PR	72.472 ns	1.0065 ns	1.00	128 B	1.00

CreateTokenDispose	Main	6.282 ns	0.0123 ns	1.00	48 B	1.00
CreateTokenDispose	PR	5.993 ns	0.0109 ns	0.95	48 B	1.00

CreateRegisterDispose	Main	38.807 ns	0.1966 ns	1.00	192 B	1.00
CreateRegisterDispose	PR	38.616 ns	0.1736 ns	1.00	192 B	1.00

CreateManyRegisterDispose	Main	12.571 ns	0.0575 ns	1.00	-	NA
CreateManyRegisterDispose	PR	12.797 ns	0.1724 ns	1.02	-	NA

CreateManyRegisterMultipleDispose	Main	95.185 ns	6.9869 ns	1.01	-	NA
CreateManyRegisterMultipleDispose	PR	95.458 ns	6.9062 ns	1.01	-	NA

CancelAfter	Main	59.553 ns	0.6270 ns	1.00	144 B	1.00
CancelAfter	PR	57.732 ns	0.8863 ns	0.97	144 B	1.00

System.Threading.Tasks.Tests.Perf_AsyncMethods

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
EmptyAsyncMethodInvocation	Main	5.055 ns	0.0462 ns	1.00	-	NA
EmptyAsyncMethodInvocation	PR	5.032 ns	0.0071 ns	1.00	-	NA

SingleYieldMethodInvocation	Main	356.205 ns	4.1456 ns	1.00	96 B	1.00
SingleYieldMethodInvocation	PR	354.236 ns	2.2640 ns	0.99	96 B	1.00

Yield	Main	176.770 ns	1.1851 ns	1.00	-	NA
Yield	PR	180.182 ns	1.5086 ns	1.02	-	NA

System.Threading.Tasks.ValueTaskPerfTest

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-NIYAAS : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-HPQHWZ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-FQHDCZ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-VNBZDK : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MaxWarmupIterationCount=10  MinIterationCount=15  MinWarmupIterationCount=2
WarmupCount=-1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
Await_FromResult	Main	5.312 ns	0.0101 ns	1.00	-	NA
Await_FromResult	PR	5.427 ns	0.0080 ns	1.02	-	NA

Await_FromCompletedTask	Main	11.415 ns	0.0515 ns	1.00	72 B	1.00
Await_FromCompletedTask	PR	11.822 ns	0.0382 ns	1.04	72 B	1.00

Await_FromCompletedValueTaskSource	Main	18.028 ns	0.0472 ns	1.00	72 B	1.00
Await_FromCompletedValueTaskSource	PR	17.061 ns	0.0455 ns	0.95	72 B	1.00

CreateAndAwait_FromResult	Main	5.339 ns	0.0066 ns	1.00	-	NA
CreateAndAwait_FromResult	PR	5.456 ns	0.0076 ns	1.02	-	NA

CreateAndAwait_FromResult_ConfigureAwait	Main	5.312 ns	0.0070 ns	1.00	-	NA
CreateAndAwait_FromResult_ConfigureAwait	PR	5.320 ns	0.0288 ns	1.00	-	NA

CreateAndAwait_FromCompletedTask	Main	7.039 ns	0.0104 ns	1.00	-	NA
CreateAndAwait_FromCompletedTask	PR	7.264 ns	0.0055 ns	1.03	-	NA

CreateAndAwait_FromCompletedTask_ConfigureAwait	Main	7.924 ns	0.0175 ns	1.00	-	NA
CreateAndAwait_FromCompletedTask_ConfigureAwait	PR	8.186 ns	0.0067 ns	1.03	-	NA

CreateAndAwait_FromCompletedValueTaskSource	Main	8.413 ns	0.0199 ns	1.00	-	NA
CreateAndAwait_FromCompletedValueTaskSource	PR	8.418 ns	0.0171 ns	1.00	-	NA

CreateAndAwait_FromYieldingAsyncMethod	Main	539.190 ns	15.0818 ns	1.00	208 B	1.00
CreateAndAwait_FromYieldingAsyncMethod	PR	518.297 ns	28.7200 ns	0.96	208 B	1.00

CreateAndAwait_FromDelayedTCS	Main	83.099 ns	0.3351 ns	1.00	216 B	1.00
CreateAndAwait_FromDelayedTCS	PR	83.483 ns	0.3325 ns	1.00	216 B	1.00

Copy_PassAsArgumentAndReturn_FromResult	Main	1.961 ns	0.0021 ns	1.00	-	NA
Copy_PassAsArgumentAndReturn_FromResult	PR	1.961 ns	0.0024 ns	1.00	-	NA

Copy_PassAsArgumentAndReturn_FromTask	Main	3.046 ns	0.0037 ns	1.00	-	NA
Copy_PassAsArgumentAndReturn_FromTask	PR	3.045 ns	0.0044 ns	1.00	-	NA

Copy_PassAsArgumentAndReturn_FromValueTaskSource	Main	6.780 ns	0.0047 ns	1.00	-	NA
Copy_PassAsArgumentAndReturn_FromValueTaskSource	PR	6.776 ns	0.0029 ns	1.00	-	NA

CreateAndAwait_FromCompletedValueTaskSource_ConfigureAwait	Main	11.717 ns	0.0231 ns	1.00	-	NA
CreateAndAwait_FromCompletedValueTaskSource_ConfigureAwait	PR	11.751 ns	0.0497 ns	1.00	-	NA

System.Threading.Channels.Tests.UnboundedChannelPerfTests

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
TryWriteThenTryRead	Main	18.70 ns	0.033 ns	1.00	-	NA
TryWriteThenTryRead	PR	19.91 ns	0.015 ns	1.07	-	NA

WriteAsyncThenReadAsync	Main	26.03 ns	0.018 ns	1.00	-	NA
WriteAsyncThenReadAsync	PR	26.30 ns	0.031 ns	1.01	-	NA

ReadAsyncThenWriteAsync	Main	42.82 ns	0.090 ns	1.00	-	NA
ReadAsyncThenWriteAsync	PR	44.77 ns	0.073 ns	1.05	-	NA

PingPong	Main	7,263,334.83 ns	124,832.259 ns	1.00	1010 B	1.00
PingPong	PR	7,341,565.94 ns	138,260.388 ns	1.01	1140 B	1.13

System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
TryWriteThenTryRead	Main	20.29 ns	0.128 ns	1.00	-	NA
TryWriteThenTryRead	PR	20.68 ns	0.103 ns	1.02	-	NA

WriteAsyncThenReadAsync	Main	31.46 ns	0.106 ns	1.00	-	NA
WriteAsyncThenReadAsync	PR	31.48 ns	0.028 ns	1.00	-	NA

ReadAsyncThenWriteAsync	Main	39.89 ns	0.262 ns	1.00	-	NA
ReadAsyncThenWriteAsync	PR	42.03 ns	0.038 ns	1.05	-	NA

PingPong	Main	7,000,869.83 ns	45,956.310 ns	1.00	996 B	1.00
PingPong	PR	7,385,030.19 ns	144,106.706 ns	1.05	1140 B	1.14

System.Threading.Channels.Tests.BoundedChannelPerfTests

BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-TLUDJO : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-QKIVUI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1

Method	Toolchain	Mean	Error	Ratio	Allocated	Alloc Ratio
TryWriteThenTryRead	Main	27.73 ns	0.033 ns	1.00	-	NA
TryWriteThenTryRead	PR	30.12 ns	0.061 ns	1.09	-	NA

WriteAsyncThenReadAsync	Main	35.47 ns	0.196 ns	1.00	-	NA
WriteAsyncThenReadAsync	PR	36.46 ns	0.134 ns	1.03	-	NA

ReadAsyncThenWriteAsync	Main	38.84 ns	0.173 ns	1.00	-	NA
ReadAsyncThenWriteAsync	PR	45.22 ns	0.032 ns	1.16	-	NA

PingPong	Main	7,189,249.38 ns	49,509.942 ns	1.00	1014 B	1.00
PingPong	PR	7,537,659.60 ns	121,117.126 ns	1.05	1123 B	1.11

jkoritzinsky · 2025-09-10T22:52:33Z

I ran these benchmarks locally, compared the results with the perf team's ResultsComparer tooling (using their recommended 2% threshold), and got the following results:

summary:
better: 23, geomean: 1.099
worse: 11, geomean: 1.093
total diff: 34

Slower	diff/base	Base Median (ns)	Diff Median (ns)	Modality
System.Threading.Tasks.Tests.Perf_AsyncMethods.Yield	1.43	157.16	224.16
System.Threading.Channels.Tests.UnboundedChannelPerfTests.TryWriteThenTryRead	1.14	29.64	33.64
System.Collections.Concurrent.AddRemoveFromDifferentThreads.ConcurrentSt	1.11	89349750.00	98731550.00	several?
System.Threading.Tasks.ValueTaskPerfTest.CreateAndAwait_FromResult_ConfigureAwai	1.07	5.32	5.70
System.Threading.Tests.Perf_Timer.AsynchronousContention	1.07	4311523100.00	4596094950.00
System.Threading.Tasks.ValueTaskPerfTest.CreateAndAwait_FromCompletedValueTaskSo	1.06	8.60	9.14
System.Threading.Channels.Tests.BoundedChannelPerfTests.ReadAsyncThenWriteAsync	1.06	60.48	63.94	bimodal
System.Threading.Tasks.Tests.Perf_AsyncMethods.SingleYieldMethodInvocation	1.05	496.56	519.55
System.Threading.Tasks.ValueTaskPerfTest.Await_FromCompletedTask	1.04	12.26	12.80
System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.WriteAsyncThenRead	1.04	31.39	32.50
System.Threading.Channels.Tests.BoundedChannelPerfTests.WriteAsyncThenReadAsync	1.03	46.94	48.23

Faster	base/diff	Base Median (ns)	Diff Median (ns)	Modality
System.Collections.Concurrent.IsEmpty.Bag(Size: 512)	1.29	7.57	5.87
System.Collections.Concurrent.IsEmpty.Queue(Size: 0)	1.22	2.31	1.90
System.Collections.Concurrent.IsEmpty.Dictionary(Size: 512)	1.19	3.11	2.61	bimodal
System.Threading.Tests.Perf_ThreadStatic.SetThreadStatic	1.18	2.20	1.86
System.Threading.Tasks.ValueTaskPerfTest.Copy_PassAsArgumentAndReturn_FromResult	1.16	2.36	2.03
System.Collections.Concurrent.IsEmpty.Bag(Size: 0)	1.16	7.17	6.18
System.Threading.Tests.Perf_SemaphoreSlim.ReleaseWait	1.11	35.54	31.97
System.Threading.Tasks.Tests.Perf_AsyncMethods.EmptyAsyncMethodInvocation	1.10	5.17	4.70
System.Threading.Tests.Perf_SemaphoreSlim.ReleaseWaitAsync_WithCancellationToken	1.09	1570.68	1435.42
System.Threading.Tests.Perf_Timer.SynchronousContention	1.08	3671500800.00	3396126000.00
System.Threading.Channels.Tests.BoundedChannelPerfTests.PingPong	1.08	13669746.15	12649100.00
System.Threading.Tests.Perf_Monitor.TryEnterExit	1.08	14.16	13.16
System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.TryWriteThenTryRea	1.07	21.01	19.56
System.Threading.Tasks.ValueTaskPerfTest.CreateAndAwait_FromResult	1.07	6.04	5.64
System.Threading.Channels.Tests.UnboundedChannelPerfTests.PingPong	1.07	12876008.33	12026150.00
System.Threading.Tasks.ValueTaskPerfTest.CreateAndAwait_FromCompletedValueTaskSo	1.06	14.69	13.84
System.Threading.Tests.Perf_Monitor.EnterExit	1.05	13.85	13.13
System.Collections.Concurrent.Count.Queue_EnqueueCountDequeue(Size: 512)	1.05	17.20	16.32
System.Collections.Concurrent.Count.Bag(Size: 512)	1.05	32.46	30.81
System.Collections.Concurrent.Count.Bag(Size: 512)	1.04	32.17	30.88
System.Threading.Tests.Perf_Timer.ScheduleManyThenDisposeMany	1.04	487751500.00	470460600.00
System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests.ReadAsyncThenWrite	1.03	61.28	59.30
System.Collections.Concurrent.Count.Stack(Size: 512)	1.03	478.25	464.98

The Task and ValueTask regressions seem unrelated to this change, and the only benchmarks that are consistently slower are the Channels ones.

I'm not sure if this is due to different core counts, the fact that MihuBot runs on cloud VMs, or the natural instability of multithreading benchmarks.
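
For readers who want to sanity-check the summary numbers, here is a minimal sketch of the ratio-and-threshold arithmetic, using the rows of the "Slower" table above. It is not the ResultsComparer implementation: the names `classify`, `geomean`, and `SLOWER_ROWS` are hypothetical, and the real tool may apply additional statistical filtering that this sketch leaves out. Benchmark names are copied as they appear in the table, including the truncated ones.

```python
import math

# (benchmark, base median ns, diff median ns), copied from the "Slower" table above.
SLOWER_ROWS = [
    ("Perf_AsyncMethods.Yield", 157.16, 224.16),
    ("UnboundedChannelPerfTests.TryWriteThenTryRead", 29.64, 33.64),
    ("AddRemoveFromDifferentThreads.ConcurrentSt", 89349750.00, 98731550.00),
    ("ValueTaskPerfTest.CreateAndAwait_FromResult_ConfigureAwai", 5.32, 5.70),
    ("Perf_Timer.AsynchronousContention", 4311523100.00, 4596094950.00),
    ("ValueTaskPerfTest.CreateAndAwait_FromCompletedValueTaskSo", 8.60, 9.14),
    ("BoundedChannelPerfTests.ReadAsyncThenWriteAsync", 60.48, 63.94),
    ("Perf_AsyncMethods.SingleYieldMethodInvocation", 496.56, 519.55),
    ("ValueTaskPerfTest.Await_FromCompletedTask", 12.26, 12.80),
    ("SpscUnboundedChannelPerfTests.WriteAsyncThenRead", 31.39, 32.50),
    ("BoundedChannelPerfTests.WriteAsyncThenReadAsync", 46.94, 48.23),
]

THRESHOLD = 0.02  # the 2% threshold mentioned above


def classify(base, diff, threshold=THRESHOLD):
    """Label a benchmark from its diff/base median ratio alone (assumed rule)."""
    ratio = diff / base
    if ratio > 1 + threshold:
        return "worse"
    if ratio < 1 - threshold:
        return "better"
    return "same"


def geomean(ratios):
    """Geometric mean: exp of the average of the logs."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Keep only the rows that exceed the threshold, then summarize them.
worse_ratios = [diff / base for _, base, diff in SLOWER_ROWS
                if classify(base, diff) == "worse"]
print(f"worse: {len(worse_ratios)}, geomean: {geomean(worse_ratios):.3f}")
```

All 11 rows clear the 2% threshold, and the geometric mean of their diff/base ratios comes out close to the 1.093 reported in the summary. The "Faster" table can be checked the same way with base/diff ratios.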

jkotas · 2025-09-14T19:07:45Z

/azp run runtime-nativeaot-outerloop, runtime-coreclr outerloop, runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

azure-pipelines · 2025-09-14T19:08:15Z

Azure Pipelines successfully started running 4 pipeline(s).

jkoritzinsky added this to the 11.0.0 milestone Aug 5, 2025

jkoritzinsky added the area-VM-coreclr label Aug 5, 2025

dotnet-policy-service bot assigned jkoritzinsky Aug 5, 2025

AaronRobinsonMSFT reviewed Aug 5, 2025

src/coreclr/vm/comsynchronizable.cpp (outdated, resolved)

jkotas reviewed Aug 8, 2025

src/coreclr/System.Private.CoreLib/src/System/Threading/SyncTable.CoreCLR.cs (outdated, resolved)

jkotas mentioned this pull request Aug 8, 2025

GC hangs indefinitely on LongRunning #115794

Open

1 task

jkotas reviewed Aug 15, 2025

src/coreclr/vm/comsynchronizable.cpp (outdated, resolved)

build-analysis bot mentioned this pull request Aug 16, 2025

System.IO.FileSystem.Tests.WorkItemExecution CI timeout on net9.0-osx-Debug-x64-Mono_Minijit_Debug-OSX.1200.Amd64.Open #101423

Open

jkoritzinsky added 15 commits August 21, 2025 08:53

WIP: Move Monitor impl from unmanaged code to managed in CoreCLR

945e059

Clean up the build

424d0bb

Unify Condition implementation

4d24c00

Fix Mono and CoreCLR builds

5c0fc6d

Assert index for sync block

1e75f44

More ifdefs for mono

fe8e4c4

Add back ExitIfLockTaken for JIT helper

8c0d783

Fix GC mode

1b4807b

Fix musl build and make gc modes explicit

ddffcac

Split GetLockObject's implementation into a fast FCall and a slow QCall

72d563f

Fix some asserts

bf4f5b0

Fix musl build

85b759f

FIx one more musl problem

1a9a833

Refactor out the object header spinlocking so we can use it on the thinlock for thin->fat lock upgrading

6a9b762

jkoritzinsky added 3 commits September 4, 2025 14:36

Don't force EnterSlow to not be inlined.

0bc0143

Manually inline away the *Slow variants

64510c6

Remove SyncTable class. It isn't worth having a separate type at this point

e16bee3

Fix naming issues

6bfc0b5

MihuBot mentioned this pull request Sep 4, 2025

[Benchmark X64] [jkoritzinsky] Move the implementation of Monitor to managed ... MihuBot/runtime-utils#1438

Open

jkoritzinsky added 2 commits September 5, 2025 13:13

Flip condition to skip out on the thin lock spin logic.

8255446

Don't try to do a quick-loop for release. Just go to the slow path like how it's done in main.

5bc026c

jkoritzinsky added 5 commits September 8, 2025 12:44

Rewrite IsEntered implementation to match how Enter and Exit are implemented.

02a6a06

Actually fix the DBI asserts

6eaa48f

Try-Enter with a timeout should do the slow path if there's a timeout

d3ac8bd

Check for syncblk index first. Thread-id is only valid if this bit is false

5211992

Unify enum definitions

21bd0c3

MihuBot mentioned this pull request Sep 10, 2025

[Benchmark X64] [jkoritzinsky] Move the implementation of Monitor to managed ... MihuBot/runtime-utils#1462

Open

MihuBot mentioned this pull request Sep 10, 2025

[Benchmark X64] [jkoritzinsky] Move the implementation of Monitor to managed ... MihuBot/runtime-utils#1464

Open

jkoritzinsky removed the blocked Issue/PR is blocked on something - see comments label Sep 10, 2025

jkoritzinsky requested a review from davidwrighton September 10, 2025 23:14

build-analysis bot mentioned this pull request Sep 15, 2025

[NativeAOT] Test_GetTotalAllocatedBytes test fails with OutOfMemoryException #119187

Open
