Description
Version Information
Version of Akka.NET? v1.4.28
Which Akka.NET Modules? Akka, Akka.Remote
Describe the performance issue
Our RemotePingPong benchmark has been the standard used for 7 years or so to measure throughput passing over a single Akka.Remote connection between two ActorSystem instances. It's a crucial benchmark because it measures the biggest bottleneck in Akka.Remote networks: the end to end response time over a single connection.
Over the lifespan of .NET Core, since 2017, we've seen steady improvements in the benchmark numbers each time a new version of the .NET runtime is released, usually as a result of improvements to the underlying threading / concurrency / IO primitives in the runtime itself.
With the release of .NET 6, however, we've noticed that while overall throughput remains higher than on .NET 5 in some measures, for these same reasons, there are also steady, reproducible, long-lasting drops in total throughput that occur only on .NET 6.
Data and Specs
Here are the RemotePingPong numbers from my local development machine, a Gen 1 8-core Ryzen, on .NET Core 3.1:
Edit: updated the .NET Core 3.1 benchmark numbers to include the settings from #5386
And here are the equivalent numbers for this same benchmark on .NET 6:
I've been able to reproduce this consistently - a sustained drop in throughput that lasts for roughly 30s. We've also noticed this in the Akka.NET test suite since merging in #5373 - the number of failures in the test suite has grown and has started to include tests that historically have not been racy. We've also observed this separately in the Phobos repository which we also upgraded to use the .NET 6 SDK.
There is definitely something amiss here with how Akka.NET runs on top of .NET 6.
Expected behavior
A consistent level of performance across all benchmarks.
Actual behavior
Intermittent lag, declines in throughput, and unexplained novel race conditions.
Environment
.NET 6, Windows
Additional context
There is some speculation from other members of the Akka.NET team that the issue could be related to some of the .NET ThreadPool and thread injection changes made in .NET 6:
- Improve the rate of thread injection for blocking due to sync-over-async dotnet/runtime#53471
- Enable the portable thread pool by default in coreclr dotnet/runtime#43841
- And a performance regression that was reported and later fixed, pertaining to the new managed thread pool implementation: [Perf] Widespread regressions after threadpool change dotnet/runtime#44211
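
One way to test the thread pool speculation above would be to rerun RemotePingPong on .NET 6 with the portable thread pool switched off and see whether the sustained throughput drops disappear. The fragment below is a minimal sketch, assuming the benchmark project picks up a `runtimeconfig.template.json` file and that the `System.Threading.ThreadPool.UsePortableThreadPool` knob introduced alongside dotnet/runtime#43841 is honored by the runtime in use; neither has been verified against this repository.

```json
{
  "configProperties": {
    "System.Threading.ThreadPool.UsePortableThreadPool": false
  }
}
```

If the drops persist with this setting, the portable thread pool is less likely to be the cause, and the thread injection changes or something outside the thread pool would be the next thing to look at.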