-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Optimize DistributedPubSub memory allocation #7642
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize DistributedPubSub memory allocation #7642
Conversation
Comparing the DPA recording of We can see that Note that this is not a 100% fix, some of the old memory allocation has been shifted to |
private static readonly Dictionary<string, string> TopicToEncodedMap = new(); | ||
private static readonly Dictionary<string, string> EncodedToTopicMap = new(); | ||
private static readonly Dictionary<MakeKeyInfo, string> MakeKeyMap = new(); | ||
private static readonly Dictionary<string, MakeKeyInfo> MakeKeyReverseMap = new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This strikes me as ever-so-slightly unsafe, in that if there are multiple ActorSystem
s running in the same app domain, you could have those systems callers potentially mutating a dictionary while another is trying to read...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right, here's an option I thought of:
- We can use
ImmutableDictionary
, but then we'll lose the memory allocation improvement - Another one is
ConcurrentDictionary
, but then we'll lose the performance improvement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though let me think about this a bit, this might actually be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, after discussing it with @Aaronontheweb, I'm going to move this as an actor non-static field instead, moving it to the Mediator and/or Topic/TopicLike class.
/// </summary> | ||
internal static class Utils | ||
{ | ||
private record MakeKeyInfo(ActorPath Path, string Topic); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stupid Question, have we checked this vs a private record readonly struct
as far as allocations vs performance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, I'll give that a go 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the final benchmark for this PR: OSVersion: Microsoft Windows NT 6.2.9200.0 dev
Original PR
readonly record struct
non-concurrent non-static cache (final)
|
/// <param name="path">TBD</param> | ||
/// <returns>TBD</returns> | ||
public static string MakeKey(ActorPath path) | ||
public string MakeKey(ActorPath path, string topic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could still make this static and do inlining, btw - just refactor these to be extension methods instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to avoid extension methods like a plague since it requires the outermost class declaration to be public static, didn't want to pollute the public API with internal API methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, I see what you're saying, that is a good idea
Latest benchmark and DPA report comparison:
dev:final PR: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This change also results in a slight bump of performance. Note that, although this reduces total memory allocation (Small Object Heap), it actually increases memory usage because it caches objects in memory.
This is the main 2 issues this PR trying to address:

ActorPath.Join()
ActorPath.op_Division()
Note
Although the total bytes reported in this PR is quite scary (in the Gb and Mb range), note that this number represent worst case accumulated memory allocation of a stress benchmark application where DistributedPubSub is being subjected to a burst of more than 20,000,000 messages in a very short period of time.
These are not memory leak, all of the allocated memories are reclaimed during GC.
Changes
ActorPath
String.Join()
and immutable operationsChecklist
For significant changes, please ensure that the following have been completed (delete if not relevant):
Latest
dev
BenchmarksOSVersion: Microsoft Windows NT 6.2.9200.0
ProcessorCount: 24
ClockSpeed: 0 MHZ
Actor Count: 48
Messages sent/received per client: 200000 (2e5)
Is Server GC: True
Thread count: 55
This PR's Benchmarks
OSVersion: Microsoft Windows NT 6.2.9200.0
ProcessorCount: 24
ClockSpeed: 0 MHZ
Actor Count: 48
Messages sent/received per client: 200000 (2e5)
Is Server GC: True
Thread count: 53