@Arkatufus Arkatufus commented May 14, 2025

This change also gives a slight performance bump. Note that although it reduces total memory allocation (Small Object Heap), it actually increases memory usage, because the cached objects stay in memory.

These are the two main issues this PR is trying to address:
[screenshot: Original problem]

  1. The recorded 1.1 GB total SOH allocation by ActorPath.Join()
  2. The recorded 575 MB total SOH allocation by ActorPath.op_Division()

Note

Although the total bytes reported in this PR look quite scary (in the GB and MB range), note that these numbers represent the worst-case accumulated memory allocation of a stress benchmark application in which DistributedPubSub is subjected to a burst of more than 20,000,000 messages in a very short period of time.

These are not memory leaks; all of the allocated memory is reclaimed during GC.

Changes

  • Cache RegEx pattern string replace results
  • Cache non-performant ActorPath String.Join() and immutable operations
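
A minimal sketch of the first change, assuming the encoding step is a `Regex.Replace` over the topic string. `TopicEncoder`, its pattern, and `Encode` are hypothetical names for illustration, not the actual Akka.NET code. The cache trades retained memory for fewer allocations, which matches the trade-off described at the top of this PR.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: memoizes the result of a Regex.Replace call per input string.
// Not thread-safe; meant to be owned by a single actor.
public sealed class TopicEncoder
{
    // Assumed pattern: any character outside a small safe set is replaced.
    private static readonly Regex Unsafe = new Regex("[^A-Za-z0-9_-]", RegexOptions.Compiled);

    private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();

    public int CacheSize => _cache.Count;

    public string Encode(string topic)
    {
        if (_cache.TryGetValue(topic, out var encoded))
            return encoded; // cache hit: no regex work and no new string

        encoded = Unsafe.Replace(topic, "_");
        _cache[topic] = encoded;
        return encoded;
    }
}
```

Repeated calls with the same topic return the same cached string instance instead of producing a new one each time.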

Latest dev Benchmarks

OSVersion: Microsoft Windows NT 6.2.9200.0
ProcessorCount: 24
ClockSpeed: 0 MHZ
Actor Count: 48
Messages sent/received per client: 200000 (2e5)
Is Server GC: True
Thread count: 55

| Num clients | Total [msg] | Msgs/sec | Total [ms] | Start Threads | End Threads |
|---|---|---|---|---|---|
| 1 | 200000 | 404041 | 495.04 | 55 | 67 |
| 5 | 1000000 | 401768 | 2489.68 | 72 | 88 |
| 10 | 2000000 | 413053 | 4842.76 | 92 | 119 |
| 15 | 3000000 | 425532 | 7050.14 | 123 | 118 |
| 20 | 4000000 | 433042 | 9237.34 | 123 | 115 |
| 25 | 5000000 | 417816 | 11967.61 | 119 | 114 |
| 30 | 6000000 | 429892 | 13957.77 | 119 | 115 |

This PR's Benchmarks

OSVersion: Microsoft Windows NT 6.2.9200.0
ProcessorCount: 24
ClockSpeed: 0 MHZ
Actor Count: 48
Messages sent/received per client: 200000 (2e5)
Is Server GC: True
Thread count: 53

| Num clients | Total [msg] | Msgs/sec | Total [ms] | Start Threads | End Threads |
|---|---|---|---|---|---|
| 1 | 200000 | 505051 | 396.67 | 53 | 67 |
| 5 | 1000000 | 494560 | 2022.03 | 72 | 76 |
| 10 | 2000000 | 479847 | 4168.64 | 80 | 88 |
| 15 | 3000000 | 477403 | 6284.04 | 92 | 123 |
| 20 | 4000000 | 469374 | 8522.71 | 123 | 115 |
| 25 | 5000000 | 459348 | 10885.07 | 119 | 114 |
| 30 | 6000000 | 466636 | 12858.17 | 119 | 115 |
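
To put the two runs side by side, the relative throughput gain can be computed per row from the Msgs/sec column. This is only a sketch; `Throughput.GainPercent` is a hypothetical helper, and a single benchmark run is noisy, so the result is indicative rather than precise.

```csharp
// Relative throughput gain of a candidate run over a baseline run, in percent.
public static class Throughput
{
    public static double GainPercent(double baselineMsgsPerSec, double candidateMsgsPerSec)
        => (candidateMsgsPerSec / baselineMsgsPerSec - 1.0) * 100.0;
}
```

With the dev and PR figures reported here, `Throughput.GainPercent(404041, 505051)` comes out at roughly 25% for 1 client, and `Throughput.GainPercent(429892, 466636)` at roughly 8.5% for 30 clients.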

Arkatufus commented May 14, 2025

Comparing the DPA recording of dev and this PR:

DEV: [screenshot: Original problem]

PR: [screenshot: After Changes]

We can see that String.Join() and ActorPath.op_Division() no longer show up.

Note that this is not a 100% fix; some of the old memory allocation has shifted to Utils.MakeKey() instead (unavoidable). Compared with the 1.6 GB total SOH allocated by String.Join() and ActorPath.op_Division(), the 325 MB SOH allocation of Utils.MakeKey() is a lot more acceptable.

private static readonly Dictionary<string, string> TopicToEncodedMap = new();
private static readonly Dictionary<string, string> EncodedToTopicMap = new();
private static readonly Dictionary<MakeKeyInfo, string> MakeKeyMap = new();
private static readonly Dictionary<string, MakeKeyInfo> MakeKeyReverseMap = new();
Member

This strikes me as ever-so-slightly unsafe: if there are multiple ActorSystems running in the same app domain, callers from one system could be mutating a dictionary while another is trying to read it...

Contributor Author

I think you're right. Here are the options I thought of:

  • We can use ImmutableDictionary, but then we'll lose the memory allocation improvement
  • Another is ConcurrentDictionary, but then we'll lose the performance improvement
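
For reference, a minimal sketch of what the ConcurrentDictionary option could look like. `SharedKeyCache` and `GetOrCreate` are hypothetical names; the sketch shows the thread-safe shape only and says nothing about the measured cost of the extra synchronization.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical shared static cache: safe for concurrent readers and writers.
public static class SharedKeyCache
{
    private static readonly ConcurrentDictionary<string, string> Cache =
        new ConcurrentDictionary<string, string>();

    // GetOrAdd may invoke the factory more than once under contention,
    // but only one value is stored and returned to all callers.
    public static string GetOrCreate(string key, Func<string, string> factory)
        => Cache.GetOrAdd(key, factory);
}
```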

Contributor Author

Though let me think about this a bit, this might actually be safe.

Contributor Author

On second thought, after discussing it with @Aaronontheweb, I'm going to make this a non-static actor field instead, moving it into the Mediator and/or Topic/TopicLike class.
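
A rough sketch of the instance-field approach, using a plain class as a stand-in for the actor. `TopicState` and `ChildPath` are hypothetical names. The assumption is that each actor owns its cache and handles one message at a time, so a plain `Dictionary` needs no locking and nothing is shared between ActorSystems.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for an actor: the cache is an instance field,
// so two instances never read or write the same dictionary.
public sealed class TopicState
{
    private readonly string _parentPath;
    private readonly Dictionary<string, string> _childPaths = new();

    public TopicState(string parentPath) => _parentPath = parentPath;

    public int CachedPaths => _childPaths.Count;

    public string ChildPath(string name)
    {
        if (!_childPaths.TryGetValue(name, out var path))
        {
            path = _parentPath + "/" + name; // built once per distinct name
            _childPaths[name] = path;
        }
        return path;
    }
}
```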

/// </summary>
internal static class Utils
{
private record MakeKeyInfo(ActorPath Path, string Topic);
Member

Stupid question: have we checked this against a private readonly record struct as far as allocations vs. performance?

Contributor Author

Not yet, I'll give that a go 👍

Contributor Author

@Arkatufus Arkatufus May 15, 2025

Looks like converting this to private readonly record struct fixes the MakeKey() SOH memory allocation problem 👍

[screenshot: readonly record struct]
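
A minimal sketch of why the struct version helps, assuming the key is only used for dictionary lookups. `KeyInfo` and `KeyCache` are hypothetical stand-ins, with `string` used in place of `ActorPath` to keep the example self-contained, and the `path + "/" + topic` key format is made up for illustration. A `record` is a reference type, so every lookup key is a heap allocation; a `readonly record struct` is a value type with compiler-generated value equality, so building the lookup key allocates nothing on the heap.

```csharp
using System.Collections.Generic;

// Value-type key with compiler-generated Equals/GetHashCode.
public readonly record struct KeyInfo(string Path, string Topic);

public sealed class KeyCache
{
    private readonly Dictionary<KeyInfo, string> _keys = new();

    public string MakeKey(string path, string topic)
    {
        var info = new KeyInfo(path, topic); // no heap allocation for the key itself
        if (!_keys.TryGetValue(info, out var key))
        {
            key = path + "/" + topic; // hypothetical key format, built once per pair
            _keys[info] = key;
        }
        return key;
    }
}
```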

@Arkatufus
Contributor Author

Here's the final benchmark for this PR:

OSVersion: Microsoft Windows NT 6.2.9200.0
ProcessorCount: 24
ClockSpeed: 0 MHZ
Actor Count: 48
Messages sent/received per client: 200000 (2e5)
Is Server GC: True
Thread count: 55

dev

| Num clients | Total [msg] | Msgs/sec | Total [ms] | Start Threads | End Threads |
|---|---|---|---|---|---|
| 1 | 200000 | 628931 | 318.97 | 55 | 60 |
| 5 | 1000000 | 489237 | 2044.28 | 65 | 76 |
| 10 | 2000000 | 469264 | 4262.58 | 80 | 83 |
| 15 | 3000000 | 461326 | 6503.55 | 87 | 111 |
| 20 | 4000000 | 463554 | 8629.79 | 111 | 103 |
| 25 | 5000000 | 476690 | 10489.83 | 107 | 102 |
| 30 | 6000000 | 456032 | 13157.05 | 107 | 103 |

Original PR

| Num clients | Total [msg] | Msgs/sec | Total [ms] | Start Threads | End Threads |
|---|---|---|---|---|---|
| 1 | 200000 | 649351 | 308.97 | 56 | 62 |
| 5 | 1000000 | 535332 | 1868.81 | 67 | 76 |
| 10 | 2000000 | 523287 | 3822.25 | 80 | 84 |
| 15 | 3000000 | 508906 | 5895.20 | 88 | 112 |
| 20 | 4000000 | 496155 | 8062.91 | 116 | 107 |
| 25 | 5000000 | 505051 | 9900.97 | 112 | 105 |
| 30 | 6000000 | 489796 | 12250.96 | 109 | 104 |

readonly record struct

| Num clients | Total [msg] | Msgs/sec | Total [ms] | Start Threads | End Threads |
|---|---|---|---|---|---|
| 1 | 200000 | 689656 | 290.12 | 53 | 59 |
| 5 | 1000000 | 539666 | 1853.97 | 64 | 77 |
| 10 | 2000000 | 543331 | 3681.97 | 81 | 83 |
| 15 | 3000000 | 519481 | 5775.60 | 87 | 113 |
| 20 | 4000000 | 519751 | 7696.44 | 117 | 110 |
| 25 | 5000000 | 525763 | 9510.34 | 115 | 107 |
| 30 | 6000000 | 500793 | 11981.83 | 111 | 106 |

non-concurrent non-static cache (final)

| Num clients | Total [msg] | Msgs/sec | Total [ms] | Start Threads | End Threads |
|---|---|---|---|---|---|
| 1 | 200000 | 598803 | 334.60 | 54 | 58 |
| 5 | 1000000 | 523561 | 1910.88 | 63 | 76 |
| 10 | 2000000 | 508647 | 3932.46 | 80 | 82 |
| 15 | 3000000 | 505391 | 5936.83 | 86 | 110 |
| 20 | 4000000 | 521309 | 7673.12 | 114 | 106 |
| 25 | 5000000 | 492320 | 10156.61 | 111 | 103 |
| 30 | 6000000 | 503694 | 11912.79 | 107 | 102 |

/// <param name="path">TBD</param>
/// <returns>TBD</returns>
-public static string MakeKey(ActorPath path)
+public string MakeKey(ActorPath path, string topic)
Member

You could still make this static and do inlining, btw - just refactor these to be extension methods instead.

Contributor Author

@Arkatufus Arkatufus May 15, 2025

I'm trying to avoid extension methods like the plague since they require the outermost class declaration to be public static; I didn't want to pollute the public API with internal API methods.

Contributor Author

@Arkatufus Arkatufus May 15, 2025

Never mind, I see what you're saying; that is a good idea.
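
A small sketch of the suggestion, under the assumption that the helper stays internal. In C#, extension methods must live in a non-nested, non-generic static class, but that class does not have to be public, so an `internal static class` keeps them out of the public API. `PathExtensions` and the key format are hypothetical, and `string` stands in for `ActorPath`.

```csharp
using System.Runtime.CompilerServices;

// Internal static class: the extension method is visible inside the assembly only.
internal static class PathExtensions
{
    // AggressiveInlining is a hint to the JIT, not a guarantee.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    internal static string MakeKey(this string path, string topic)
        => path + "/" + topic;
}
```

Call sites can then write `path.MakeKey(topic)` while the method itself remains static.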

@Arkatufus
Contributor Author

Latest benchmark and DPA report comparison:

| Num clients | Total [msg] | Msgs/sec | Total [ms] | Start Threads | End Threads |
|---|---|---|---|---|---|
| 1 | 200000 | 687286 | 291.60 | 54 | 58 |
| 5 | 1000000 | 515199 | 1941.54 | 63 | 79 |
| 10 | 2000000 | 517197 | 3867.96 | 83 | 84 |
| 15 | 3000000 | 511859 | 5861.68 | 88 | 112 |
| 20 | 4000000 | 526109 | 7603.70 | 116 | 109 |
| 25 | 5000000 | 504185 | 9917.22 | 114 | 106 |
| 30 | 6000000 | 503356 | 11920.60 | 110 | 105 |

dev: [screenshot: Original problem]

final PR: [screenshot: Final PR]

Member

@Aaronontheweb Aaronontheweb left a comment

LGTM

This was referenced Oct 7, 2025