Use unbounded SemaphoreSlim #12443
base: main
Conversation
Pull Request Overview
This PR replaces the race-prone pattern of a bounded SemaphoreSlim guarded by count checking with an unbounded SemaphoreSlim in the NodeContext packet queue system. The change addresses thread-safety issues that could occur when multiple threads attempt to enqueue packets simultaneously, particularly on ARM64 architectures with weaker memory models.
Key changes (a minimal sketch of the resulting pattern follows this list):
- Replace AutoResetEvent and CancellationTokenSource with an unbounded SemaphoreSlim for packet queue signaling
- Convert packet draining from a dedicated thread to an async Task-based approach
- Add memory barriers around packet enqueue operations to ensure proper ordering
- Refactor shutdown logic to use proper resource disposal patterns
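As a rough sketch of the pattern the PR moves toward (the class, field, and method names below are illustrative, not the actual NodeContext members): a ConcurrentQueue paired with an unbounded SemaphoreSlim, where each enqueue is followed by one Release() and the async drain task performs one WaitAsync() per packet.

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Illustrative sketch only; names do not match the real NodeContext members.
class PacketPump
{
    private readonly ConcurrentQueue<object> _packetQueue = new();
    // No maxCount is specified, so Release() can never throw SemaphoreFullException.
    private readonly SemaphoreSlim _packetEnqueued = new(initialCount: 0);

    public void SendData(object packet)
    {
        _packetQueue.Enqueue(packet);
        _packetEnqueued.Release();   // one release per enqueued packet
    }

    public async Task DrainPacketQueueAsync(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            await _packetEnqueued.WaitAsync(token);   // one successful wait per release
            if (_packetQueue.TryDequeue(out var packet))
            {
                // Write the packet to the node's stream here.
            }
        }
    }
}
```

The key property is that Release() is unconditional, so no thread ever needs to inspect CurrentCount before signaling.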
@@ -789,30 +787,33 @@ public void SendData(INodePacket packet)
{
    _exitPacketState = ExitPacketState.ExitPacketQueued;
}

Thread.MemoryBarrier();
Thread.MemoryBarrier() creates a full memory fence which can be expensive. Consider using Interlocked operations or volatile semantics if specific ordering guarantees are needed, as the .NET memory model already provides sufficient guarantees for most scenarios.
Copilot generated this review using guidance from repository custom instructions.
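As a hedged illustration of what the reviewer is suggesting (the fields below are hypothetical, not taken from the diff), a one-way volatile write or an interlocked update can often stand in for a full fence:

```csharp
using System.Threading;

// Hypothetical fields, used only to illustrate the alternatives mentioned above.
class FenceExamples
{
    private bool _exitPacketQueued;
    private int _queuedPacketCount;

    public void Signal()
    {
        // Full two-way fence: no read or write may move across it in either direction.
        Thread.MemoryBarrier();

        // Release-style alternative: writes before this store cannot be reordered after it.
        Volatile.Write(ref _exitPacketQueued, true);

        // Interlocked operations also act as full fences while updating a single variable.
        Interlocked.Increment(ref _queuedPacketCount);
    }
}
```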
_packetWriteQueue.Enqueue(packet);
_packetEnqueued.Set();

Thread.MemoryBarrier();
Second Thread.MemoryBarrier() call may be unnecessary. The SemaphoreSlim.Release() method already provides appropriate memory ordering guarantees for signaling the waiting thread.
using (var cts = new CancellationTokenSource(100))
{
    await Task.WhenAny(_packetWriteDrainTask, Task.Delay(100, cts.Token));
    cts.Cancel();
Calling cts.Cancel() after the Task.WhenAny completes is unnecessary and potentially problematic. The CancellationTokenSource is already disposed in the using block, and cancellation should only be called if you need to interrupt the delay task early.
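A minimal sketch of the simpler shape the reviewer describes, assuming the 100 ms bound from the snippet above (the method and parameter names are illustrative):

```csharp
using System.Threading.Tasks;

// Illustrative only: wait up to 100 ms for the drain task, then continue either way.
static class DrainHelper
{
    public static async Task WaitForDrainAsync(Task packetWriteDrainTask)
    {
        // Task.Delay with a plain timeout needs no CancellationTokenSource; if the drain
        // task finishes first, the short-lived timer simply completes on its own later.
        await Task.WhenAny(packetWriteDrainTask, Task.Delay(100));
    }
}
```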
if (packet is NodeBuildComplete)
{
    return;
}
This condition duplicates exit logic that's already handled by IsExitPacket(). Consider consolidating the exit conditions or clearly documenting why NodeBuildComplete requires separate handling from other exit packets.
Suggested change:
if (packet is NodeBuildComplete)
{
    return;
}
// If NodeBuildComplete packets with PrepareForReuse == true require special handling,
// document the reason here. Otherwise, rely on IsExitPacket for all exit logic.
This is a non-trivial change and we already got burned. We will dig into it deeper.
Fixes #
Context
This is a modification of #12113 that uses a SemaphoreSlim without a cap on the number of times it can be released. The previous code checked the CurrentCount property to decide whether Release() needed to be called, because calling Release() on a semaphore with a maximum count throws an exception if the count is already at that maximum. However, this pattern has a potential race condition if more than one thread tries to enqueue a packet at once: multiple threads can read 0 from CurrentCount and then each try to call Release(), which causes an exception to be thrown. Additionally, the ARM64 architecture has a weaker memory model than x86/x64, which can lead to unintended behavior. The .NET runtime maintains its own memory model rules that ensure correctness across platforms; of note in this case is the section about the order of memory operations:

"The effects of ordinary reads and writes can be reordered as long as that preserves single-thread consistency."

Since the runtime only needs to guarantee single-thread consistency, a read of something like CurrentCount could be moved before the packet is enqueued, and this could lead to DrainPacketQueue never being signaled.
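A sketch of the race described above, assuming the old code gated Release() on CurrentCount against a bounded semaphore (member names are illustrative, not the actual old code):

```csharp
using System.Threading;

// Illustrative reconstruction of the racy pattern described above.
class BoundedSignal
{
    private readonly SemaphoreSlim _packetEnqueued = new(initialCount: 0, maxCount: 1);

    public void SignalPacketEnqueued()
    {
        // Two threads can both observe CurrentCount == 0 here...
        if (_packetEnqueued.CurrentCount == 0)
        {
            // ...and then both call Release(); the second call pushes the count past
            // maxCount and throws SemaphoreFullException.
            _packetEnqueued.Release();
        }
    }
}
```

Dropping maxCount (and the CurrentCount check) makes Release() unconditional, which removes both the exception path and the reordering concern around the count read.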
Changes Made
Testing
Notes