Description
Scope: Applicable with and without the "Advanced Replica Selection" feature.
Problem: One of our recent conversations with the IC3 team helped us unveil a potential issue with the RNTBD connection creation and management flow in the LoadBalancingPartition. The team is using the .NET SDK version 3.39.0-preview to connect to Cosmos DB. They recently encountered a potential memory leak in some of their clusters, and upon investigation it appeared that one of the root causes is the underlying CosmosClient keeping a high number of unhealthy and unused LbChannelStates.
In a nutshell, below are a few of the account configurations and facts:
• account-1: 9 partitions with 1 unique tenant. There are approximately 4 to 8 clients for this tenant (the client count is 2 × the number of replica regions). Connection warm-up is enabled on this account.
• account-2: 2,592 partitions with 249 tenants/feds. Connections created in the happy-path scenario: 249 × Y (Y = number of active clients for that account). Connection warm-up is disabled on this account.
• account-3: 27 partitions with 13 tenants/feds. Connection warm-up (via CreateAndInitialize) is enabled on this account.
To understand this in more detail, please take a look at the memory dump below.
[Fig-1: The above figure shows a snapshot of the memory dump taken for multiple accounts. It also unveils the potential memory leak caused by the unhealthy connections.]
Upon further analysis of the memory dump, it is clear that:
- The number of stale unhealthy connections is higher in the accounts where replica validation is enabled along with connection warm-up.
- Without connection warm-up, the number of stale unhealthy connections is comparatively lower, but still high enough to increase the memory footprint.
[Fig-2: The above figure shows how the memory footprint increased over time along with incoming requests. The service eventually had to be restarted to free up the memory.]
- Even without the replica validation feature, the memory footprint showed a consistent increase over time.
[Fig-3: The above figure shows the memory consumption of the IC3 partner-api service, which uses an older version (v3.25.0) of the .NET SDK; its memory consumption kept increasing over time.]
Analysis: Upon further digging into the memory dump and reproducing the scenario locally, it was noted that:
- With Replica Validation Enabled: Each impacted LoadBalancingPartition was holding more than 1 unhealthy stale LbChannelState (which is a wrapper around the Dispatcher and a Channel) when the connection to the backend replica was closed deterministically.
- With Replica Validation Disabled: Each impacted LoadBalancingPartition was holding exactly 1 unhealthy stale LbChannelState (which is a wrapper around the Dispatcher and a Channel) when the connection to the backend replica was closed deterministically.
Let's take a look at the below diagram to understand this in more detail:
[Fig-4: The above figure shows an instance of the LoadBalancingPartition holding more than one entry of unhealthy LbChannelState.]
By looking at the above memory dump snapshot, it is clear that these stale LbChannelState entries are kept in the LoadBalancingPartition until they are removed from the openChannels list, which maintains the set of channels (healthy or unhealthy) for that particular endpoint. If they are not cleaned up proactively (which is exactly the case here), they end up claiming extra memory. As the number of partitions and connections grows over time, things get worse, with all these unused yet lingering LbChannelStates claiming more and more memory, resulting in a memory leak. This is the potential root cause of the increased memory consumption.
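The accumulation described above can be modeled with a minimal sketch. Note that this is purely illustrative: the real SDK internals are C#, and the names below (LoadBalancingPartition, LbChannelState, open_channels) only mirror them for clarity.

```python
class LbChannelState:
    """Stand-in for the SDK's wrapper around a channel to a replica endpoint."""
    def __init__(self, endpoint):
        self.endpoint = endpoint
        self.healthy = True  # set to False once the backend closes the connection

class LoadBalancingPartition:
    """Tracks every channel (healthy or unhealthy) opened to one endpoint."""
    def __init__(self, endpoint):
        self.endpoint = endpoint
        self.open_channels = []

    def open_channel(self):
        channel = LbChannelState(self.endpoint)
        self.open_channels.append(channel)
        return channel

# Each replica-validation cycle opens a fresh connection; the backend later
# closes it for idleness, but nothing ever prunes the stale entry.
partition = LoadBalancingPartition("rntbd://replica-1")
for _ in range(5):
    channel = partition.open_channel()
    channel.healthy = False  # backend closed the idle connection

stale = [c for c in partition.open_channels if not c.healthy]
print(len(partition.open_channels), len(stale))  # → 5 5
```

After five cycles the partition still holds all five dead channel states, which is exactly the growth pattern visible in the memory dumps.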
Proposed Solution :
There are a few changes proposed to fix this scenario. They are discussed briefly in the section below:
- During the replica validation phase, in OpenConnectionAsync(), proactively remove all the unhealthy connections from the openChannels list within the LoadBalancingPartition. This guarantees that any unhealthy LbChannelStates will be removed from the LoadBalancingPartition, freeing up the additional memory.
- Yet to be identified: figure out ways to avoid opening duplicate connections to the same endpoint multiple times.
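As a rough illustration of the first proposed change, the sketch below prunes unhealthy channel states before opening a new connection. Again, these are hypothetical Python stand-ins for the C# internals, and OpenConnectionAsync is modeled as a plain function.

```python
class LbChannelState:
    """Illustrative stand-in for the SDK's channel-state wrapper."""
    def __init__(self, endpoint):
        self.endpoint = endpoint
        self.healthy = True

class LoadBalancingPartition:
    """Illustrative stand-in; tracks channels opened to one endpoint."""
    def __init__(self, endpoint):
        self.endpoint = endpoint
        self.open_channels = []

def open_connection(partition):
    """Model of the proposed OpenConnectionAsync() behavior: drop every
    unhealthy LbChannelState before opening a new channel, so stale
    entries can no longer accumulate."""
    partition.open_channels = [c for c in partition.open_channels if c.healthy]
    channel = LbChannelState(partition.endpoint)
    partition.open_channels.append(channel)
    return channel

# Simulate repeated validation cycles where the backend closes each
# connection for idleness before the next cycle runs.
partition = LoadBalancingPartition("rntbd://replica-1")
for _ in range(5):
    open_connection(partition).healthy = False

# Only the most recent channel remains instead of five stale ones.
print(len(partition.open_channels))  # → 1
```

With the pruning in place, the partition's footprint stays bounded by the number of live channels rather than growing with every closed connection.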
FAQs:
- Is this applicable only to the newer versions of the SDK?
Ans: No, the scenario can happen with older versions of the SDK too. As discussed in the sections above, the root cause of this problem lies in the connection management and clean-up. It is observed from the memory dump that, with older versions of the SDK, each impacted LoadBalancingPartition holds exactly one stale connection. Thus, with an increasing number of connections, memory utilization can grow due to these unused stale connections staying in memory.
- Does enabling "Advanced Replica Selection" make the memory consumption worse?
Ans: Yes, it does. The advanced replica selection feature keeps track of any unhealthy replica that had a connectivity issue and temporarily quarantines it, so that incoming requests have a higher chance of landing on a healthy replica. Additionally, the feature validates the unhealthy replica by proactively opening a connection to check whether the replica has come back up. This helps dramatically reduce the latency for read workloads when a replica undergoes upgrades, etc. However, these proactively opened connections can potentially increase the number of stale connections when a connection is later closed by the backend for idleness. This has a larger impact on increasing the memory footprint today.