Skip to content

Conversation

@zuston
Copy link
Member

@zuston zuston commented Jan 18, 2024

What changes were proposed in this pull request?

introduce the JvmPauseMonitor to detect the gc pause

Why are the changes needed?

Fix: #1466

We have some loop check logic in shuffle-server, sometimes the GC pause will make some pre-allocated buffer removed.
If we have this monitor, we will find out this reason rather than guess. Although the JVM metrics are also valid.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests

@zuston zuston requested a review from jerqi January 18, 2024 07:23
@codecov-commenter
Copy link

codecov-commenter commented Jan 18, 2024

Codecov Report

Attention: 81 lines in your changes are missing coverage. Please review.

Comparison is base (25d0585) 54.12% compared to head (918c3cd) 54.91%.

Files Patch % Lines
...rg/apache/uniffle/common/util/JvmPauseMonitor.java 0.00% 80 Missing ⚠️
.../java/org/apache/uniffle/server/ShuffleServer.java 80.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #1470      +/-   ##
============================================
+ Coverage     54.12%   54.91%   +0.79%     
  Complexity     2773     2773              
============================================
  Files           423      404      -19     
  Lines         24215    21939    -2276     
  Branches       2062     2071       +9     
============================================
- Hits          13106    12048    -1058     
+ Misses        10296     9148    -1148     
+ Partials        813      743      -70     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jerqi
Copy link
Contributor

jerqi commented Jan 18, 2024

Why do we need this after we already have JVM metrics?

@zuston
Copy link
Member Author

zuston commented Jan 18, 2024

Why do we need this after we already have JVM metrics?

Sometimes the metric system is not enabled. BTW this class is referred from Hadoop https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/JvmPauseMonitor.java

Copy link
Contributor

@jerqi jerqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @zuston

@zuston zuston merged commit 6f538a7 into apache:master Jan 19, 2024
zuston added a commit to zuston/incubator-uniffle that referenced this pull request Mar 19, 2024
…he gc pause (apache#1470)

### What changes were proposed in this pull request?

Introduce the JvmPauseMonitor to detect the gc pause

### Why are the changes needed?

Fix: apache#1466 

We have some loop check logic in shuffle-server, 
sometimes the GC pause will make some pre-allocated buffer  removed.
If we have this monitor, we will find out this reason rather than guess. 
Although the JVM metrics are also valid.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Introduce GC pause monitor

3 participants