-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Closed
Description
Issue Summary
The safeUpdate() method in HealthCheckImpl has a bug where it notifies listeners of status changes even when the actual stored state is not updated, leading to false failover triggers and HEALTHY/UNHEALTHY state toggling.
Root Cause Analysis
Bug
- Scenario: An older health check result arrives after a newer one has already been stored
- Expected Behavior: State should NOT be updated (timestamp check prevents this )
- Listeners should NOT be notified (no actual state change occurred)
- Actual Behavior:
- State is correctly NOT updated
- But listeners are incorrectly notified because oldResult.getStatus() != status compares the stored status against the rejected input status
Example Scenario
Time: T1 - Health check starts (timestamp: 1000)
Time: T2 - Health check starts (timestamp: 2000)
Time: T3 - T2 completes first: HEALTHY → stored (timestamp: 2000)
Time: T4 - T1 completes later: UNHEALTHY → rejected (older timestamp)
- But listeners notified: HEALTHY → UNHEALTHY (FALSE!)
- Actual state remains: HEALTHY
Problematic Code:
private void safeUpdate(long owner, HealthStatus status) {
HealthCheckResult newResult = new HealthCheckResult(owner, status);
HealthCheckResult oldResult = resultRef.getAndUpdate(current -> {
if (current.getTimestamp() < owner) {
return newResult; // State updated
}
return current; // State NOT updated
});
if (oldResult.getStatus() != status) { // : Always compares against input status
// notify listeners
notifyListeners(oldResult.getStatus(), status); // False notification!
}
}Metadata
Metadata
Assignees
Labels
No labels