- 
                Notifications
    You must be signed in to change notification settings 
- Fork 8.9k
feature: implement early rollback of global transactions when TM disconnects #7674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 2.x
Are you sure you want to change the base?
feature: implement early rollback of global transactions when TM disconnects #7674
Conversation
…onnects - TMDisconnectHandler interface for handling TM disconnect events - DefaultTMDisconnectHandler implementation with VGroup-based matching - Configuration option: server.enableRollbackWhenDisconnect (default: false) - Integration with AbstractNettyRemotingServer for TM disconnect detection - Comprehensive test coverage with unit and integration tests
c599475    to
    f772c72      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements early rollback of global transactions when Transaction Manager (TM) disconnects to improve system performance by reducing resource lock duration from 60 seconds to <1 second.
Key Changes:
- Added TM disconnect detection and handling infrastructure
- Implemented VGroup-based transaction matching for early rollback
- Added configuration toggle for the feature (disabled by default)
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description | 
|---|---|
| TMDisconnectHandler.java | Interface for handling TM disconnect events | 
| DefaultTMDisconnectHandler.java | Implementation with VGroup/ApplicationId matching logic | 
| AbstractNettyRemotingServer.java | Integration of TM disconnect detection in server | 
| DefaultCoordinator.java | Initialization of TM disconnect handler | 
| ConfigurationKeys.java&DefaultValues.java | Configuration constants for feature toggle | 
| Test files | Comprehensive unit and integration tests | 
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|  | ||
| } catch (TransactionException e) { | ||
| LOGGER.error( | ||
| "Failed to rollback transaction [{}] {} {}", | 
    
      
    
      Copilot
AI
    
    
    
      Oct 9, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message format is unclear with three consecutive placeholders without descriptive text. Consider a more descriptive format like 'Failed to rollback transaction [{}]: code={}, message={}'
| "Failed to rollback transaction [{}] {} {}", | |
| "Failed to rollback transaction [{}]: code={}, message={}", | 
| * | ||
| * @return the session manager | ||
| */ | ||
| public SessionManager getSessionManager() { | 
    
      
    
      Copilot
AI
    
    
    
      Oct 9, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is exposed solely for testing but exists in production code. Consider using package-private visibility instead of public, or use dependency injection to make the code more testable.
| public SessionManager getSessionManager() { | |
| SessionManager getSessionManager() { | 
| Where did you handle the  | 
| 
 @YongGoose Event flow: The commit diff doesn't show  Did I misunderstand your concern? If the explanation above isn't clear, I'm happy to add code comments or documentation to make the event flow more explicit. | 
| Codecov Report❌ Patch coverage is  Additional details and impacted files@@             Coverage Diff              @@
##                2.x    #7674      +/-   ##
============================================
+ Coverage     61.05%   61.76%   +0.70%     
- Complexity      670      684      +14     
============================================
  Files          1316     1325       +9     
  Lines         49804    50104     +300     
  Branches       5855     5917      +62     
============================================
+ Hits          30407    30945     +538     
+ Misses        16689    16373     -316     
- Partials       2708     2786      +78     
 🚀 New features to boost your workflow:
 | 
changes/en-us/2.x.mdandchanges/zh-cn/2.x.md.Ⅰ. Describe what this PR did
This PR implements early rollback of global transactions when Transaction Manager (TM) disconnects, addressing issue #4422.
Key Features:
TMDisconnectHandlerinterface for handling TM disconnect eventsDefaultTMDisconnectHandlerwith VGroup-based matching logicAbstractNettyRemotingServerserver.enableRollbackWhenDisconnect(default: false)Implementation Details:
server.enableRollbackWhenDisconnect=false(disabled by default for safety)Ⅱ. Does this pull request fix one issue?
Yes, this PR fixes issue #4422: "rollback of global transactions ahead of time"
Problem: When TM crashes or disconnects, global transactions remain in BEGIN status for 60 seconds (default timeout), blocking resources and degrading system performance.
Solution: This feature enables immediate TM disconnect detection and early rollback processing, reducing resource lock time from 60 seconds to <1 second.
Ⅲ. Why don't you add test cases (unit test/integration test)?
I have added comprehensive test coverage:
Unit Tests:
DefaultTMDisconnectHandlerTest: Tests core rollback logic and edge casesAbstractNettyRemotingServerTMDisconnectTest: Tests server-side integrationIntegration Tests:
TMDisconnectIntegrationTest: Tests end-to-end TM disconnect scenariosDefaultCoordinatorInitTest: Tests handler initialization and wiringTest Coverage:
Ⅳ. Describe how to verify it
Testing:
Manual Verification:
server.enableRollbackWhenDisconnect=truePerformance Test:
Ⅴ. Special notes for reviews
Safety Considerations:
server.enableRollbackWhenDisconnect=false)Key Code Areas:
DefaultTMDisconnectHandler.shouldRollbackSession(): Core matching logicAbstractNettyRemotingServer.handleDisconnect(): TM disconnect detectionPerformance Impact:
Breaking Changes:
server.enableRollbackWhenDisconnect=falseAdditional Notes: