-
Notifications
You must be signed in to change notification settings - Fork 7.3k
ZOOKEEPER-4394: Learner.syncWithLeader got NullPointerException #1930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This test demonstrates that a leader sending outstanding proposals with a DIFF will cause an NPE if the commit to that proposal comes before the UPTODATE.
This commit fixes the failing test and addresses ZOOKEEPER-4394. The main problem with the learner code is that the purpose of `packetsNotCommitted` is a bit overloaded -- it represents two things: 1. The packets that we need to durably log. 2. The packets that we haven't committed yet. However, there are cases where these two sets of packets are not identical. For example, suppose we are catching up via DIFF and we receive an outstanding PROPOSAL before NEWLEADER and the COMMIT after NEWLEADER, but before UPTODATE. In this case, we need to durably log all proposals before we ACK the NEWLEADER, but we still need to apply the PROPOSAL when we receive the commit. This commit separates `packetsNotCommitted` into an additional field `packetsNotLogged` that represents the first set of packets.
|
|
|
Is this PR getting superseded by #2152? |
|
Ping |
anmolnar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea looks good to me:
- use
packetsNotCommittedonly for matching PROPOSALS with COMMITS - use
packetsNotLoggedfor dealing with outstanding TXNs to be persisted after NEWLEADER message.
Is the above accurate?
However I don't understand why don't you just resolve an NFE with a single null check?
| if (pif.hdr.getZxid() != qp.getZxid()) { | ||
| LOG.warn( | ||
| "Committing 0x{}, but next proposal is 0x{}", | ||
| Long.toHexString(qp.getZxid()), | ||
| Long.toHexString(pif.hdr.getZxid())); | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this branch is correct, because in the old logic if writeToTxnLog == true, then the packet is added to packetsCommitted no matter if zxid matches with the next proposal or not.
In your new logic, if I understand it right, if zxid differs you'll quit fast and log the above warning message only.
Is that correct?
| packetsCommitted.add(qp.getZxid()); | ||
| } | ||
| } else { | ||
| packetsCommitted.add(qp.getZxid()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here. This logic is not handled in the new code for some reason.
This PR resolves ZOOKEEPER-4394.
The first commit adds a failing test that demonstrates that a leader sending an outstanding proposal during a DIFF will cause an NPE if the commit to the proposal comes before the UPTODATE.
The second commit fixes the failing test and addresses the problem by splitting
packetsNotCommittedto a separate field calledpacketsNotLogged. The former keeps track of packets that needs to be committed while the latter keeps track of packets that needs to be logged.See individual commit messages for more details.