Skip to content

Conversation

@jeffrey-xiao
Copy link

@jeffrey-xiao jeffrey-xiao commented Sep 30, 2022

This PR resolves ZOOKEEPER-4394.

The first commit adds a failing test that demonstrates that a leader sending an outstanding proposal during a DIFF will cause an NPE if the commit to the proposal comes before the UPTODATE.

The second commit fixes the failing test and addresses the problem by splitting packetsNotCommitted to a separate field called packetsNotLogged. The former keeps track of packets that needs to be committed while the latter keeps track of packets that needs to be logged.

See individual commit messages for more details.

This test demonstrates that a leader sending outstanding proposals with
a DIFF will cause an NPE if the commit to that proposal comes before the
UPTODATE.
This commit fixes the failing test and addresses ZOOKEEPER-4394.

The main problem with the learner code is that the purpose of
`packetsNotCommitted` is a bit overloaded -- it represents two things:
1. The packets that we need to durably log.
2. The packets that we haven't committed yet.

However, there are cases where these two sets of packets are not
identical. For example, suppose we are catching up via DIFF and we
receive an outstanding PROPOSAL before NEWLEADER and the COMMIT after
NEWLEADER, but before UPTODATE. In this case, we need to durably log all
proposals before we ACK the NEWLEADER, but we still need to apply the
PROPOSAL when we receive the commit.

This commit separates `packetsNotCommitted` into an additional field
`packetsNotLogged` that represents the first set of packets.
@sonatype-lift
Copy link

sonatype-lift bot commented Sep 30, 2022

⚠️ 52 God Classes were detected by Lift in this project. Visit the Lift web console for more details.

@tsuna
Copy link

tsuna commented Jun 13, 2024

Is this PR getting superseded by #2152?

@tsuna
Copy link

tsuna commented Aug 8, 2024

Ping

Copy link
Contributor

@anmolnar anmolnar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea looks good to me:

  • use packetsNotCommitted only for matching PROPOSALS with COMMITS
  • use packetsNotLogged for dealing with outstanding TXNs to be persisted after NEWLEADER message.

Is the above accurate?

However I don't understand why don't you just resolve an NFE with a single null check?

Comment on lines +653 to +658
if (pif.hdr.getZxid() != qp.getZxid()) {
LOG.warn(
"Committing 0x{}, but next proposal is 0x{}",
Long.toHexString(qp.getZxid()),
Long.toHexString(pif.hdr.getZxid()));
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this branch is correct, because in the old logic if writeToTxnLog == true, then the packet is added to packetsCommitted no matter if zxid matches with the next proposal or not.

In your new logic, if I understand it right, if zxid differs you'll quit fast and log the above warning message only.

Is that correct?

packetsCommitted.add(qp.getZxid());
}
} else {
packetsCommitted.add(qp.getZxid());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here. This logic is not handled in the new code for some reason.

@anmolnar
Copy link
Contributor

Is this PR getting superseded by #2152?

@kezhuw Do you agree? Can we close this patch in favor of #2152 ?

@kezhuw
Copy link
Member

kezhuw commented Sep 19, 2024

I would prefer #2152. @anmolnar

@kezhuw kezhuw closed this Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants