Description
BUG REPORT
Describe the bug
An entry whose ack set is broken by an ensemble change still succeeds during the unset-success step of that ensemble change.
Besides this, LedgerHandle.pendingAddOps is read and then popped in a nested call:
- Read in LedgerHandle.unsetSuccessAndSendWriteRequest.
- Popped in the call chain PendingAddOp.unsetSuccessAndSendWriteRequest to LedgerHandle.sendAddSuccessCallbacks.
I guess this is dangerous.
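To make the nested read-then-pop pattern concrete, here is a minimal, self-contained sketch. It is not BookKeeper code: `NestedPopSketch`, `Op`, `drainCompletedHead`, `unsetAndResend`, and `walkSnapshot` are hypothetical names that only mimic the shape of the call chain above. It assumes that the outer walk visits every pending op and that the nested step pops completed ops from the head of the same queue. Iterating over a snapshot is shown only as one possible way to keep the walk independent of the nested pops; it is not necessarily what the proposed fix does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class NestedPopSketch {
    // Hypothetical stand-in for a pending add operation.
    static final class Op {
        final long entryId;
        final boolean completed;

        Op(long entryId, boolean completed) {
            this.entryId = entryId;
            this.completed = completed;
        }
    }

    final Deque<Op> pending = new ArrayDeque<>();

    // Stand-in for the success-callback step: pops completed ops from the head.
    void drainCompletedHead() {
        while (!pending.isEmpty() && pending.peek().completed) {
            pending.poll();
        }
    }

    // Stand-in for the per-op step; it ends by triggering the nested drain.
    void unsetAndResend(Op op) {
        drainCompletedHead();
    }

    // Walk a copy of the queue so that nested pops cannot disturb the iteration.
    List<Long> walkSnapshot() {
        List<Long> visited = new ArrayList<>();
        for (Op op : new ArrayList<>(pending)) {
            visited.add(op.entryId);
            unsetAndResend(op);
        }
        return visited;
    }

    // Four ops: the first two are already completed, the last two are not.
    static NestedPopSketch sample() {
        NestedPopSketch s = new NestedPopSketch();
        s.pending.add(new Op(0, true));
        s.pending.add(new Op(1, true));
        s.pending.add(new Op(2, false));
        s.pending.add(new Op(3, false));
        return s;
    }

    public static void main(String[] args) {
        NestedPopSketch s = sample();
        System.out.println(s.walkSnapshot()); // [0, 1, 2, 3]
        System.out.println(s.pending.size()); // 2
    }
}
```

In this sketch the first nested call already removes two ops from the queue, yet the walk still sees all four because it iterates over the copy. Walking the live queue instead would make the set of visited ops depend on the iterator semantics of the underlying collection.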
To Reproduce
I created a test case branch in my fork. The scenario that the test builds is repeated here.
The test constructs the following scenario:
- Ensemble size is two, write/ack quorum size is one. Let's assume the initial ensemble members are b0 and b1.
- Write four entries e0, e1, .. e3. The distribution algorithm will write e0 and e2 to b0, e1 and e3 to b1.
- e1 completes its write. Bookie b1 crashes. Ensemble changing initiates. e0 completes its write but is blocked from success due to the ensemble change.
- Ensemble changing completes. The ledger will unset success and resend write requests if needed.
In this scenario, after the ensemble has changed, all pending entries should eventually proceed to success.
In particular, e1 should unset its success and resend its write.
In the test case, e1 was confirmed as completed even though its ack set was broken by the ensemble change.
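The expected outcome of the scenario can be modelled with a short sketch. It is a simplified model, not the test from the branch: `EnsembleChangeSketch`, `PendingEntry`, `bookieIndexFor`, and `onEnsembleChange` are hypothetical names, and the placement rule (entry id modulo ensemble size) is an assumption that matches the distribution described above for a write quorum of one.

```java
import java.util.ArrayList;
import java.util.List;

public class EnsembleChangeSketch {
    static final int ENSEMBLE_SIZE = 2;

    // Assumed round-robin placement with write quorum 1:
    // e0 and e2 land on index 0 (b0), e1 and e3 on index 1 (b1).
    static int bookieIndexFor(long entryId) {
        return (int) (entryId % ENSEMBLE_SIZE);
    }

    // Hypothetical model of a pending entry and its ack state.
    static final class PendingEntry {
        final long entryId;
        boolean acked;

        PendingEntry(long entryId, boolean acked) {
            this.entryId = entryId;
            this.acked = acked;
        }
    }

    // Expected handling when the bookie at replacedIndex is swapped out:
    // every entry placed on that bookie loses its ack and is resent.
    static List<Long> onEnsembleChange(List<PendingEntry> pending, int replacedIndex) {
        List<Long> resent = new ArrayList<>();
        for (PendingEntry e : pending) {
            if (bookieIndexFor(e.entryId) == replacedIndex) {
                e.acked = false;
                resent.add(e.entryId);
            }
        }
        return resent;
    }

    // State just before the ensemble change completes:
    // e0 acked by b0, e1 acked by b1 (which then crashed), e2 and e3 not yet acked.
    static List<PendingEntry> scenario() {
        List<PendingEntry> pending = new ArrayList<>();
        pending.add(new PendingEntry(0, true));
        pending.add(new PendingEntry(1, true));
        pending.add(new PendingEntry(2, false));
        pending.add(new PendingEntry(3, false));
        return pending;
    }

    public static void main(String[] args) {
        List<PendingEntry> pending = scenario();
        System.out.println(onEnsembleChange(pending, 1)); // [1, 3]
        System.out.println(pending.get(1).acked);         // false
    }
}
```

Under this model e1 and e3 are resent to the replacement bookie, and e1 is no longer counted as acked until the new write completes. The reported bug corresponds to e1 keeping its acked state and being confirmed anyway.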
Expected behavior
An entry whose ack set is broken should resend its write request so that its ack quorum is fulfilled again.
Screenshots
None.
Additional context
In the test case branch, I pushed a commit with a possible fix: kezhuw@207a140.