Ack broken entry succeed in ensemble change unsetting #3005

@kezhuw

BUG REPORT

Describe the bug
An entry whose ack set is broken by an ensemble change can still succeed during the unset-success step of that ensemble change.

Besides this, LedgerHandle.pendingAddOps is read and then popped in a nested call:

  • Read in LedgerHandle.unsetSuccessAndSendWriteRequest.
  • Popped in the call chain from PendingAddOp.unsetSuccessAndSendWriteRequest to LedgerHandle.sendAddSuccessCallbacks.

I guess this is dangerous.
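To make the concern concrete, here is a minimal sketch of the read-then-pop nesting in a simplified model. It is not BookKeeper code: `NestedPopSketch`, `Op`, `drainCompleted`, and `revisitAll` are hypothetical names, and the outer loop walks a snapshot only to keep the example deterministic (iterating the live queue would depend on the queue type's iterator semantics).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Simplified model only; none of these types are BookKeeper classes.
class NestedPopSketch {
    /** Hypothetical stand-in for a pending add operation. */
    static final class Op {
        final long entryId;
        final boolean completed;

        Op(long entryId, boolean completed) {
            this.entryId = entryId;
            this.completed = completed;
        }
    }

    final Queue<Op> pending = new ArrayDeque<>();
    final List<String> events = new ArrayList<>();

    /** Pops completed ops from the head, as a success-callback drain would. */
    void drainCompleted() {
        Op head;
        while ((head = pending.peek()) != null && head.completed) {
            pending.remove();
            events.add("ack e" + head.entryId);
        }
    }

    /**
     * Outer loop that visits every pending op and triggers the drain from
     * inside the loop. It walks a snapshot so the example stays deterministic.
     */
    void revisitAll() {
        for (Op op : new ArrayList<>(pending)) {
            events.add("visit e" + op.entryId);
            drainCompleted();
        }
    }

    public static void main(String[] args) {
        NestedPopSketch sketch = new NestedPopSketch();
        sketch.pending.add(new Op(0, true));
        sketch.pending.add(new Op(1, true));
        sketch.pending.add(new Op(2, false));
        sketch.revisitAll();
        System.out.println(sketch.events);
    }
}
```

In this model, the drain triggered while visiting e0 also acknowledges e1, so e1 leaves the queue before the outer loop gets a chance to revisit it.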

To Reproduce
I created a test case branch in my fork. The scenario that the test builds is repeated here.

The test constructs the following scenario:

  • Ensemble size is two, write/ack quorum size is one. Let's assume the
    initial ensemble members are b0 and b1.
  • Write four entries e0, e1, .. e3. The distribution algorithm will
    write e0 and e2 to b0, e1 and e3 to b1.
  • e1 completes its write. Bookie b1 crashes. An ensemble change starts.
    e0 completes its write but is blocked from succeeding while the
    ensemble change is in progress.
  • The ensemble change completes. The ledger will unset success and resend
    write requests if needed.
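The entry placement in the scenario above can be sketched as follows, assuming a round-robin distribution where the i-th replica of an entry goes to ensemble index `(entryId + i) % ensembleSize`. `RoundRobinSketch`, `writeSet`, and `entriesTouching` are hypothetical names for illustration, not BookKeeper APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of round-robin placement; not BookKeeper code.
class RoundRobinSketch {
    /** Ensemble indexes that an entry is written to. */
    static List<Integer> writeSet(long entryId, int ensembleSize, int writeQuorumSize) {
        List<Integer> set = new ArrayList<>();
        for (int i = 0; i < writeQuorumSize; i++) {
            set.add((int) ((entryId + i) % ensembleSize));
        }
        return set;
    }

    /** Entries in [first, last] whose write set includes the given ensemble index. */
    static List<Long> entriesTouching(int bookieIndex, long first, long last,
                                      int ensembleSize, int writeQuorumSize) {
        List<Long> entries = new ArrayList<>();
        for (long e = first; e <= last; e++) {
            if (writeSet(e, ensembleSize, writeQuorumSize).contains(bookieIndex)) {
                entries.add(e);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Ensemble size 2, write quorum 1: which entries depend on index 1 (b1)?
        System.out.println(entriesTouching(1, 0, 3, 2, 1));
    }
}
```

Under this assumption, e1 and e3 are the entries whose only replica lives on b1, so they are the ones affected when b1 is replaced.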

In this scenario, after the ensemble has changed, all pending entries should
eventually succeed. In particular, e1 should unset its success and resend its write.

In the test case, e1 was confirmed as completed even though its ack set was broken by the ensemble change.

Expected behavior
Entries whose ack set is broken should resend their write requests so that the ack quorum is fulfilled again.
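A minimal sketch of this expected behavior, assuming a write quorum of one as in the scenario. `AckUnsetSketch` and its members are hypothetical and only model the bookkeeping; they are not the actual PendingAddOp implementation.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative model of one pending entry; not BookKeeper code.
class AckUnsetSketch {
    final int bookieIndex;   // the single ensemble index this entry is written to
    final int ackQuorumSize;
    final Set<Integer> ackedBy = new HashSet<>();
    boolean completed;
    int resendCount;

    AckUnsetSketch(int bookieIndex, int ackQuorumSize) {
        this.bookieIndex = bookieIndex;
        this.ackQuorumSize = ackQuorumSize;
    }

    /** Records an ack and recomputes whether the ack quorum is met. */
    void onAck(int index) {
        ackedBy.add(index);
        completed = ackedBy.size() >= ackQuorumSize;
    }

    /** Drops the ack from a replaced bookie and counts the resend it requires. */
    void onBookieReplaced(int replacedIndex) {
        if (replacedIndex != bookieIndex) {
            return; // this entry's write set is untouched
        }
        ackedBy.remove(replacedIndex);
        completed = ackedBy.size() >= ackQuorumSize;
        resendCount++; // the write must go to the replacement bookie
    }
}
```

With b1 at index 1, an entry modeled as `new AckUnsetSketch(1, 1)` that was acked before the crash becomes incomplete after `onBookieReplaced(1)` and completes again only once the resent write is acked, while an entry on index 0 keeps its success.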

Screenshots
None.

Additional context
In the test case branch, I pushed a possible fix in commit kezhuw@207a140.