ZOOKEEPER-1675: Make sync a quorum operation #2069
ctubbsii
left a comment
Wouldn't it be better to have an explicit public API method for this? It seems unreliable if programmers don't know how the ZK servers are configured, and trying to dynamically set a system property prior to calling an API method doesn't seem like a great option.
ZooKeeper has both a client API and a server daemon, so this is somewhat inevitable if we change the server-side behavior of an API.
Not sure. I think the question for us should be "Is it a bug for If it is yes, then all the above questions shouldn't be issues. Otherwise, we should resort to new APIs as you said, for example ZOOKEEPER-3600(#1137). I am leaning towards it being a bug, so here we are.
@kezhuw Sorry, I might be confused about the relationship between the proposed quorumSync parameter and the behavior that users see. I agree This is the situation I'm imagining, so please correct me if I'm misunderstanding something:
The user has no way to know what the server's configuration is set to. In both scenarios, the user's actions are the same... they call the same APIs to sync and read. The problem I'm seeing is that the user has no way to know whether they are seeing the buggy behavior or not. So, it's an unreliable experience. On the other hand, if there was a separate API, the user could explicitly call it:
In scenarios 3 and 4, the user can reliably count on the documented behavior, based on the method they call. In scenarios 1 and 2, they cannot... they have to have some insight into the server-side configuration, which they cannot know, in order to have any chance at relying on the correct behavior of So, I conclude that it'd be better to:
Will this cause much confusion in the world after 3.10.0? Does a client really want to choose "dated data"?
I am positive to this approach. I think we probably are aligned to make |
No, you are probably right. I can't imagine anybody would want this. I was only thinking for consistency of current behavior. But, I don't think there's a use case for that.
💯 |
Reopen for failed tests: ZOOKEEPER-4216 and ZOOKEEPER-4512.
eolivelli
left a comment
I am not sure about this change.
IIRC the sync() operation ensures that the server you are connected to is up-to-date and then you can read(). This works well when you write to ZK to some peer, then from another node of your application you read from a different ZK peer.
If we change what sync() does at the moment and make it more heavyweight we are going to break applications, in production, because the load on ZK will increase.
This is something that you would see only in production under load, because developers working locally won't notice the difference.
If you want a different 'sync' then we must provide a new API:
- add a flag on the request (not sure it is doable with JUTE)
- add a new request type
We can discuss on the ML about why you need this
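To make the sync-then-read pattern described in this comment concrete, here is a minimal toy model in plain Java. It is only a sketch: `ToyLeader` and `ToyFollower` are hypothetical names invented for illustration, not ZooKeeper classes, and the model ignores leader changes entirely.

```java
import java.util.HashMap;
import java.util.Map;

public class SyncThenRead {
    /** Toy leader holding the committed state (hypothetical, not ZooKeeper's Leader). */
    static class ToyLeader {
        final Map<String, String> committed = new HashMap<>();

        void write(String path, String value) {
            committed.put(path, value);
        }
    }

    /** Toy follower serving reads from a local copy that may lag behind the leader. */
    static class ToyFollower {
        final Map<String, String> local = new HashMap<>();
        final ToyLeader leader;

        ToyFollower(ToyLeader leader) {
            this.leader = leader;
        }

        /** Models sync(): catch up with everything the leader has committed so far. */
        void sync() {
            local.putAll(leader.committed);
        }

        /** Models a local read: served without contacting the leader. */
        String read(String path) {
            return local.get(path);
        }
    }

    public static void main(String[] args) {
        ToyLeader leader = new ToyLeader();
        ToyFollower follower = new ToyFollower(leader);

        leader.write("/config", "v2");            // written through some other peer
        String stale = follower.read("/config");  // follower has not caught up yet
        follower.sync();
        String fresh = follower.read("/config");  // read after sync sees the write

        System.out.println("before sync: " + stale);
        System.out.println("after sync:  " + fresh);
    }
}
```

The model assumes the follower is talking to a leader that is still the real leader, which is the case this comment has in mind.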
// getting a quorum from all necessary configurations.
if (!p.hasAllQuorums()) {
    return false;
Proposal previous = outstandingProposals.get(zxid - 1);
This change seems unrelated
It is changed to:
- Make sure `p` has majority acked.
- Also commit the preceding quorum read to guard against downgrading.
Previously, we returned directly if there was a preceding proposal, but now we commit it if the preceding proposal is a quorum read. So the order now matters.
In any case the client must explicitly opt in to the new sync().
It is not guaranteed currently in case of leader change. QuorumSyncTest will fail if you change
Two cents from my side.
I think the purposes are clear when people resort to
Though, I am not positive to this road, but I guess we can resort to a per-client option, say The main doubt from me is that why people are intentionally want
I will prepare a discussion thread on the dev mailing list later.
I have started a discussion thread for the direction: https://lists.apache.org/thread/ogbg4sptpz56cwjbcvcpnysryr0c0pjm
bafbb2e to
50a0cb8
Compare
Previously, `sync` + `read` could not guarantee up-to-date data, as `sync` does not touch the quorum when there are no outstanding proposals. `create`/`setData` could be used as a rescue, but that is heavy, ugly and error-prone; `sync` fits the semantics naturally. This PR bumps the quorum protocol version to make the changes compatible with rolling upgrades. This is because `sync` is a public API, and the whole cluster must function normally during a rolling upgrade. `sync` will behave like a quorum operation once all forwarding followers are upgraded to the new version. This PR issues a quorum operation only when there are no outstanding proposals, so as to avoid overloading a possibly heavily loaded cluster. It will increase latency in this case, but `sync` + `read` should care more about up-to-date data. This PR also reverts ZOOKEEPER-2137, which used `setData` to circumvent the old behavior of `sync`. Refs: ZOOKEEPER-1675, ZOOKEEPER-2136, ZOOKEEPER-3600
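The decision described in this commit message (go to the quorum only when nothing is outstanding) can be sketched roughly as below. This is a toy illustration, not the code in this PR: `ToyLeader`, `SyncPath`, `chooseSyncPath` and `hasQuorum` are hypothetical names, and the assumption that a `sync` arriving while proposals are outstanding simply waits for the last one to commit is a simplification.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class QuorumSyncSketch {
    /** The two ways this toy leader can answer a sync request. */
    enum SyncPath { PIGGYBACK_ON_PROPOSAL, QUORUM_ROUND }

    /** Hypothetical stand-in for a leader; not ZooKeeper's Leader class. */
    static class ToyLeader {
        /** zxids of proposals that were sent out but are not yet committed. */
        final Deque<Long> outstandingProposals = new ArrayDeque<>();
        final int ensembleSize;

        ToyLeader(int ensembleSize) {
            this.ensembleSize = ensembleSize;
        }

        /**
         * If a proposal is already in flight, its commit needs a majority anyway,
         * so the sync can ride on it. Otherwise an explicit quorum round is needed
         * to confirm that this server is still the leader.
         */
        SyncPath chooseSyncPath() {
            return outstandingProposals.isEmpty()
                    ? SyncPath.QUORUM_ROUND
                    : SyncPath.PIGGYBACK_ON_PROPOSAL;
        }

        /** Strict majority of the ensemble; the leader's own ack counts as one. */
        boolean hasQuorum(int acks) {
            return acks > ensembleSize / 2;
        }
    }

    public static void main(String[] args) {
        ToyLeader leader = new ToyLeader(5);
        System.out.println("idle leader: " + leader.chooseSyncPath());

        leader.outstandingProposals.add(42L); // a write is now in flight
        System.out.println("busy leader: " + leader.chooseSyncPath());
    }
}
```

Under this model the extra quorum round is only paid by an otherwise idle leader, which is why the commit message expects added latency but no added load on a busy cluster.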
50a0cb8 to
3741e09
Compare
Previously, `sync` + `read` could not guarantee up-to-date data, as `sync` will not touch the quorum when there are no outstanding proposals.
Though `create`/`setData` could be used as a rescue, it is apparently ugly and error-prone. `sync` fits the semantics naturally.
This PR reverts ZOOKEEPER-2137, which used `setData` to circumvent the non-quorum `sync`.
Since `sync` is a public API, this PR bumps the quorum protocol version to stay compatible with rolling upgrades. `sync` will only function like a quorum operation when all forwarding followers are upgraded.
Refs: ZOOKEEPER-1675, ZOOKEEPER-2136, ZOOKEEPER-3600(#1137)