-
Notifications
You must be signed in to change notification settings - Fork 1.1k
fix: Don't remove non-concluding tx from queue on ooo runs #1427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Vladislav Oleshko <[email protected]>
src/server/engine_shard_set.cc
Outdated
bool keep = trans->RunInShard(this, true); | ||
DLOG_IF(INFO, !dbg_id.empty()) << "Eager run " << sid << ", " << dbg_id << ", keep " << keep; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I should add there a DCHECK_EQ(trans->GetTxqPos(shard) != TxQueue::kEnd, keep)
? Requires adding GetTxqPos
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
Signed-off-by: Vladislav Oleshko <[email protected]>
src/server/engine_shard_set.cc
Outdated
@@ -390,7 +390,7 @@ void EngineShard::PollExecution(const char* context, Transaction* trans) { | |||
} | |||
++stats_.ooo_runs; | |||
|
|||
bool keep = trans->RunInShard(this); | |||
bool keep = trans->RunInShard(this, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should_run is also set for SUSPENDED_Q
.
Why not be more precise and pass trans_mask & Transaction::OUT_OF_ORDER
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, to avoid the situation where both bits are set, I suggest to add
sd.local_mask |= SUSPENDED_Q;
sd.local_mask &= ~OUT_OF_ORDER; /
in WatchInShard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True... I have actually a parallel PR where I remove the unwatch hop and make all wakeups single hop 🙄
But generally running blocked ooo should be safe because they have they run directly in PollExecution and should have only one hop after wakeups 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm.
Gave some comments.
Signed-off-by: Vladislav Oleshko <[email protected]>
@@ -243,6 +245,10 @@ class Transaction { | |||
return shard_data_[SidToId(sid)].local_mask; | |||
} | |||
|
|||
uint32_t GetLocalTxqPos(ShardId sid) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you do not need it. There is PqPos
for that. Remove either of them.
Fixes the long hunted blpop bug