Fix some panicking code #88
  // because we only ever start leader banks from parents that are
  // frozen.
- assert!(poh_start_slot < highest_frozen_bank.slot());
+ if poh_start_slot < highest_frozen_bank.slot() {
Hmm, I thought you said you wanted to remove this code, why do we need it in Alpenglow?
I think it's necessary to keep the poh up to date for banking stage would_be_leader() checks
Ideally those get replaced eventually by the separate skip loop timer which tracks the current slot
Please add comment saying why we want this now and when we plan to remove it.
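A minimal sketch of what the guard plus the requested comment could look like. The helper name `poh_is_behind` and the plain `u64` slots are hypothetical stand-ins, not the real replay_stage types, and what the guarded block does is not shown in the excerpt:

```rust
/// Hypothetical helper: reports whether PoH's start slot is still behind
/// the highest frozen slot. Slots are plain u64 values for illustration.
fn poh_is_behind(poh_start_slot: u64, highest_frozen_slot: u64) -> bool {
    // PoH can finish the last tick of our own leader bank and advance its
    // start slot before replay freezes that bank, so poh_start_slot may be
    // equal to or ahead of the highest frozen slot. Treat that case as
    // "nothing to do" instead of asserting.
    //
    // Kept for now so banking stage would_be_leader() checks see an
    // up-to-date PoH; intended to be removed once a separate skip loop
    // timer tracks the current slot.
    poh_start_slot < highest_frozen_slot
}
```

With this shape, `poh_is_behind(6, 5)` simply returns false where the old assert would have panicked.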
core/src/replay_stage.rs
Outdated
  AlpenglowVote::new_finalization_vote(
      highest_frozen_bank.slot(),
-     highest_frozen_bank.block_id().unwrap(),
+     Hash::default(),
Do we have block_id now? I thought you want block_id when you are not the leader?
yeah it exists right now, i'll replicate that behavior
Then I think you want block_id().unwrap_or() rather than Hash::default()? And add a comment that we need to fix this for leader blocks in the future (maybe file an issue).
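A rough sketch of that suggestion, assuming `block_id()` returns an `Option`. The `Hash` and `Bank` types below are simplified stand-ins for illustration only, and `vote_block_id` is a hypothetical helper:

```rust
// Simplified stand-ins for illustration; not the real Hash or Bank types.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct Hash([u8; 32]);

struct Bank {
    block_id: Option<Hash>,
}

impl Bank {
    fn block_id(&self) -> Option<Hash> {
        self.block_id
    }
}

/// Hypothetical helper: pick the block id to put in the vote.
fn vote_block_id(bank: &Bank) -> Hash {
    // TODO: block_id() is None on our own leader blocks; fall back to the
    // default hash until the leader case is fixed.
    bank.block_id().unwrap_or(Hash::default())
}
```

Compared with always passing `Hash::default()`, this keeps the real block id whenever it is available and only falls back on leader blocks.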
alpenglow/core/src/consensus.rs
Line 627 in 6770619
Hash::default()
Hmm why? The leaders should know block_id when the block is complete and it wants to endorse its own block?
oh and this PR description: anza-xyz/agave#2776 (comment)
ah i remember now, it's because
1. Bank tick is registered in Poh
2. Replay detects the bank ticks are full, freezes the bank
3. Shred is created by broadcast
There is no guarantee 3 happens before 2, so it's not necessarily possible for the leader to compute the block id
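The ordering problem can be modelled with a toy event sequence. This is only an illustration: the `Event` enum and `block_id_known_at_freeze` are hypothetical, and the model assumes the block id becomes computable once the last shred has been created:

```rust
// Toy model of the three steps; names are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Event {
    TickRegistered,
    BankFrozen,
    LastShredCreated,
}

/// Returns Some(true) if the last shred already existed when the bank was
/// frozen, Some(false) if it did not, and None if the bank never froze.
fn block_id_known_at_freeze(events: &[Event]) -> Option<bool> {
    let mut last_shred_created = false;
    for event in events {
        match event {
            Event::TickRegistered => {}
            Event::LastShredCreated => last_shred_created = true,
            Event::BankFrozen => return Some(last_shred_created),
        }
    }
    None
}
```

Both orderings of freeze and shred creation are valid inputs here, which is the point: a vote built at freeze time cannot rely on the block id being present.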
I would prefer leader votes with its own block-id if possible, I would still say chat with Ashwin to see if we can do that.
"or to simply resend the vote once the last shred is finished (this would involve messing with the timestamp or deduplication code)."
I think this was the original plan
I think we can also hold the vote, I don't think we need to hold the vote for that long, shred in broadcast should be fast?
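One way the "hold the vote" idea might look, as a sketch only. `HeldVote`, `Vote`, `on_frozen`, and `on_block_id` are hypothetical names, and the sketch assumes some notification arrives once broadcast has produced the last shred and the block id is known:

```rust
type BlockId = [u8; 32];

// Hypothetical vote type for illustration.
#[derive(Debug, PartialEq, Eq)]
struct Vote {
    slot: u64,
    block_id: BlockId,
}

/// Hypothetical holder for a vote that is waiting on its block id.
#[derive(Default)]
struct HeldVote {
    pending_slot: Option<u64>,
}

impl HeldVote {
    /// Called when a bank is frozen. Votes right away if the block id is
    /// already known, otherwise remembers the slot and waits.
    fn on_frozen(&mut self, slot: u64, block_id: Option<BlockId>) -> Option<Vote> {
        match block_id {
            Some(block_id) => Some(Vote { slot, block_id }),
            None => {
                self.pending_slot = Some(slot);
                None
            }
        }
    }

    /// Called when the block id for `slot` becomes available. Releases the
    /// held vote at most once.
    fn on_block_id(&mut self, slot: u64, block_id: BlockId) -> Option<Vote> {
        if self.pending_slot == Some(slot) {
            self.pending_slot = None;
            Some(Vote { slot, block_id })
        } else {
            None
        }
    }
}
```

Holding avoids sending a second vote for the same slot, so it would not need changes to the timestamp or deduplication code that the resend approach mentions.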
Problem
- block_id() is None on your own leader blocks, so highest_frozen_bank.block_id().unwrap() was panicking.
- assert!(poh_start_slot < highest_frozen_bank.slot()); was panicking because it's possible that poh finishes the last tick for leader bank X+1 and calls flush_tick_cache(), which sets the poh_start_slot to bank X+1. This means the start_slot can be X+1 before replay_stage detects the bank is finished and freezes it, which makes it possible that start_slot = X+1 and the highest_frozen_slot = X.
Summary of Changes
Fixes #