Skip to content

Conversation

@carllin
Copy link
Contributor

@carllin carllin commented Mar 16, 2025

Problem

  1. block_id() is None on your own leader blocks, so highest_frozen_bank.block_id().unwrap() was panicking
  2. assert!(poh_start_slot < highest_frozen_bank.slot()); was panicking because it's possible that poh finishes the last tick for leader bank X+1, calls flush_tick_cache(), which will set the poh_start_slot to bank X+1. This means it's possible for the start_slot to be X+1 before replay_stage detects the bank is finished and freezes the bank, which makes it possible that start_slot = X+1 and the highest_frozen_slot = X.

Summary of Changes

  1. Make block id default for now, not sure how to handle this for leader blocks
  2. Removed the assert in reset poh logic

Fixes #

@carllin carllin requested a review from wen-coding March 16, 2025 05:16
// because we only ever start leader banks from parents that are
// frozen.
assert!(poh_start_slot < highest_frozen_bank.slot());
if poh_start_slot < highest_frozen_bank.slot() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I thought you said you wanted to remove this code, why do we need it in Alpenglow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's necessary to keep the poh up to date for banking stage would_be_leader() checks

Ideally those get replaced eventually by the separate skip loop timer which tracks the current slot

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comment saying why we want this now and when we plan to remove it.

AlpenglowVote::new_finalization_vote(
highest_frozen_bank.slot(),
highest_frozen_bank.block_id().unwrap(),
Hash::default(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have block_id now? I thought you want block_id when you are not the leader?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it exists right now, i'll replicate that behavior

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I think you want block_id().unwrap_or() rather than Hash::default()? And add it to comment we need to fix it for leader block in the future (maybe file an issue)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hash::default()
. looks like leaders will be Hash::default()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm why? The leaders should know block_id when the block is complete and it wants to endorse its own block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh and this PR description: anza-xyz/agave#2776 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i remember now, it's because

  1. Bank tick is registered in Poh
  2. Replay detects the bank ticks are full, freezes the bank
  3. Shred is created by broadcast

There is no guarantee 3 happens before 2, so it's not necessarily possible for the leader to compute the block id

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer leader votes with its own block-id if possible, I would still say chat with Ashwin to see if we can do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"or to simply resend the vote once the last shred is finished (this would involve messing with the timestamp or deduplication code)."

I think this was the original plan

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can also hold the vote, I don't think we need to hold the vote for that long, shred in broadcast should be fast?

@carllin carllin merged commit efc48d5 into anza-xyz:master Mar 17, 2025
7 checks passed
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 1, 2025
bw-solana pushed a commit to bw-solana/alpenglow that referenced this pull request Aug 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants