-
Notifications
You must be signed in to change notification settings - Fork 29
Add alpenglow poh #43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
poh/src/poh_service.rs
Outdated
| } | ||
| } | ||
|
|
||
| if let Some(CurrentLeaderBank { bank, start }) = ¤t_leader_bank { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let Some() else {
continue;
}
would be nice here to reduce nesting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaced!
poh/src/poh_recorder.rs
Outdated
| last_reported_slot_for_pending_fork: Arc<Mutex<Slot>>, | ||
| pub is_exited: Arc<AtomicBool>, | ||
| pub is_alpenglow_enabled: bool, | ||
| pub is_poh_service_migrated: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: not sure what "is_poh_service_migrated" means here, and how it's different from is_alpenglow_enabled above. How about names like "use_alpenglow_tick_produer"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaced!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaced!
poh/src/poh_recorder.rs
Outdated
| )) | ||
| } | ||
|
|
||
| pub fn migrate_poh_to_alpenglow(&mut self) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Are we doing anything other than setting PoH to low power mode here? This name is vague.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just migrating to the alpenglow poh tick producer, renamed to migrate_to_alpenglow_poh
| self.tick_lock_contention_us += tick_lock_contention_us; | ||
|
|
||
| if let Some(poh_entry) = poh_entry { | ||
| self.tick_height = slot_max_tick_height; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused, so the plan is we will only have one tick forever in all Alpenglow blocks? Do we need to change tick verification in replay at the same time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah just one tick in all alpenglow blocks to signal the end of a slot
Yeah, need to remove tick verification in replay when alpenglow is enabled, that can be done in another PR
poh/src/poh_service.rs
Outdated
| } | ||
| Self::alpenglow_tick_producer(poh_recorder, &poh_exit, record_receiver); | ||
| } | ||
| //} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
| while !poh_exit.load(Ordering::Relaxed) { | ||
| // Wait for a new leader bank to be set in PohRecorder | ||
| let leader_bank = leader_bank_notifier | ||
| .get_or_wait_for_in_progress(Duration::from_millis(50)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you can't do much if you don't have a leader bank, is 50ms a good time interval to wait here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think other components like banking_stage used 50ms, so I used that. It just needs to be reasonable enough where we can occasionally check the exit condition for poh_service
poh/src/poh_service.rs
Outdated
| let tick_producer = Builder::new() | ||
| .name("solPohTickProd".to_string()) | ||
| .spawn(move || { | ||
| if poh_config.hashes_per_tick.is_none() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Alpenglow is already enabled, should we skip this if statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, thats a good point, should skip the migration if we have a frozen bank on startup greater than the first alpenglow slot, added!
| bank: leader_bank.clone(), | ||
| // By this point the leader should have committed their certificates, | ||
| // so it's safe to start the timer | ||
| start: Instant::now(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are packing a bank then the validator got restarted in the middle, what will happen? I'm guessing we will just repack the whole thing, so start will be correct again?
Is it possible that we already sent out some shreds then the validator is restarted before the leader bank finishes? Would the start be wrong here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just like today we always check blockstore to see if we have existing shreds before we create a leader bank
| // so it's safe to start the timer | ||
| start: Instant::now(), | ||
| }); | ||
| info!("current ns per slot: {}", leader_bank.ns_per_slot); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this value change? Do we need to log every time we start a leader bank?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can adjust Poh difficulty today, so I think it would be nice to be able to adjust slot times if we ever want to shorten them, which was on the roadmap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic looks correct for prototype, ![]()
PohService needs to set `use_alpenglow_tick_producer` flag on startup (#59)
PohService needs to set `use_alpenglow_tick_producer` flag on startup (#59)
PohService needs to set `use_alpenglow_tick_producer` flag on startup (anza-xyz#59)
PohService needs to set `use_alpenglow_tick_producer` flag on startup (anza-xyz#59)
PohService needs to set `use_alpenglow_tick_producer` flag on startup (anza-xyz#59)
PohService needs to set `use_alpenglow_tick_producer` flag on startup (anza-xyz#59)
PohService needs to set `use_alpenglow_tick_producer` flag on startup (anza-xyz#59)
PohService needs to set `use_alpenglow_tick_producer` flag on startup (anza-xyz#59)
PohService needs to set `use_alpenglow_tick_producer` flag on startup (anza-xyz#59)
Problem
Poh needlessly hashes/complicated for alpenglow
Summary of Changes
max_tick_height - 1right before it ticks to mimick bank being completeTested the above, this is sufficient for a single leader to continually produce Alpenglow blocks on a single node network because replay won't verify those blocks
TODO:
Change replay verification on migration to stop counting number of ticks/verifying hashes. Only thing necessary is entry count and reading the last tick as a signal for the end of the block
Fixes #