Skip to content

Conversation

jimmygchen
Copy link
Member

@jimmygchen jimmygchen commented Aug 14, 2025

Issue Addressed

@jimmygchen jimmygchen requested a review from jxs as a code owner August 14, 2025 01:17
/// For non sampling columns, requirements is lower, but need to ensure there is peer for
/// publishing.
pub const MIN_SAMPLING_COLUMN_SUBNET_PEERS: u64 = TARGET_SUBNET_PEERS as u64;
pub const MIN_NON_SAMPLING_COLUMN_SUBNET_PEERS: u64 = 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need 1 peer and not 0 in this subnets?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it would be useful to have at least one peer per column in the case where we're publishing data columns.

@dapplion
Copy link
Collaborator

Did a first review pass and approach looks solid, just some minor comments

err
),
peer_action: Some(PeerAction::LowToleranceError),
})
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pawanjay176 do we need this change?

This may be due to optimistic EL right?

// An honest optimistic node may propagate blocks which are rejected by an EE, do not
// penalize them.
ExecutionPayloadError::RejectedByExecutionEngine { .. } => false,

@@ -300,6 +300,7 @@ impl<E: EthSpec> PeerDB<E> {
.filter(move |(_, info)| {
// We check both the metadata and gossipsub data as we only want to count long-lived subscribed peers
info.is_connected()
&& info.is_synced_or_advanced()
Copy link
Member Author

@jimmygchen jimmygchen Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to think again about this change - this is used to determine if we have enough good peers when starting discovery query:

// Already have target number of peers, no need for subnet discovery
let peers_on_subnet = self
.network_globals
.peers
.read()
.good_peers_on_subnet(s.subnet)
.count();
if peers_on_subnet >= TARGET_SUBNET_PEERS {
trace!(
subnet = ?s.subnet,
reason = "Already connected to desired peers",

So the condition to push a PeerManagerEvent::DiscoverSubnetPeers below must also satisfy this

fn maintain_custody_peers(&mut self) {
let subnets_to_discover: Vec<SubnetDiscovery> = self
.network_globals
.sampling_subnets()
.iter()
.filter_map(|custody_subnet| {
if self
.network_globals
.peers
.read()
.has_good_peers_in_custody_subnet(
custody_subnet,
MIN_SAMPLING_COLUMN_SUBNET_PEERS as usize,
)
{
None
} else {
Some(SubnetDiscovery {
subnet: Subnet::DataColumn(*custody_subnet),
min_ttl: None,
})
}
})
.collect();

Otherwise it might result in pushing a DiscoverSubnetPeers event to network but then it gets ignored by network.

@jimmygchen jimmygchen added the ready-for-review The code is ready for review label Aug 15, 2025
Copy link

mergify bot commented Aug 15, 2025

Some required checks have failed. Could you please take a look @jimmygchen? 🙏

@mergify mergify bot added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Aug 15, 2025
@jimmygchen
Copy link
Member Author

@pawanjay176 I've done a bit of clean up, but there's still a bit of mess around maintaining peers & discovery:

#7876 (comment)

I'll revisit next week.

@jimmygchen jimmygchen self-assigned this Aug 18, 2025
jimmygchen added a commit that referenced this pull request Aug 18, 2025
Squashed commit of the following:

commit d8b6a99
Author: Jimmy Chen <[email protected]>
Date:   Mon Aug 18 23:48:28 2025 +1000

    Remove unnecessary test

commit 832d862
Author: Jimmy Chen <[email protected]>
Date:   Fri Aug 15 16:22:48 2025 +1000

    Clippy

commit 0437243
Merge: a250daa 317dc0f
Author: Jimmy Chen <[email protected]>
Date:   Fri Aug 15 16:22:30 2025 +1000

    Merge branch 'unstable' into fusaka-devnet-4-revive

commit a250daa
Author: Jimmy Chen <[email protected]>
Date:   Fri Aug 15 16:02:05 2025 +1000

    Clean up and refactor custody peer checks

commit ec7956b
Author: Jimmy Chen <[email protected]>
Date:   Fri Aug 15 14:40:06 2025 +1000

    Revert "Increase target peer count to 200."

    This reverts commit 5f38f31.

commit cdc417f
Author: Jimmy Chen <[email protected]>
Date:   Thu Aug 14 23:39:10 2025 +1000

    Prevent sampling subnets from getting pruned if below min target threshold.

commit 9ce9318
Author: Jimmy Chen <[email protected]>
Date:   Thu Aug 14 17:39:57 2025 +1000

    Check for sync status for custody subnet discovery

commit 1b8f0e4
Author: Jimmy Chen <[email protected]>
Date:   Thu Aug 14 17:30:30 2025 +1000

    Fix tests.

commit 1465d72
Author: Jimmy Chen <[email protected]>
Date:   Thu Aug 14 11:47:33 2025 +1000

    Only send lookup requests to peers that are synced or advacned.

commit 5f38f31
Author: Jimmy Chen <[email protected]>
Date:   Thu Aug 14 11:30:04 2025 +1000

    Increase target peer count to 200.

commit 56ca5bf
Author: Jimmy Chen <[email protected]>
Date:   Thu Aug 14 11:14:20 2025 +1000

    Remove penalties for block sidecar coupling.

commit 8e95a54
Author: Jimmy Chen <[email protected]>
Date:   Thu Aug 14 11:07:50 2025 +1000

    Revert "Reduce number of head chains we sync to"

    This reverts commit 1f776a4

commit 0144205
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Aug 13 17:33:47 2025 -0700

    Request columns from global peer pool

commit a8f6801
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Aug 13 16:37:12 2025 -0700

    Priorotize status v2

commit c8bbe47
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Aug 13 16:36:56 2025 -0700

    Penalize if invalid EL block

commit 1f776a4
Author: Pawan Dhananjay <[email protected]>
Date:   Wed Aug 13 16:36:13 2025 -0700

    Reduce number of head chains we sync to
mergify bot pushed a commit that referenced this pull request Aug 21, 2025
Prioritise `StatusV2` over `StatusV1` RPC protocol.

A bug discovered during devnet-4 testing and extracted from the sync fixes PR #7876.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge waiting-on-author The reviewer has suggested changes and awaits thier implementation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants