GossipSub PX shares unreachable peers #617

@dennis-tra

A few weeks ago we added GossipSub "crawling" support to Nebula [link]. This works as follows:

  1. Connect to a peer that supports GossipSub
  2. Wait until the other peer opens the GossipSub (/meshsub/...) stream to us and sends the hello RPC. This RPC contains the active subscriptions of the remote peer.
  3. Send the same hello message back to the remote peer (containing the same subscriptions)
  4. Send a graft message to the remote peer that contains all subscriptions
  5. Wait for the other peer to send a prune message that then contains additional peers which we can dial.
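
The message flow in steps 2 to 5 can be sketched as follows. This is a simplified model and not Nebula's actual implementation: the dataclasses only loosely mirror the GossipSub RPC fields (subscriptions, graft, prune with attached peers), and every name here (`Rpc`, `Prune`, `build_replies`, `peers_from_prune`) is hypothetical. Networking and protobuf encoding are left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Prune:
    topic: str
    peers: list[str] = field(default_factory=list)  # peer IDs offered via PX


@dataclass
class Rpc:
    subscriptions: list[str] = field(default_factory=list)  # topic IDs
    graft: list[str] = field(default_factory=list)          # topics to graft
    prune: list[Prune] = field(default_factory=list)


def build_replies(hello: Rpc) -> tuple[Rpc, Rpc]:
    """Steps 3 and 4: echo the remote's subscriptions, then graft all of them."""
    echo = Rpc(subscriptions=list(hello.subscriptions))
    graft = Rpc(graft=list(hello.subscriptions))
    return echo, graft


def peers_from_prune(rpc: Rpc) -> set[str]:
    """Step 5: collect every peer ID carried in the prune messages."""
    return {peer for p in rpc.prune for peer in p.peers}
```

The set returned by `peers_from_prune` is what would be fed back into the crawl queue as new dial candidates.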

We are only running this routine for the Filecoin network so far because that's the only network we know of that makes use of the PX feature.

With the above technique we were able to identify ~50% additional peers in the Filecoin network:

PX Disabled: crawlDuration=31.395986409s crawledPeers=684
PX Enabled:  crawlDuration=51.537761043s crawledPeers=1015
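
For reference, the "~50%" figure follows directly from the two crawl results above. A quick check of the arithmetic (variable names are only illustrative):

```python
crawled_without_px = 684
crawled_with_px = 1015

# Peers that were only discovered because of PX.
additional = crawled_with_px - crawled_without_px
# Relative increase over the crawl without PX.
increase = additional / crawled_without_px

print(f"additional peers: {additional} (+{increase:.1%})")
# prints: additional peers: 331 (+48.4%)
```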

However, virtually all of them are unreachable (>97%):

[Image: error distribution when dialing peers identified via GossipSub PX]

This graph shows the error distribution when trying to dial the peers that we have identified via GossipSub [source]. There are two error cases that stand out:

  • no_public_address means that the set of multiaddresses we learned through PX contains only non-public addresses (private or loopback ranges, e.g., 127.0.0.1). Thus Nebula does not try to dial these peers.
  • io_timeout means that connecting to the peer timed out (5s dial timeout)
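
To illustrate the no_public_address case, here is a minimal sketch of such a check using only Python's standard library. It is an assumption about how the classification could work, not Nebula's code: `ip_of` and `has_public_address` are hypothetical helpers, multiaddresses are treated as plain strings, and DNS-based multiaddresses are ignored for simplicity.

```python
import ipaddress


def ip_of(multiaddr: str):
    """Return the IP address inside a multiaddr string, or None if there is none."""
    parts = multiaddr.strip("/").split("/")
    # Walk over (protocol, value) pairs and pick the first ip4/ip6 component.
    for proto, value in zip(parts, parts[1:]):
        if proto in ("ip4", "ip6"):
            try:
                return ipaddress.ip_address(value)
            except ValueError:
                return None
    return None


def has_public_address(multiaddrs) -> bool:
    """True if at least one multiaddr carries a globally routable IP address."""
    for ma in multiaddrs:
        ip = ip_of(ma)
        if ip is not None and ip.is_global:
            return True
    return False


# A peer advertising only loopback and private addresses would be skipped.
print(has_public_address(["/ip4/127.0.0.1/tcp/4001", "/ip4/192.168.1.10/tcp/4001"]))
```

`is_global` excludes loopback, private, and link-local ranges, so a peer for which this returns `False` would be counted under no_public_address in this model.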

I'm not super familiar with the internals of GossipSub PX, but I would propose adding:

  1. Some configuration, similar to the various DHT routing table/query/diversity filters here, which checks that a peer satisfies some requirements before it is added to the PX pool (like: has at least one public IP address).
  2. A periodic check of the pool of PX peers for reachability that removes unreachable peers from the list (maybe that's already there?)
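
A rough sketch of how the two proposals could fit together, assuming a standalone pool abstraction. Nothing here reflects the actual GossipSub implementation: `PxPool`, `admit`, `probe`, and `sweep` are hypothetical names, and the admission filter and reachability probe are injected as plain callables so that the policy stays configurable.

```python
from typing import Callable, Dict, Iterable, List

Filter = Callable[[List[str]], bool]  # decides admission from a peer's addresses
Probe = Callable[[List[str]], bool]   # returns True if the peer could be dialed


class PxPool:
    """Hypothetical pool of peers that are eligible to be shared via PX."""

    def __init__(self, admit: Filter, probe: Probe):
        self._admit = admit
        self._probe = probe
        self._peers: Dict[str, List[str]] = {}

    def add(self, peer_id: str, addrs: Iterable[str]) -> bool:
        """Proposal 1: only admit peers whose addresses pass the filter."""
        addrs = list(addrs)
        if not self._admit(addrs):
            return False
        self._peers[peer_id] = addrs
        return True

    def sweep(self) -> List[str]:
        """Proposal 2: probe every pooled peer once and drop those that fail."""
        removed = [pid for pid, addrs in self._peers.items() if not self._probe(addrs)]
        for pid in removed:
            del self._peers[pid]
        return removed

    def peers(self) -> List[str]:
        return sorted(self._peers)
```

In this model `sweep` would be called on a timer, and `probe` would wrap a real dial with a timeout (for example the 5s used above).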

If both measures were implemented, then according to the above numbers PX in Filecoin would become basically useless, because only a handful of peers would be left to share.

I just wanted to start the discussion here and get some feedback 👍
