ConcurrentDial prematurely gives up on dials #6105

@retrohacker

Summary

ConcurrentDial takes a Vec<Dial> and resolves its future with the first address that connects successfully, yielding the Multiaddr that was dialed together with Ok((PeerId, StreamMuxerBox)). This runs inside a task that returns the result to the Pool.

If a peer other than the one being dialed is listening on that address, the Pool returns PendingOutboundConnectionError::WrongPeerId.

It looks like the multiaddresses race to connect, and the first connection to succeed is selected regardless of whether that address belongs to the peer being dialed. The problem is that one of the other addresses in ConcurrentDial might be the address of the correct peer, but by that point all other dials have been abandoned and the wrong address has been selected. So instead of connecting to the correct address, we return the WrongPeerId error for the connection that won the race.
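To make the failure mode concrete, here is a minimal std-only model of the behavior described above. It is a sketch, not the libp2p implementation: PeerId, DialError, Dial, and first_success_wins are all hypothetical stand-ins, and the concurrent race is modeled as an iterator that yields dials in the order they finish.

```rust
// Simplified model only; none of these types are the real libp2p ones.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PeerId(u32);

#[derive(Debug, PartialEq, Eq)]
enum DialError {
    /// A connection was established, but to a different peer.
    WrongPeerId { obtained: PeerId, address: &'static str },
    /// No address could be connected to at all.
    AllDialsFailed,
}

/// One finished dial: the address, and either the peer that answered
/// or a transport-level failure message.
type Dial = (&'static str, Result<PeerId, &'static str>);

/// Models the reported behavior. `completed` yields dials in the order
/// they finish, standing in for the concurrent race.
fn first_success_wins(
    expected: PeerId,
    completed: impl IntoIterator<Item = Dial>,
) -> Result<&'static str, DialError> {
    // The race stops at the first successful connection ...
    let winner = completed
        .into_iter()
        .find_map(|(addr, res)| res.ok().map(|peer| (addr, peer)));

    match winner {
        // ... and the peer id is only checked afterwards, once the
        // remaining dials have already been dropped.
        Some((addr, peer)) if peer == expected => Ok(addr),
        Some((addr, peer)) => Err(DialError::WrongPeerId {
            obtained: peer,
            address: addr,
        }),
        None => Err(DialError::AllDialsFailed),
    }
}
```

In this model, if the address of a different peer finishes first, the function returns WrongPeerId even though a later address in the same list would have reached the expected peer.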

Expected behavior

ConcurrentDial should be aware of the PeerId that is being dialed. Connections that would result in a WrongPeerId error should not halt the ConcurrentDial future and should instead continue attempting to establish a valid outgoing connection to the peer we are looking for.
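One way the expected behavior could look, again as a std-only sketch rather than a patch: the race is modeled as an iterator of finished dials, and PeerId, DialError, Dial, and first_matching_peer are hypothetical names chosen for illustration. The real ConcurrentDial is future-based, so an actual fix would need to keep polling the remaining dials instead of iterating.

```rust
// Simplified model only; none of these types are the real libp2p ones.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PeerId(u32);

#[derive(Debug, PartialEq, Eq)]
enum DialError {
    /// Every dial either failed or reached a different peer. The
    /// mismatches are kept so the caller can still report them.
    NoMatchingPeer { wrong_peers: Vec<(&'static str, PeerId)> },
}

/// One finished dial: the address, and either the peer that answered
/// or a transport-level failure message.
type Dial = (&'static str, Result<PeerId, &'static str>);

/// `completed` yields dials in the order they finish, standing in for
/// the concurrent race.
fn first_matching_peer(
    expected: PeerId,
    completed: impl IntoIterator<Item = Dial>,
) -> Result<&'static str, DialError> {
    let mut wrong_peers = Vec::new();
    for (addr, res) in completed {
        match res {
            // Only a connection to the expected peer ends the race.
            Ok(peer) if peer == expected => return Ok(addr),
            // A different peer answered: record it and keep going.
            Ok(peer) => wrong_peers.push((addr, peer)),
            // Transport failure: ignored in this sketch, keep going.
            Err(_) => {}
        }
    }
    Err(DialError::NoMatchingPeer { wrong_peers })
}
```

With this shape, a wrong peer winning the race no longer ends the attempt; the mismatch is only reported once every address has been tried without reaching the expected peer.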

Actual behavior

ConcurrentDial races the addresses and resolves its future with the first connection that succeeds, even when that connection is to the wrong peer. The connection attempt then fails with WrongPeerId.

Relevant log output

Possible Solution

No response

Version

No response

Would you like to work on fixing this bug?

Yes
