Re-bootstrap loop in case of ALL_BROKERS_DOWN #5088

@emasab

In 2.10.0 the client can remove brokers that aren't present in a metadata response and re-bootstrap when it reaches the state where all brokers are down.
When all brokers are down, a loop of re-bootstrap sequences can start: the bootstrap brokers are removed and added again on each re-bootstrap, the learned brokers have > 0 connections, and the connection strategy prefers brokers with no connection.

So the client keeps preferring the bootstrap brokers, which are then removed, causing a new re-bootstrap that adds them again.
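
To make the cycle easier to follow, here is a minimal toy model of it in Python. It is not librdkafka code: the `Broker` class, `select_broker`, `rebootstrap`, `simulate`, and the broker names `learned-1` and `learned-2` are all hypothetical, and the selection rule is simplified to "pick the broker with the fewest connection attempts". It only illustrates why a freshly re-added bootstrap broker always wins the selection while the learned brokers are never retried.

```python
from dataclasses import dataclass


@dataclass
class Broker:
    name: str
    bootstrap: bool
    connect_attempts: int = 0


def select_broker(brokers):
    # Simplified strategy (assumption): prefer the broker with the fewest
    # connection attempts, so a freshly added broker (0 attempts) always wins.
    return min(brokers, key=lambda b: b.connect_attempts)


def rebootstrap(brokers, bootstrap_names):
    # Remove the bootstrap brokers and add them again with fresh counters.
    learned = [b for b in brokers if not b.bootstrap]
    fresh = [Broker(name, bootstrap=True) for name in bootstrap_names]
    return learned + fresh


def simulate(rounds):
    bootstrap_names = ["127.0.0.1:38759/bootstrap"]
    # Learned brokers were connected to before the outage, so attempts > 0.
    brokers = [Broker("learned-1", False, 1), Broker("learned-2", False, 1)]
    brokers = rebootstrap(brokers, bootstrap_names)
    selected = []
    for _ in range(rounds):
        broker = select_broker(brokers)
        broker.connect_attempts += 1  # the connect fails: every broker is down
        selected.append(broker.name)
        # All brokers are down again, so a new re-bootstrap sequence starts.
        brokers = rebootstrap(brokers, bootstrap_names)
    return selected


if __name__ == "__main__":
    print(simulate(5))
```

In this model every round selects the bootstrap broker, because its attempt counter is reset by each re-bootstrap while the learned brokers keep a counter above zero.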
The logs look like:

%7|1747929809.046|CONNECT|0151_purge_brokers_mock#producer-2| [thrd:main]: 127.0.0.1:38759/bootstrap: Selected for cluster connection: no cluster connection (broker has 0 connection attempt(s))
%7|1747929809.046|CONNECT|0151_purge_brokers_mock#producer-2| [thrd:main]: Not selecting any broker for cluster connection: still suppressed for 49ms: periodic broker list refresh
%7|1747929809.046|CONNECT|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Received CONNECT op
%7|1747929809.046|STATE|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Broker changed state INIT -> TRY_CONNECT
%7|1747929809.046|BROADCAST|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: Broadcasting state change
%7|1747929809.046|CONNECT|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: broker in state TRY_CONNECT connecting
%7|1747929809.046|STATE|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Broker changed state TRY_CONNECT -> CONNECT
%7|1747929809.046|BROADCAST|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: Broadcasting state change
%7|1747929809.047|CONNECT|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Connecting to ipv4#127.0.0.1:38759 (plaintext) with socket 15
%7|1747929809.047|FAIL|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Connect to ipv4#127.0.0.1:38759 failed: Connection refused (after 0ms in state CONNECT) (_TRANSPORT)
%3|1747929809.047|FAIL|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Connect to ipv4#127.0.0.1:38759 failed: Connection refused (after 0ms in state CONNECT)
%7|1747929809.047|STATE|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Broker changed state CONNECT -> DOWN
%7|1747929809.047|BROADCAST|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: Broadcasting state change
%7|1747929809.047|BUFQ|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Purging bufq with 0 buffers
%7|1747929809.047|BUFQ|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Purging bufq with 0 buffers
%7|1747929809.047|BUFQ|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Updating 0 buffers on connection reset
%7|1747929809.047|STATE|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: 127.0.0.1:38759/bootstrap: Broker changed state DOWN -> INIT
%7|1747929809.047|BROADCAST|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38759/bootstrap]: Broadcasting state change
%7|1747929809.047|REBOOTSTRAP|0151_purge_brokers_mock#producer-2| [thrd:main]: Starting re-bootstrap sequence

%7|1747929817.973|CONNECT|0151_purge_brokers_mock#producer-2| [thrd:main]: 127.0.0.1:38169/bootstrap: Selected for cluster connection: no cluster connection (broker has 0 connection attempt(s))
%7|1747929817.973|CONNECT|0151_purge_brokers_mock#producer-2| [thrd:main]: Not selecting any broker for cluster connection: still suppressed for 49ms: periodic broker list refresh
%7|1747929817.973|CONNECT|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Received CONNECT op
%7|1747929817.973|STATE|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Broker changed state INIT -> TRY_CONNECT
%7|1747929817.973|BROADCAST|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: Broadcasting state change
%7|1747929817.973|CONNECT|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: broker in state TRY_CONNECT connecting
%7|1747929817.973|STATE|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Broker changed state TRY_CONNECT -> CONNECT
%7|1747929817.973|BROADCAST|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: Broadcasting state change
%7|1747929817.973|CONNECT|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Connecting to ipv4#127.0.0.1:38169 (plaintext) with socket 15
%7|1747929817.973|FAIL|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Connect to ipv4#127.0.0.1:38169 failed: Connection refused (after 0ms in state CONNECT) (_TRANSPORT)
%3|1747929817.973|FAIL|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Connect to ipv4#127.0.0.1:38169 failed: Connection refused (after 0ms in state CONNECT)
%7|1747929817.973|STATE|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Broker changed state CONNECT -> DOWN
%7|1747929817.973|BROADCAST|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: Broadcasting state change
%7|1747929817.973|BUFQ|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Purging bufq with 0 buffers
%7|1747929817.973|BUFQ|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Purging bufq with 0 buffers
%7|1747929817.973|BUFQ|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Updating 0 buffers on connection reset
%7|1747929817.973|STATE|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: 127.0.0.1:38169/bootstrap: Broker changed state DOWN -> INIT
%7|1747929817.973|BROADCAST|0151_purge_brokers_mock#producer-2| [thrd:127.0.0.1:38169/bootstrap]: Broadcasting state change
%7|1747929817.973|REBOOTSTRAP|0151_purge_brokers_mock#producer-2| [thrd:main]: Starting re-bootstrap sequence
