Skip to content

Conversation

QuChen88
Copy link

@QuChen88 QuChen88 commented Mar 27, 2025

This is a revival of the previous PR #54

Previous PR was outdated and requires a rebase on the latest code so it can be cleanly merged

@dormando
Copy link
Collaborator

dormando commented Apr 4, 2025

discussed with @QuChen88 elsewhere - they're going to fix up the change more and re-submit.

@QuChen88 QuChen88 changed the title Exponential backoff and dial retrying. Use more efficient pooling Exponential backoff and dial retrying. Apr 4, 2025
@QuChen88 QuChen88 changed the title Exponential backoff and dial retrying. Exponential backoff and dial retrying Apr 4, 2025
@QuChen88
Copy link
Author

QuChen88 commented Apr 4, 2025

Re-submitted the commit with only the exponential backoff and dial retrying change, and without the connection pooling change.

@dormando
Copy link
Collaborator

dormando commented Apr 4, 2025

@QuChen88 Thanks! Existing tests pass.

Any chance you could add specific tests to verify this feature?

@dormando
Copy link
Collaborator

dormando commented Apr 4, 2025

Almost forgot: #115 - this looks like a similar issue with a different solution, a circuit breaker instead of a manual backoff. @QuChen88 did you see this other issue perhaps and have any thoughts on this?

The change they made pulls in a CB repo and seems more intrusive. I'm not sure offhand which approach is cleanest.

@QuChen88
Copy link
Author

QuChen88 commented Apr 4, 2025

It is not easy to add a reliable test for this as this is mainly about retry logic on connection failure.

For #115, looks like it is related to the PR for custom dialer which was already merged back in 2023 (See #158). And it appears to be trying to solve a similar problem of not aggressively retry when there are issues with the server.

I personally prefer to retry from outside of dial() with backoff logic.

@dormando
Copy link
Collaborator

dormando commented Apr 4, 2025

Can you test against a non-existent server and ensure it can't reconnect instantly? that shouldn't be too hard.

If it is we might have to fix some other things first (looks like it's eating errors in certain conditions)

@@ -71,6 +71,8 @@ const (
// DefaultMaxIdleConns is the default maximum number of idle connections
// kept for any single address.
DefaultMaxIdleConns = 2
// maximum number of times to attempt to reconnect
maxRetries = 10

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @QuChen88 ! Could this setting be exportable/settable as an option by the lib user? Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants