
Commit b4a0ddb

feat(retry): Add retry logic with exponential backoff for HTTP requests (#20)
* feat(retry): Add retry logic with exponential backoff for HTTP requests

  - Add RetryConfig type with configurable retry parameters
  - Update ClientOptions to include RetryConfig field
  - Implement retry helper functions for error detection and backoff calculation
  - Add executeWithRetry method with exponential backoff logic
  - Integrate retry logic into all HTTP methods (ListModels, ListProviderModels, ListTools, GenerateContent, GenerateContentStream, HealthCheck)
  - Add comprehensive tests covering various retry scenarios
  - Default retry configuration: enabled with 3 max attempts, 2s initial backoff, 30s max backoff, 2x multiplier
  - Retry on status codes: 408, 429, 500, 502, 503, 504
  - Retry on network errors: timeouts, connection failures, DNS errors
  - Respect context cancellation during retries

  Co-authored-by: Eden Reich <[email protected]>

* refactor(retry): Clean up comments

  Signed-off-by: Eden Reich <[email protected]>

* feat(retry): Add configurable status codes and callback mechanism

  - Add RetryableStatusCodes field to RetryConfig for custom status codes
  - Add OnRetry callback for retry event logging and monitoring
  - Maintain backward compatibility with default behavior
  - Update isRetryableStatusCode to use configurable codes
  - Add comprehensive test coverage for new functionality

  Co-authored-by: Eden Reich <[email protected]>

* chore: Remove redundant comments

* chore: Testing the release (#22)

* refactor(tests): Remove unnecessary blank line in TestIsRetryableStatusCode

  Signed-off-by: Eden Reich <[email protected]>

* chore(release): 🔖 1.12.0-rc.1 [skip ci]

  ## [1.12.0-rc.1](v1.11.1...v1.12.0-rc.1) (2025-08-22)

  ### ✨ Features

  * **retry:** Add configurable status codes and callback mechanism ([1d3cefa](1d3cefa))
  * **retry:** Add retry logic with exponential backoff for HTTP requests ([0d1a57a](0d1a57a))

  ### ♻️ Improvements

  * **retry:** Clean up comments ([b5a32db](b5a32db))
  * **tests:** Remove unnecessary blank line in TestIsRetryableStatusCode ([223ab26](223ab26))

  ### 🔧 Miscellaneous

  * Remove redundant comments ([fbbe49f](fbbe49f))

  ---------

  Signed-off-by: Eden Reich <[email protected]>

* refactor(retry): Remove redundant comments in isRetryableStatusCode function

  Signed-off-by: Eden Reich <[email protected]>

* chore: Testing the release (#23)

* fix(headers): Remove redundant comment in TestWithHeaders

  Signed-off-by: Eden Reich <[email protected]>

* refactor(retry): Remove comments from retryable status code tests for clarity

  Signed-off-by: Eden Reich <[email protected]>

* chore(release): 🔖 1.12.0-rc.1 [skip ci]

  ## [1.12.0-rc.1](v1.11.1...v1.12.0-rc.1) (2025-08-22)

  ### ✨ Features

  * **retry:** Add configurable status codes and callback mechanism ([1d3cefa](1d3cefa))
  * **retry:** Add retry logic with exponential backoff for HTTP requests ([0d1a57a](0d1a57a))

  ### ♻️ Improvements

  * **retry:** Clean up comments ([b5a32db](b5a32db))

  ### 🐛 Bug Fixes

  * **headers:** Remove redundant comment in TestWithHeaders ([a6a6cbb](a6a6cbb))

  ### 🔧 Miscellaneous

  * Remove redundant comments ([fbbe49f](fbbe49f))
  * Testing the release ([#22](#22)) ([05b9687](05b9687))

* chore(release): 🔖 1.12.0-rc.1 [skip ci]

  ## [1.12.0-rc.1](v1.11.1...v1.12.0-rc.1) (2025-08-22)

  ### ✨ Features

  * **retry:** Add configurable status codes and callback mechanism ([1d3cefa](1d3cefa))
  * **retry:** Add retry logic with exponential backoff for HTTP requests ([0d1a57a](0d1a57a))

  ### ♻️ Improvements

  * **retry:** Clean up comments ([b5a32db](b5a32db))
  * **retry:** Remove comments from retryable status code tests for clarity ([5948350](5948350))
  * **retry:** Remove redundant comments in isRetryableStatusCode function ([930dc15](930dc15))

  ### 🐛 Bug Fixes

  * **headers:** Remove redundant comment in TestWithHeaders ([468f356](468f356))

  ### 🔧 Miscellaneous

  * **release:** 🔖 1.12.0-rc.1 [skip ci] ([d36f1ab](d36f1ab))
  * Remove redundant comments ([fbbe49f](fbbe49f))
  * Testing the release ([#22](#22)) ([05b9687](05b9687))

* feat: Add Retry Mechanism section to README and implement parseRetryAfter function with tests also for rate-limiting retries

  Signed-off-by: Eden Reich <[email protected]>

* chore(release): 🔖 1.12.0-rc.2 [skip ci]

  ## [1.12.0-rc.2](v1.12.0-rc.1...v1.12.0-rc.2) (2025-08-22)

  ### ✨ Features

  * Add Retry Mechanism section to README and implement parseRetryAfter function with tests also for rate-limiting retries ([ce74bad](ce74bad))

  ---------

  Signed-off-by: Eden Reich <[email protected]>

---------

Signed-off-by: Eden Reich <[email protected]>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Eden Reich <[email protected]>
1 parent 57bd526 commit b4a0ddb

5 files changed: +1127 -42 lines changed

CHANGELOG.md

Lines changed: 65 additions & 0 deletions
@@ -2,6 +2,71 @@

All notable changes to this project will be documented in this file.

## [1.12.0-rc.2](https://github.com/inference-gateway/sdk/compare/v1.12.0-rc.1...v1.12.0-rc.2) (2025-08-22)

### ✨ Features

* Add Retry Mechanism section to README and implement parseRetryAfter function with tests also for rate-limiting retries ([ce74bad](https://github.com/inference-gateway/sdk/commit/ce74badc6a0b04b66e1972157535319b3050fdca))

## [1.12.0-rc.1](https://github.com/inference-gateway/sdk/compare/v1.11.1...v1.12.0-rc.1) (2025-08-22)

### ✨ Features

* **retry:** Add configurable status codes and callback mechanism ([1d3cefa](https://github.com/inference-gateway/sdk/commit/1d3cefa4a8264fea954267cc8abc29c5a90e2f17))
* **retry:** Add retry logic with exponential backoff for HTTP requests ([0d1a57a](https://github.com/inference-gateway/sdk/commit/0d1a57af0b55bcffa4b70f6e790f71707f521960))

### ♻️ Improvements

* **retry:** Clean up comments ([b5a32db](https://github.com/inference-gateway/sdk/commit/b5a32db07aa88e943ed893e0607b8aaba2cf29dc))
* **retry:** Remove comments from retryable status code tests for clarity ([5948350](https://github.com/inference-gateway/sdk/commit/59483508e16dc5e4dcb73ad6ce6aa5a01b73432f))
* **retry:** Remove redundant comments in isRetryableStatusCode function ([930dc15](https://github.com/inference-gateway/sdk/commit/930dc1577df93ab6e2238d969e42157a80c78c0f))

### 🐛 Bug Fixes

* **headers:** Remove redundant comment in TestWithHeaders ([468f356](https://github.com/inference-gateway/sdk/commit/468f35616933900d946c9e2dfa76d47c314aa792))

### 🔧 Miscellaneous

* **release:** 🔖 1.12.0-rc.1 [skip ci] ([d36f1ab](https://github.com/inference-gateway/sdk/commit/d36f1ab08f77789e67db018f825b9ae5965793c8))
* Remove redundant comments ([fbbe49f](https://github.com/inference-gateway/sdk/commit/fbbe49fc51cd74db5f9414655bef0558c5e1b0a4))
* Testing the release ([#22](https://github.com/inference-gateway/sdk/issues/22)) ([05b9687](https://github.com/inference-gateway/sdk/commit/05b9687049680fafeeb95d153b2ebc9b7e2be055))

## [1.12.0-rc.1](https://github.com/inference-gateway/sdk/compare/v1.11.1...v1.12.0-rc.1) (2025-08-22)

### ✨ Features

* **retry:** Add configurable status codes and callback mechanism ([1d3cefa](https://github.com/inference-gateway/sdk/commit/1d3cefa4a8264fea954267cc8abc29c5a90e2f17))
* **retry:** Add retry logic with exponential backoff for HTTP requests ([0d1a57a](https://github.com/inference-gateway/sdk/commit/0d1a57af0b55bcffa4b70f6e790f71707f521960))

### ♻️ Improvements

* **retry:** Clean up comments ([b5a32db](https://github.com/inference-gateway/sdk/commit/b5a32db07aa88e943ed893e0607b8aaba2cf29dc))

### 🐛 Bug Fixes

* **headers:** Remove redundant comment in TestWithHeaders ([a6a6cbb](https://github.com/inference-gateway/sdk/commit/a6a6cbbc8f69eb6f5cfe8b1d2fcc1e33527dc9e9))

### 🔧 Miscellaneous

* Remove redundant comments ([fbbe49f](https://github.com/inference-gateway/sdk/commit/fbbe49fc51cd74db5f9414655bef0558c5e1b0a4))
* Testing the release ([#22](https://github.com/inference-gateway/sdk/issues/22)) ([05b9687](https://github.com/inference-gateway/sdk/commit/05b9687049680fafeeb95d153b2ebc9b7e2be055))

## [1.12.0-rc.1](https://github.com/inference-gateway/sdk/compare/v1.11.1...v1.12.0-rc.1) (2025-08-22)

### ✨ Features

* **retry:** Add configurable status codes and callback mechanism ([1d3cefa](https://github.com/inference-gateway/sdk/commit/1d3cefa4a8264fea954267cc8abc29c5a90e2f17))
* **retry:** Add retry logic with exponential backoff for HTTP requests ([0d1a57a](https://github.com/inference-gateway/sdk/commit/0d1a57af0b55bcffa4b70f6e790f71707f521960))

### ♻️ Improvements

* **retry:** Clean up comments ([b5a32db](https://github.com/inference-gateway/sdk/commit/b5a32db07aa88e943ed893e0607b8aaba2cf29dc))
* **tests:** Remove unnecessary blank line in TestIsRetryableStatusCode ([223ab26](https://github.com/inference-gateway/sdk/commit/223ab2679d3cfdc3080f3f1e4d5588cccf573bf9))

### 🔧 Miscellaneous

* Remove redundant comments ([fbbe49f](https://github.com/inference-gateway/sdk/commit/fbbe49fc51cd74db5f9414655bef0558c5e1b0a4))

## [1.11.1](https://github.com/inference-gateway/sdk/compare/v1.11.0...v1.11.1) (2025-08-20)

### 📚 Documentation

README.md

Lines changed: 84 additions & 0 deletions
@@ -24,6 +24,7 @@ Connect to multiple LLM providers through a unified interface • Stream respons
- [Usage](#usage)
- [Creating a Client](#creating-a-client)
- [Using Custom Headers](#using-custom-headers)
- [Retry Mechanism](#retry-mechanism)
- [Middleware Options](#middleware-options)
- [Listing Models](#listing-models)
- [Listing MCP Tools](#listing-mcp-tools)
@@ -120,6 +121,89 @@ client = client.WithHeaders(map[string]string{
response, err := client.GenerateContent(ctx, provider, model, messages)
```

### Retry Mechanism

The SDK includes a built-in retry mechanism for handling transient failures and network issues. By default, the client will automatically retry requests that fail with retryable status codes.

**Default Retry Configuration:**

```go
client := sdk.NewClient(&sdk.ClientOptions{
	BaseURL: "http://localhost:8080/v1",
	// Default retry configuration is automatically applied
})
```

The default configuration includes:
- **Max Retries:** 3 attempts
- **Timeout:** 30 seconds per request
- **Backoff Strategy:** Exponential backoff with jitter
- **Retryable Status Codes:** 429 (Too Many Requests), 500 (Internal Server Error), 502 (Bad Gateway), 503 (Service Unavailable), 504 (Gateway Timeout)

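The status-code check described in the list above can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the helper name `isRetryable` and the fallback-to-defaults behaviour are assumptions made for the example, and the default list is copied from the README text above (the commit message additionally mentions 408).

```go
package main

import (
	"fmt"
	"net/http"
)

// defaultRetryableStatusCodes mirrors the status codes listed in the README
// section above. It is a local copy for illustration only.
var defaultRetryableStatusCodes = []int{
	http.StatusTooManyRequests,     // 429
	http.StatusInternalServerError, // 500
	http.StatusBadGateway,          // 502
	http.StatusServiceUnavailable,  // 503
	http.StatusGatewayTimeout,      // 504
}

// isRetryable (hypothetical helper) reports whether code is in codes.
// An empty list falls back to the defaults above; that fallback is an
// assumption of this sketch.
func isRetryable(code int, codes []int) bool {
	if len(codes) == 0 {
		codes = defaultRetryableStatusCodes
	}
	for _, c := range codes {
		if c == code {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isRetryable(http.StatusServiceUnavailable, nil)) // default list
	fmt.Println(isRetryable(http.StatusNotFound, nil))           // not in default list
	fmt.Println(isRetryable(http.StatusNotFound, []int{404}))    // custom list
}
```

Keeping the check as a plain lookup over a slice makes it straightforward to swap in a caller-supplied list of status codes.
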
**Custom Retry Configuration:**

You can customize the retry behavior by providing your own retry options:

```go
client := sdk.NewClient(&sdk.ClientOptions{
	BaseURL: "http://localhost:8080/v1",
	RetryOptions: &sdk.RetryOptions{
		MaxRetries:           5,                                 // Maximum number of retry attempts
		Timeout:              time.Duration(60) * time.Second,   // Timeout per request
		MinDelay:             time.Duration(1) * time.Second,    // Minimum delay between retries
		MaxDelay:             time.Duration(30) * time.Second,   // Maximum delay between retries
		RetryableStatusCodes: []int{429, 500, 502, 503, 504},    // HTTP status codes to retry
	},
})
```

**Exponential Backoff with Jitter:**

The retry mechanism uses exponential backoff with jitter to prevent thundering herd problems. The delay between retries is calculated as:

1. Base delay starts at `MinDelay` and doubles with each retry
2. Capped at `MaxDelay` to prevent excessive waiting
3. Random jitter (±25%) is added to spread out retry attempts

Example delay sequence (with 1s MinDelay, 30s MaxDelay):
- 1st retry: ~1s (0.75s - 1.25s with jitter)
- 2nd retry: ~2s (1.5s - 2.5s with jitter)
- 3rd retry: ~4s (3s - 5s with jitter)
- 4th retry: ~8s (6s - 10s with jitter)
- 5th retry: ~16s (12s - 20s with jitter)

**Disabling Retries:**

To disable automatic retries, set `MaxRetries` to 0:

```go
client := sdk.NewClient(&sdk.ClientOptions{
	BaseURL: "http://localhost:8080/v1",
	RetryOptions: &sdk.RetryOptions{
		MaxRetries: 0, // Disables retries
	},
})
```

**Rate Limiting (429 Status):**

When the server returns a 429 (Too Many Requests) status code, the SDK will:
1. Check for a `Retry-After` header
2. If present, wait for the specified duration before retrying
3. If not present, use the standard exponential backoff strategy

**Context and Cancellation:**

Retries respect the context passed to API methods. If the context is cancelled or times out, retries will stop immediately:

```go
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
defer cancel()

// Retries will stop if the context times out
response, err := client.GenerateContent(ctx, provider, model, messages)
```

### Middleware Options

The Inference Gateway supports various middleware layers (MCP tools, A2A agents) that can be bypassed for direct provider access. The SDK provides `WithMiddlewareOptions` to control middleware behavior:
