* Add Retry Mechanism section to README and implement parseRetryAfter function with tests also for rate-limiting retries ([ce74bad](https://github.com/inference-gateway/sdk/commit/ce74badc6a0b04b66e1972157535319b3050fdca))
* **retry:** Add configurable status codes and callback mechanism ([1d3cefa](https://github.com/inference-gateway/sdk/commit/1d3cefa4a8264fea954267cc8abc29c5a90e2f17))
* **retry:** Add retry logic with exponential backoff for HTTP requests ([0d1a57a](https://github.com/inference-gateway/sdk/commit/0d1a57af0b55bcffa4b70f6e790f71707f521960))

### ♻️ Improvements

* **retry:** Clean up comments ([b5a32db](https://github.com/inference-gateway/sdk/commit/b5a32db07aa88e943ed893e0607b8aaba2cf29dc))
* **retry:** Remove comments from retryable status code tests for clarity ([5948350](https://github.com/inference-gateway/sdk/commit/59483508e16dc5e4dcb73ad6ce6aa5a01b73432f))
* **retry:** Remove redundant comments in isRetryableStatusCode function ([930dc15](https://github.com/inference-gateway/sdk/commit/930dc1577df93ab6e2238d969e42157a80c78c0f))

### 🐛 Bug Fixes

* **headers:** Remove redundant comment in TestWithHeaders ([468f356](https://github.com/inference-gateway/sdk/commit/468f35616933900d946c9e2dfa76d47c314aa792))

* **retry:** Add configurable status codes and callback mechanism ([1d3cefa](https://github.com/inference-gateway/sdk/commit/1d3cefa4a8264fea954267cc8abc29c5a90e2f17))
* **retry:** Add retry logic with exponential backoff for HTTP requests ([0d1a57a](https://github.com/inference-gateway/sdk/commit/0d1a57af0b55bcffa4b70f6e790f71707f521960))

### ♻️ Improvements

* **retry:** Clean up comments ([b5a32db](https://github.com/inference-gateway/sdk/commit/b5a32db07aa88e943ed893e0607b8aaba2cf29dc))

### 🐛 Bug Fixes

* **headers:** Remove redundant comment in TestWithHeaders ([a6a6cbb](https://github.com/inference-gateway/sdk/commit/a6a6cbbc8f69eb6f5cfe8b1d2fcc1e33527dc9e9))

* **retry:** Add configurable status codes and callback mechanism ([1d3cefa](https://github.com/inference-gateway/sdk/commit/1d3cefa4a8264fea954267cc8abc29c5a90e2f17))
* **retry:** Add retry logic with exponential backoff for HTTP requests ([0d1a57a](https://github.com/inference-gateway/sdk/commit/0d1a57af0b55bcffa4b70f6e790f71707f521960))

### ♻️ Improvements

* **retry:** Clean up comments ([b5a32db](https://github.com/inference-gateway/sdk/commit/b5a32db07aa88e943ed893e0607b8aaba2cf29dc))
* **tests:** Remove unnecessary blank line in TestIsRetryableStatusCode ([223ab26](https://github.com/inference-gateway/sdk/commit/223ab2679d3cfdc3080f3f1e4d5588cccf573bf9))

The SDK includes a built-in retry mechanism for handling transient failures and network issues. By default, the client will automatically retry requests that fail with retryable status codes.

**Default Retry Configuration:**

```go
client := sdk.NewClient(&sdk.ClientOptions{
	BaseURL: "http://localhost:8080/v1",
	// Default retry configuration is automatically applied
})
```

The default configuration includes:

- **Max Retries:** 3 attempts
- **Timeout:** 30 seconds per request
- **Backoff Strategy:** Exponential backoff with jitter
- **Retryable Status Codes:** 429 (Too Many Requests), 500 (Internal Server Error), 502 (Bad Gateway), 503 (Service Unavailable), 504 (Gateway Timeout)
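
As a rough illustration of how the retryable status code check could work, here is a minimal sketch. The changelog mentions an `isRetryableStatusCode` function, but its real signature is not shown here, so the signature below and the `defaultRetryableStatusCodes` variable are assumptions made for illustration only.

```go
package main

import "net/http"

// defaultRetryableStatusCodes mirrors the default list documented above.
// The variable name is hypothetical, not taken from the SDK.
var defaultRetryableStatusCodes = []int{
	http.StatusTooManyRequests,     // 429
	http.StatusInternalServerError, // 500
	http.StatusBadGateway,          // 502
	http.StatusServiceUnavailable,  // 503
	http.StatusGatewayTimeout,      // 504
}

// isRetryableStatusCode reports whether code appears in the retryable list.
// Assumed signature; the SDK's own function may differ.
func isRetryableStatusCode(code int, retryable []int) bool {
	for _, c := range retryable {
		if c == code {
			return true
		}
	}
	return false
}
```

Under these assumptions, `isRetryableStatusCode(503, defaultRetryableStatusCodes)` is true, while a client error such as 404 is not retried.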

**Custom Retry Configuration:**

You can customize the retry behavior by providing your own retry options:

```go
client := sdk.NewClient(&sdk.ClientOptions{
	BaseURL: "http://localhost:8080/v1",
	RetryOptions: &sdk.RetryOptions{
		MaxRetries:           5,                               // Maximum number of retry attempts
		Timeout:              60 * time.Second,                // Timeout per request
		MinDelay:             1 * time.Second,                 // Minimum delay between retries
		MaxDelay:             30 * time.Second,                // Maximum delay between retries
		RetryableStatusCodes: []int{429, 500, 502, 503, 504}, // HTTP status codes to retry
	},
})
```

**Exponential Backoff with Jitter:**

The retry mechanism uses exponential backoff with jitter to prevent thundering herd problems. The delay between retries is calculated as:

1. Base delay starts at `MinDelay` and doubles with each retry
2. Capped at `MaxDelay` to prevent excessive waiting
3. Random jitter (±25%) is added to spread out retry attempts

Example delay sequence (with 1s MinDelay, 30s MaxDelay):

- 1st retry: ~1s (0.75s - 1.25s with jitter)
- 2nd retry: ~2s (1.5s - 2.5s with jitter)
- 3rd retry: ~4s (3s - 5s with jitter)
- 4th retry: ~8s (6s - 10s with jitter)
- 5th retry: ~16s (12s - 20s with jitter)
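
The three steps above can be sketched roughly as follows. This is not the SDK's implementation; `baseDelay` and `backoffDelay` are hypothetical names, and the sketch assumes jitter is applied after the cap, in the order the steps are listed.

```go
package main

import (
	"math/rand"
	"time"
)

// baseDelay returns minDelay doubled once per prior retry, capped at maxDelay.
// attempt is zero-based: 0 is the first retry.
func baseDelay(attempt int, minDelay, maxDelay time.Duration) time.Duration {
	delay := minDelay
	for i := 0; i < attempt; i++ {
		if delay > maxDelay/2 {
			// Doubling again would exceed the cap (and could overflow).
			return maxDelay
		}
		delay *= 2
	}
	if delay > maxDelay {
		return maxDelay
	}
	return delay
}

// backoffDelay applies ±25% jitter to the capped base delay by scaling it
// with a random factor in [0.75, 1.25).
func backoffDelay(attempt int, minDelay, maxDelay time.Duration) time.Duration {
	factor := 0.75 + rand.Float64()*0.5
	return time.Duration(float64(baseDelay(attempt, minDelay, maxDelay)) * factor)
}
```

With a 1s `MinDelay` and 30s `MaxDelay`, `baseDelay` yields 1s, 2s, 4s, 8s, 16s for the first five retries, matching the sequence above, and a sixth retry would be capped at 30s before jitter.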

**Disabling Retries:**

To disable automatic retries, set `MaxRetries` to 0:

```go
client := sdk.NewClient(&sdk.ClientOptions{
	BaseURL: "http://localhost:8080/v1",
	RetryOptions: &sdk.RetryOptions{
		MaxRetries: 0, // Disables retries
	},
})
```

**Rate Limiting (429 Status):**

When the server returns a 429 (Too Many Requests) status code, the SDK will:
1. Check for a `Retry-After` header
2. If present, wait for the specified duration before retrying
3. If not present, use the standard exponential backoff strategy
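
To make the `Retry-After` handling concrete, here is a minimal sketch of a parser. The changelog mentions a `parseRetryAfter` function, but its actual signature and behavior are not shown here, so everything below is an assumption. In particular, accepting the HTTP-date form of the header (in addition to a number of seconds) follows the HTTP specification and may not match what the SDK does.

```go
package main

import (
	"net/http"
	"strconv"
	"strings"
	"time"
)

// parseRetryAfter converts a Retry-After header value into a wait duration.
// The boolean is false when the value is missing or unparseable, in which
// case a caller would fall back to exponential backoff.
// Assumed signature; the SDK's own function may differ.
func parseRetryAfter(value string) (time.Duration, bool) {
	value = strings.TrimSpace(value)
	if value == "" {
		return 0, false
	}
	// Form 1: a non-negative number of seconds, e.g. "120".
	if secs, err := strconv.Atoi(value); err == nil {
		if secs < 0 {
			return 0, false
		}
		return time.Duration(secs) * time.Second, true
	}
	// Form 2: an HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
	if when, err := http.ParseTime(value); err == nil {
		if d := time.Until(when); d > 0 {
			return d, true
		}
		return 0, true // The date is already in the past: retry right away.
	}
	return 0, false
}
```

A caller might read the header with `resp.Header.Get("Retry-After")` and use the returned duration only when the boolean is true.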

**Context and Cancellation:**

Retries respect the context passed to API methods. If the context is cancelled or times out, retries will stop immediately.
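
One common way to get this behavior in Go is to wait between attempts with a `select` on the context's `Done` channel. The sketch below illustrates that general pattern only; `retryWithContext` and `exampleCancelledRetry` are hypothetical helpers, not part of the SDK, and the fixed delay stands in for the backoff calculation.

```go
package main

import (
	"context"
	"errors"
	"time"
)

// retryWithContext calls op until it succeeds, maxRetries is exhausted,
// or ctx is done. While waiting between attempts it returns ctx.Err()
// as soon as the context is cancelled or its deadline passes.
func retryWithContext(ctx context.Context, maxRetries int, delay time.Duration, op func() error) error {
	var err error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		if attempt == maxRetries {
			break
		}
		timer := time.NewTimer(delay)
		select {
		case <-ctx.Done():
			timer.Stop()
			return ctx.Err()
		case <-timer.C:
		}
	}
	return err
}

// exampleCancelledRetry runs a failing operation under an already-cancelled
// context and reports how many attempts were made before giving up.
func exampleCancelledRetry() (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	attempts := 0
	err := retryWithContext(ctx, 5, time.Hour, func() error {
		attempts++
		return errors.New("transient failure")
	})
	return attempts, err
}
```

In this sketch the operation runs once, and the cancelled context then ends the wait immediately instead of sleeping for the full delay.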

The Inference Gateway supports various middleware layers (MCP tools, A2A agents) that can be bypassed for direct provider access. The SDK provides `WithMiddlewareOptions` to control middleware behavior: