
Conversation

@bdraco (Member) commented May 23, 2025

This PR adds a new `ClientResponse.discard_content()` method to help middleware properly handle connection reuse when retrying requests. The method reads and discards the response body, allowing the connection to be released back to the pool for reuse.

When client middleware performs retries based on response status codes, connections may not be reused if the response body hasn't been consumed. This is because aiohttp only releases connections back to the pool after:

  1. The response body is fully read (EOF reached)
  2. The response context manager exits
  3. `response.release()` is explicitly called

This behavior caused flaky tests where middleware retries would create new connections instead of reusing existing ones.

The new `discard_content()` method provides a clean API for middleware to consume response bodies without processing them:

```python
async def discard_content(self) -> None:
    """Read and discard the response body to allow connection reuse."""
    if self._released:
        return

    # Read and discard content until EOF
    while True:
        chunk = await self.content.readany()
        if not chunk:
            break
```
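The read-and-discard loop can be exercised in isolation; in this sketch, `FakeStream` is a hypothetical stand-in for `response.content`, not an aiohttp class:

```python
import asyncio


class FakeStream:
    """Stand-in for response.content: yields queued chunks, then b"" at EOF."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    async def readany(self):
        return self._chunks.pop(0) if self._chunks else b""


async def discard_content(stream):
    """Read and discard until EOF, mirroring the loop above."""
    while True:
        chunk = await stream.readany()
        if not chunk:
            break


stream = FakeStream([b"part1", b"part2"])
asyncio.run(discard_content(stream))
# The stream is now fully drained; on a real response, the connection
# could be released back to the pool at this point.
```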

**When `discard_content()` is needed:**

- Only required if the response has a body that you're not going to read
- Most small responses (where headers and body arrive in the same packet) will automatically release connections
- Empty responses (like 204 No Content) automatically release connections without any action needed

**When to use it:**

- In retry middleware that checks status codes before deciding to retry
- When you only need response headers and want to ensure connection reuse
- To be 100% certain the connection gets a chance to be reused (though reuse is never guaranteed due to other factors like connection limits, timeouts, or server behavior)
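As a sketch of the retry pattern, with `FakeResponse` and `request_with_retry` as illustrative stand-ins rather than aiohttp APIs:

```python
import asyncio


class FakeResponse:
    """Stand-in for ClientResponse: a status plus a discardable body."""

    def __init__(self, status, chunks):
        self.status = status
        self._chunks = list(chunks)
        self.discarded = False

    async def discard_content(self):
        # Read-and-drop; on a real response this lets the connection go
        # back to the pool instead of forcing a new one on retry.
        self._chunks.clear()
        self.discarded = True


async def request_with_retry(send, max_attempts=3):
    """Retry on 5xx, discarding the body before every retry."""
    for attempt in range(max_attempts):
        resp = await send()
        if resp.status < 500 or attempt == max_attempts - 1:
            return resp
        await resp.discard_content()


r503 = FakeResponse(503, [b"busy"])
r200 = FakeResponse(200, [b"ok"])
queue = [r503, r200]


async def send():
    return queue.pop(0)


final = asyncio.run(request_with_retry(queue_send := send))
```

Only the failed response gets drained; the final response is returned with its body intact for the caller to read.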
1. **Added `ClientResponse.discard_content()` method** - Efficiently reads and discards response body using `readany()`

2. **Updated all client middleware to use `discard_content()`:**
   - `aiohttp/client_middleware_digest_auth.py` - Digest auth middleware
   - `examples/retry_middleware.py` - Retry middleware example
   - `examples/combined_middleware.py` - Combined middleware example
   - `examples/token_refresh_middleware.py` - Token refresh example
   - Documentation examples in `docs/client_middleware_cookbook.rst` and `docs/client_advanced.rst`

3. **Added comprehensive tests:**
   - Test for responses with bodies (requires calling `discard_content()`)
   - Test for empty responses (204 No Content - automatic connection release)
   - Both tests verify only 1 connection is created during retries

4. **Added documentation** explaining when and why to use `discard_content()`

The fix resolves the flaky test `test_client_middleware_retry_reuses_connection` that was intermittently failing due to connections not being reused during middleware retries.

@bdraco bdraco added this to the 3.12 milestone May 23, 2025
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label May 23, 2025

codspeed-hq bot commented May 23, 2025

CodSpeed Performance Report

Merging #10976 will not alter performance

Comparing discard_content (6d4de2d) with master (48f5324)

Summary

✅ 59 untouched benchmarks


codecov bot commented May 23, 2025

Codecov Report

Attention: Patch coverage is 81.81818% with 16 lines in your changes missing coverage. Please review.

Project coverage is 98.70%. Comparing base (1878509) to head (021c202).
Report is 4 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| `aiohttp/client_reqrep.py` | 48.38% | 12 Missing and 4 partials ⚠️ |

Additional details and impacted files

```text
@@            Coverage Diff             @@
##           master   #10976      +/-   ##
==========================================
+ Coverage   98.10%   98.70%   +0.60%
==========================================
  Files         129      129
  Lines       39541    39635      +94
  Branches     2185     2195      +10
==========================================
+ Hits        38792    39123     +331
+ Misses        591      357     -234
+ Partials      158      155       -3
```

| Flag | Coverage | Δ |
|---|---|---|
| CI-GHA | 98.59% <80.68%> | +0.59% ⬆️ |
| OS-Linux | 98.29% <80.68%> | +0.60% ⬆️ |
| OS-Windows | 96.49% <80.68%> | +0.49% ⬆️ |
| OS-macOS | 97.47% <80.68%> | -0.05% ⬇️ |
| Py-3.10.11 | 97.36% <79.54%> | +0.25% ⬆️ |
| Py-3.10.17 | 97.72% <79.54%> | +0.41% ⬆️ |
| Py-3.11.12 | 97.97% <79.54%> | ? |
| Py-3.11.9 | 97.44% <79.54%> | +0.25% ⬆️ |
| Py-3.12.10 | 97.74% <79.54%> | +0.26% ⬆️ |
| Py-3.13.3 | 98.37% <79.54%> | +1.06% ⬆️ |
| Py-3.9.13 | 97.24% <79.31%> | -0.04% ⬇️ |
| Py-3.9.22 | 97.76% <79.31%> | +0.62% ⬆️ |
| VM-macos | 97.47% <80.68%> | -0.05% ⬇️ |
| VM-ubuntu | 98.29% <80.68%> | +0.60% ⬆️ |
| VM-windows | 96.49% <80.68%> | +0.49% ⬆️ |

Flags with carried forward coverage won't be shown.


Comment on lines 153 to 154
```python
# Consume response body before retrying
await response.discard_content()
```
Member:

We can't do it here, as this middleware knows nothing about the request. If it's a streaming response, this is just an infinite loop.
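To make that failure mode concrete, here is a stub (not aiohttp code) where the drain loop never terminates on a streaming body and only an external timeout stops it:

```python
import asyncio


class EndlessStream:
    """Stand-in for a streaming body that never reaches EOF."""

    async def readany(self):
        await asyncio.sleep(0)  # yield to the event loop, then keep producing
        return b"chunk"


async def discard_content(stream):
    while True:
        chunk = await stream.readany()
        if not chunk:
            break


async def main():
    try:
        # Without an external guard, the drain loop spins forever on a
        # streaming response; the cap here exists only to demonstrate that.
        await asyncio.wait_for(discard_content(EndlessStream()), timeout=0.05)
        return "finished"
    except asyncio.TimeoutError:
        return "timed out"


outcome = asyncio.run(main())
```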

Member Author (@bdraco):

I don't think we can do it at all then as it gets very complex to figure out.

Member Author (@bdraco):

I guess we can make `discard_content()` only discard content if the response has a Content-Length.

Member Author (@bdraco):

Maybe I should put the warning back and tell users that connections may not be reused if there is a response body, and that it's advisable to avoid using middleware in that case.

Member Author (@bdraco):

I'm not really sure how to handle this case.

Member:

> I don't think we can do it at all then as it gets very complex to figure out.

Yeah, I think adding this code individually into the middlewares where it makes sense may be better, rather than making it a public feature. I suspect the digest auth middleware is fine to do this in, but most or all of the retry middlewares are probably not.

Member:

> I guess we can make discard_content only discard content if it has a content length.

That still has the issue of how much content we should allow. If the body has a length of 100 GB, it's probably going to be a lot better to close the connection.
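One possible shape of the guard being discussed, sketched with stub objects; the `MAX_DISCARD_BYTES` cutoff and all class names here are illustrative, not part of aiohttp:

```python
import asyncio

MAX_DISCARD_BYTES = 4096  # illustrative cutoff; above this, closing wins


class FakeContent:
    def __init__(self, chunks):
        self._chunks = list(chunks)

    async def readany(self):
        return self._chunks.pop(0) if self._chunks else b""


class FakeResponse:
    def __init__(self, content_length, chunks):
        self.content_length = content_length  # None for chunked/streaming
        self.content = FakeContent(chunks)
        self.closed = False

    def close(self):
        self.closed = True


async def discard_or_close(resp):
    """Drain only small, declared bodies; otherwise give up the connection."""
    length = resp.content_length
    if length is not None and length <= MAX_DISCARD_BYTES:
        while True:
            chunk = await resp.content.readany()
            if not chunk:
                return "discarded"
    resp.close()
    return "closed"


small = FakeResponse(10, [b"0123456789"])
huge = FakeResponse(100 * 1024**3, [b"first chunk of many"])
streaming = FakeResponse(None, [b"chunk"])
results = [asyncio.run(discard_or_close(r)) for r in (small, huge, streaming)]
```

Both the oversized and the streaming case fall through to `close()`, trading a reusable connection for bounded work.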

Member Author (@bdraco):

I think we can make `discard_content()` safe... more complexity, but safe.

Comment on lines +270 to +271
```python
# Consume response body before retrying
await response.discard_content()
```
Member:

Probably this one too, though less likely to cause a problem.

Comment on lines 149 to 150
```python
# Discard response content to release connection for reuse
await response.discard_content()
```
Member:

This one is probably unsafe too.

Comment on lines 80 to 81
```python
# Discard response content to release connection for reuse
await response.discard_content()
```
Member:

And this one.

```python
        await self._wait_released()  # Underlying connection released
        return self._body

    async def discard_content(self) -> None:
```
Member:

I'm thinking the potential for using this in the wrong circumstance may mean it's better not to put it in as a public API? Maybe just manually do this in a couple of places that should be optimised for this case?


```python
# Read and discard content with timeout
try:
    async with async_timeout.timeout(timeout):
```
Member:

If timeout is >100ms, then isn't creating a new connection going to be faster in most cases?

Also, this is used in so many different situations, I'm not sure I'd be comfortable with a size limit >4KB; a slow connection could take a long time.

I'm still a bit skeptical on this, as it feels like a minor optimisation that may actually lead to slower performance for many users. It seems too difficult to get it right. E.g. another case could be a misbehaving server which never actually sends the body (or is waiting for a response from the client first or something), so it will always hit the timeout here and never reach the content-length check above.
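A stub illustrating that concern (class names and the timeout budget are illustrative): when the server never sends the body, the drain path always pays the full timeout before giving up, which is why a generous budget can be slower than simply reconnecting:

```python
import asyncio
import time


class StalledStream:
    """Misbehaving server: the body never arrives, every read just waits."""

    async def readany(self):
        await asyncio.sleep(3600)
        return b""


async def drain(stream):
    while True:
        chunk = await stream.readany()
        if not chunk:
            return


async def discard_with_timeout(stream, timeout):
    try:
        await asyncio.wait_for(drain(stream), timeout)
        return "drained, connection reusable"
    except asyncio.TimeoutError:
        return "gave up, connection closed"


start = time.monotonic()
outcome = asyncio.run(discard_with_timeout(StalledStream(), 0.1))
elapsed = time.monotonic() - start
# elapsed is at least the full 0.1s budget: the retry stalls for the
# whole timeout before falling back to closing the connection.
```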

Member Author (@bdraco):

Do you want me to just take it out?

Member Author (@bdraco):

I'm going to abandon this PR.

Member:

I was also thinking overnight, what is the problem we're trying to solve? For something like the auth middleware, my expectation is that a sensible service sending a 401 will have a body <1KB. If the body is that small, then it seems highly likely that it has already been read and the connection released, while it was loading the headers.

If the body is more than that, then it may well be more efficient to just close the connection anyway.

Member Author (@bdraco):

I'm thinking about single-threaded IoT devices that have a high cost of making a new connection. I think in those cases it's still OK, though. The test passes most of the time with a small body. aiohttp sends the headers and body in two separate writes, which isn't the case for the use case I'm worried about, where everything arrives in a single packet for small payloads. It would be nice if we could get aiohttp to behave the same way for small payloads, as I know we have downstream users implementing custom writes to get that to work with some devices.

Member Author (@bdraco):

Thinking about it, it probably makes more sense to try to implement writing in a single syscall for small payloads, as it makes this problem go away and fixes the other compatibility issue above. That's probably far less complex, and a win all around.
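That idea can be sketched with a stub transport (`RecordingTransport` and both helper functions are illustrative, not aiohttp code); note that a single `write()` does not by itself guarantee a single packet on the wire (Nagle, TLS record framing, etc.), but it does remove the extra write:

```python
class RecordingTransport:
    """Stub transport: each write() roughly corresponds to one syscall
    on a plain socket transport."""

    def __init__(self):
        self.writes = []

    def write(self, data):
        self.writes.append(bytes(data))


def send_two_writes(transport, headers, body):
    transport.write(headers)  # current behavior: headers first...
    transport.write(body)     # ...then the body in a second write


def send_coalesced(transport, headers, body, small_limit=1024):
    if len(body) <= small_limit:
        # Small payloads go out in one write, so headers and body can
        # share a single packet for the devices discussed above.
        transport.write(headers + body)
    else:
        transport.write(headers)
        transport.write(body)


headers = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\n"
body = b"ok"
t1, t2 = RecordingTransport(), RecordingTransport()
send_two_writes(t1, headers, body)
send_coalesced(t2, headers, body)
```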

@bdraco bdraco closed this May 23, 2025
@bdraco bdraco deleted the discard_content branch May 23, 2025 18:44
@bdraco bdraco removed this from the 3.12 milestone May 23, 2025