-
Notifications
You must be signed in to change notification settings - Fork 5.5k
Description
What I encountered
I use simple_httpclient in tornado 6.4.2 to complete asynchronous HTTP requests (max_tries=5, max_time=15), but I found a lot of logs in the log: tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting. There are two situations:
When the traffic is stable and low peak: qps is about 100
max_tries=5, max_time=15 can make my function execute completely, but a lot of such logs will appear, and finally the retry is successful
Backing off my_funcation_async(...) for 1.8s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
Backing off my_funcation_async(...) for 0s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
Backing off my_funcation_async(...) for 0.8s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
During peak traffic: qps about 1k
max_tries=5, max_time=15 can no longer help me complete the function execution. A large number of such logs appear, and finally the tornado-server api reports a 500 error
tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting
500 POST /api/my-funcation (10.224.25.105) 24068.75ms
[E 250709 23:05:59 _common:120] Giving up my_funcation_async(...) after 4 tries (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
[I 250709 23:05:59 _common:105] Backing off my_funcation_async(...) for 2.8s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
[I 250709 23:05:59 _common:105] Backing off my_funcation_async(...) for 1.6s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/backoff/_async.py", line 151, in retry
ret = await target(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "my_funcation_async.py", line 203, in my_funcation_async
response = await http_client0.fetch(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting
My environment
$ python -V
Python 3.11.12
$ pip show tornado
Name: tornado
Version: 6.4.2
Summary: Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
Home-page: http://www.tornadoweb.org/
Author: Facebook
Author-email: [email protected]
License: Apache-2.0
Location: /usr/local/lib/python3.11/site-packages
Requires:
Required-by:
My sample code
import base64
import json
import time
import requests
import backoff
from tornado.httpclient import HTTPError
from tornado.httpclient import AsyncHTTPClient, HTTPRequest
from tornado.simple_httpclient import SimpleAsyncHTTPClient
AsyncHTTPClient.configure(
SimpleAsyncHTTPClient,
max_clients=10240,
defaults=dict(
connect_timeout=5,
request_timeout=10,
headers={
'Connection': 'keep-alive',
'Keep-Alive': 'timeout=60, max=1000'
},
max_buffer_size=104857600
)
)
http_client0 = AsyncHTTPClient()
@backoff.on_exception(backoff.expo, HTTPError, max_tries=5, max_time=15)
async def my_funcation_async(name, key):
global http_client0
url = 'http://my-api.com/my/api'
params = {
'name': base64.b64encode(src).decode(),
'key': base64.b64encode(key).decode()
}
body = json.dumps(params)
request = HTTPRequest(
url,
method='POST',
headers={'Content-Type': 'application/json'},
body=body
)
response = await http_client0.fetch(request)
j = json.loads(response.body)
dst = base64.b64decode(j['dst'])
print(f"encode_speck128_async src.hex={src.hex()} key.hex={key.hex()} dst.hex={dst.hex()}")
return dst
My thoughts on investigation
I searched for relevant keywords in the issues list and found these issues:
#2825
#2697
I found that when Tornado appears Timeout while connecting, there are only a few reasons: connect_timeout, DNS timeout, bad DNS resolve, connect error
Since this problem occurs all the time in my environment, and it seems to be related to an error in establishing a TCP connection, I ran a tcpdump to try to see some problems in the data packets.
Note: This data packet was captured during low traffic hours, which means that this situation was completed by retrying later.
Source IP: 10.242.139.152
Destination IP: 192.168.21.104
Source is tornado
Destination is tornado asynchronous request HTTP-Server
I found from the data packet interaction that after completing the TCP three-way handshake, Tornado immediately sent FIN+ACK to the server. I don't understand why Tornado does this. I think this is the problem.
This also explains why from the log, Backing off my_funcation_async(...) for 1.8s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
This time does not exceed the connect_timeout=5 I set, but it is Timeout while connecting, and this time is uneven, sometimes 0s, sometimes 0.2s.
I'm curious, what strategy or something did it hit that made Tornado behave like this???
Looking forward to your reply