Skip to content

tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting #3520

@shaxiaozz

Description

@shaxiaozz

What I encountered

I use simple_httpclient in tornado 6.4.2 to complete asynchronous HTTP requests (max_tries=5, max_time=15), but I found a lot of logs in the log: tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting. There are two situations:

When the traffic is stable and low peak: qps is about 100
max_tries=5, max_time=15 can make my function execute completely, but a lot of such logs will appear, and finally the retry is successful

Backing off my_funcation_async(...) for 1.8s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
Backing off my_funcation_async(...) for 0s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
Backing off my_funcation_async(...) for 0.8s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)

During peak traffic: qps about 1k
max_tries=5, max_time=15 can no longer help me complete the function execution. A large number of such logs appear, and finally the tornado-server api reports a 500 error

tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting
500 POST /api/my-funcation (10.224.25.105) 24068.75ms
[E 250709 23:05:59 _common:120] Giving up my_funcation_async(...) after 4 tries (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
[I 250709 23:05:59 _common:105] Backing off my_funcation_async(...) for 2.8s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
[I 250709 23:05:59 _common:105] Backing off my_funcation_async(...) for 1.6s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/backoff/_async.py", line 151, in retry
    ret = await target(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "my_funcation_async.py", line 203, in my_funcation_async
    response = await http_client0.fetch(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting

My environment

$ python -V
Python 3.11.12

$ pip show tornado
Name: tornado
Version: 6.4.2
Summary: Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
Home-page: http://www.tornadoweb.org/
Author: Facebook
Author-email: [email protected]
License: Apache-2.0
Location: /usr/local/lib/python3.11/site-packages
Requires: 
Required-by: 

My sample code

import base64
import json
import time

import requests
import backoff
from tornado.httpclient import HTTPError

from tornado.httpclient import AsyncHTTPClient, HTTPRequest
from tornado.simple_httpclient import SimpleAsyncHTTPClient

AsyncHTTPClient.configure(
    SimpleAsyncHTTPClient, 
    max_clients=10240,
    defaults=dict(
        connect_timeout=5,
        request_timeout=10,
        headers={
            'Connection': 'keep-alive',
            'Keep-Alive': 'timeout=60, max=1000'
        },
        max_buffer_size=104857600
    )
)

http_client0 = AsyncHTTPClient()


@backoff.on_exception(backoff.expo, HTTPError, max_tries=5, max_time=15)
async def my_funcation_async(name, key):
    global http_client0
    url = 'http://my-api.com/my/api'
    params = {
        'name': base64.b64encode(src).decode(),
        'key': base64.b64encode(key).decode()
    }
    body = json.dumps(params)
    request = HTTPRequest(
        url,
        method='POST',
        headers={'Content-Type': 'application/json'},
        body=body
    )
    response = await http_client0.fetch(request)
    j = json.loads(response.body)
    dst = base64.b64decode(j['dst'])
    print(f"encode_speck128_async src.hex={src.hex()} key.hex={key.hex()} dst.hex={dst.hex()}")
    return dst

My thoughts on investigation

I searched for relevant keywords in the issues list and found these issues:
#2825
#2697

I found that when Tornado appears Timeout while connecting, there are only a few reasons: connect_timeout, DNS timeout, bad DNS resolve, connect error

Since this problem occurs all the time in my environment, and it seems to be related to an error in establishing a TCP connection, I ran a tcpdump to try to see some problems in the data packets.

Note: This data packet was captured during low traffic hours, which means that this situation was completed by retrying later.

Source IP: 10.242.139.152
Destination IP: 192.168.21.104

Source is tornado
Destination is tornado asynchronous request HTTP-Server
Image

I found from the data packet interaction that after completing the TCP three-way handshake, Tornado immediately sent FIN+ACK to the server. I don't understand why Tornado does this. I think this is the problem.

This also explains why from the log, Backing off my_funcation_async(...) for 1.8s (tornado.simple_httpclient.HTTPTimeoutError: Timeout while connecting)
This time does not exceed the connect_timeout=5 I set, but it is Timeout while connecting, and this time is uneven, sometimes 0s, sometimes 0.2s.

I'm curious, what strategy or something did it hit that made Tornado behave like this???

Looking forward to your reply

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions