Description
The tfw_sched_get_srv_conn() call in tfw_http_req_cache_cb() assumes that it is quite improbable for there to be no connection to any of the backends, so if all the backends reset their connections simultaneously, Tempesta returns a 502 response code. However, recent tests show that this state is common and must be handled correctly.
Firstly, Tempesta FW must use the same rescheduling mechanism, with request eviction by timeout, as for rescheduled requests that were previously sent to a server. Secondly, a /proc/tempesta/servers/%group_name%/perfstat counter must be introduced.
Lastly, the current rescheduling mechanism loops, trying a new server/connection for each request to be rescheduled, even though at that point it knows precisely that there are no live connections. Instead, it should wait until a request timer elapses or a new server connection is established. In the first case the request must be deleted. In the second case the connection should be tried, but gracefully, not for all the pending requests at once.
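The intended behaviour could be sketched roughly as below. This is a userspace illustration only, not Tempesta FW code: struct pending_queue, pq_push(), pq_evict_expired(), pq_drain() and the fixed PENDING_MAX capacity are all hypothetical names and choices made for the example. Requests that found no live connection are parked with a deadline; a timer tick evicts the expired ones (and bumps a counter that a perfstat entry could expose), and a newly established connection takes only a bounded batch instead of all pending requests at once.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this is not the Tempesta FW API. */
#define PENDING_MAX 8

struct pending_req {
	int id;                 /* stand-in for a real request object */
	unsigned long deadline; /* absolute time after which it is evicted */
};

struct pending_queue {
	struct pending_req reqs[PENDING_MAX]; /* oldest request first */
	size_t len;
	unsigned long evicted;  /* candidate value for a perfstat counter */
};

/* Park a request that found no live connection. Returns -1 if full. */
int pq_push(struct pending_queue *q, int id, unsigned long deadline)
{
	assert(q != NULL);
	if (q->len == PENDING_MAX)
		return -1;
	q->reqs[q->len].id = id;
	q->reqs[q->len].deadline = deadline;
	q->len++;
	return 0;
}

/* Timer path: drop every request whose deadline has passed. */
size_t pq_evict_expired(struct pending_queue *q, unsigned long now)
{
	size_t kept = 0, dropped;

	assert(q != NULL);
	for (size_t i = 0; i < q->len; i++)
		if (q->reqs[i].deadline > now)
			q->reqs[kept++] = q->reqs[i];
	dropped = q->len - kept;
	q->len = kept;
	q->evicted += dropped;
	return dropped;
}

/*
 * Connection-established path: hand at most `budget` of the oldest
 * requests to the new connection and leave the rest queued.
 * `out_ids` must have room for `budget` entries.
 */
size_t pq_drain(struct pending_queue *q, size_t budget, int *out_ids)
{
	size_t n;

	assert(q != NULL && out_ids != NULL);
	n = q->len < budget ? q->len : budget;
	for (size_t i = 0; i < n; i++)
		out_ids[i] = q->reqs[i].id;
	for (size_t i = n; i < q->len; i++)
		q->reqs[i - n] = q->reqs[i];
	q->len -= n;
	return n;
}
```

In this sketch nothing spins while there are no live connections: work happens only on a timer tick or on a connection event. The real implementation would need locking and would operate on the existing per-connection message queues rather than a fixed array.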
Somewhat linked with #687 since the message queues must be adjusted.
It may make sense to implement #1454 first, to facilitate testing and debugging of the current task.
The problem can probably be solved by dynamically allocating a new server connection if all the current connections are busy (#710) and treating the current static number of connections as a minimal provision.