gunicorn: add ExpansionCooldownEventletWorker
#1200
This sits on top of #1196, so that needs review first.

This custom worker class attempts to throttle the runaway greenthread creation that can happen in a process when resource contention is preventing existing requests from being completed. This runaway thread creation can result in a lot of expensive new per-thread resources being constructed, propagating the runaway.
Rather than allowing creation of up to `worker_connections` greenthreads at any time, this worker starts with a `GreenPool` size of `initial_worker_connections` (1 by default). When all of these greenthreads are in use and another connection is waiting to be accepted, it will only expand the size of the `GreenPool` if at least `worker_connections_expansion_cooldown_seconds` have passed since it last did so (the default value of `worker_connections_expansion_cooldown_seconds` is 0s, so this class has no real effect until explicitly configured with a value).

This is achieved by simply ("simply") modifying the worker's connection-accepting loop to add sections before and after the call to `sock.accept()`. Beforehand, it waits either until an existing "slot" in the `GreenPool` has become available for use or until `worker_connections_expansion_cooldown_seconds` have elapsed since the last expansion. Once a new connection has been accepted (possibly several seconds later - there's no certainty we even have another connection waiting yet), we re-check whether we would still need to expand the `GreenPool` to handle this new connection (after all, an existing greenthread slot could have become available while we were waiting for a new connection), and if we do, we do so before handing the request over to the `GreenPool`'s `pool.spawn(...)`, which should in all circumstances now have at least one empty greenthread slot to use for it.

Keep in mind that this is a shared socket we're accepting connections from, so during this cooldown period the connection can happily be picked up by another process that hopefully does already have a spare greenthread slot.
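To make the decision logic concrete, here is a minimal, eventlet-free sketch of the expand-only-after-cooldown behaviour. The class and attribute names (`ExpansionCooldownPool`, `initial_size`, `cooldown_seconds`) are illustrative stand-ins, not the PR's actual code, which wraps eventlet's `GreenPool`:

```python
import time


class ExpansionCooldownPool:
    """Illustrative sketch: a pool that only grows after a cooldown."""

    def __init__(self, initial_size=1, max_size=10, cooldown_seconds=5.0):
        self.size = initial_size
        self.max_size = max_size
        self.cooldown_seconds = cooldown_seconds
        self.in_use = 0
        self._last_expansion = float("-inf")  # never expanded yet

    def slot_available(self):
        return self.in_use < self.size

    def maybe_expand(self, now=None):
        """Grow by one slot, but only if the cooldown has elapsed."""
        now = time.monotonic() if now is None else now
        if self.size >= self.max_size:
            return False
        if now - self._last_expansion < self.cooldown_seconds:
            return False  # still cooling down; leave the connection for others
        self.size += 1
        self._last_expansion = now
        return True

    def accept_connection(self, now=None):
        """Mirror the re-check after accept(): expand only if still needed."""
        if not self.slot_available():
            if not self.maybe_expand(now):
                return False  # caller should wait and retry
        self.in_use += 1
        return True
```

The important property is the re-check in `accept_connection`: expansion happens only if no slot freed up while we were waiting, matching the post-`accept()` check described above.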
There's a bonus feature here exposed by `worker_connections_expansion_min_wait_seconds`, which ensures all thread pool expansions only happen after a small wait, no matter how long it has been since the previous expansion. The idea is that this could be used to bias new connections towards being picked up by processes that don't need to expand their `GreenPool` to handle them, but I'm not really expecting to play with this knob in practice and will probably just leave it at 0s.

Oh, and this also emits a nice log message when expanding the thread pool, to give us better observability over its behaviour.
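As a rough sketch of how the minimum wait combines with the cooldown (this helper is hypothetical, not code from the PR), the wait before an expansion is the larger of the two constraints:

```python
import time


def wait_before_expansion(last_expansion, cooldown_s, min_wait_s, now=None):
    """Hypothetical helper: seconds to wait before expanding the pool.

    Combines the cooldown (time remaining since the last expansion) with
    the unconditional minimum wait applied to every expansion.
    """
    now = time.monotonic() if now is None else now
    cooldown_remaining = max(0.0, cooldown_s - (now - last_expansion))
    # Even when the cooldown has fully elapsed, still wait at least
    # min_wait_s, giving workers with spare slots first shot at the socket.
    return max(cooldown_remaining, min_wait_s)
```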
The greatest ugliness comes from the need to monkeypatch gunicorn's `geventlet._eventlet_serve` module-level function, because that's the only way upstream exposes it. This might be a problem if someone were to try to run two different types of gunicorn worker at the same time, but I don't think that's even possible. Separately, I might propose an upstream patch that pulls `_eventlet_serve` in-class to make this less nasty to override.

Also included in this PR is a config variable that allows this worker class's sibling `ContextRecyclingEventletWorker` to have its behaviour enabled/disabled, so we can trivially use both classes together via `NotifyEventletWorker` and only "enable" the features we want.