@risicle risicle commented Mar 1, 2025

This sits on top of #1196, so that needs review first.

This custom worker class attempts to throttle runaway greenthread creation that can happen in a process when resource contention is preventing existing requests from being completed. This runaway thread creation can result in a lot of expensive new per-thread resources being constructed, propagating the runaway.

Rather than allowing creation of up to worker_connections greenthreads at any time, this worker starts with a GreenPool size of initial_worker_connections (1 by default). When all these greenthreads are in use and another connection is waiting to be accepted, it will only expand the size of the GreenPool if at least worker_connections_expansion_cooldown_seconds have elapsed since it last did so (worker_connections_expansion_cooldown_seconds defaults to 0s, so this class has no real effect until explicitly configured with a value).
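The cooldown rule can be sketched as a small standalone helper. This is purely illustrative: the class name, method names, and injectable clock are mine, not the PR's API; only the parameter names initial_worker_connections and worker_connections_expansion_cooldown_seconds come from the description above.

```python
import time

class PoolExpansionGate:
    """Hypothetical sketch of the cooldown rule: the pool may only grow
    if at least `cooldown_seconds` have passed since the last expansion."""

    def __init__(self, initial_size=1, cooldown_seconds=0.0, clock=time.monotonic):
        self.size = initial_size
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        # Backdate the "last expansion" so the first expansion is allowed
        # immediately.
        self._last_expansion = clock() - cooldown_seconds

    def may_expand(self):
        return self._clock() - self._last_expansion >= self.cooldown_seconds

    def expand(self):
        """Grow the pool by one slot if the cooldown permits it."""
        if not self.may_expand():
            return False
        self.size += 1
        self._last_expansion = self._clock()
        return True
```

Note that with the default cooldown_seconds of 0, expand() always succeeds, matching the "no real effect until configured" behaviour described above.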

This is achieved by simply ("simply") modifying the worker's connection-accepting loop to add sections before and after the call to sock.accept(). Beforehand, it waits until either an existing "slot" in the GreenPool has become available for use, or worker_connections_expansion_cooldown_seconds have elapsed since the last expansion. Once a new connection has been accepted (possibly several seconds later - there's no certainty we even have another connection waiting yet), we re-check whether we would still need to expand the GreenPool to handle it, since an existing greenthread slot could have become available while we were waiting for a new connection. If we do, we expand before passing the request handling over to the GreenPool's pool.spawn(...), which should in all circumstances now have at least one empty greenthread slot to use for it.
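The shape of one iteration of that modified loop can be sketched as follows. This is a heavily simplified stand-in, not the PR's code: a deque stands in for the listening socket, `busy` stands in for the count of greenthreads in use, and plain time.sleep stands in for an eventlet-aware wait. Only the before-accept wait and after-accept re-check are the point.

```python
import time
from collections import deque

def throttled_serve_iteration(pending, pool_size, busy, cooldown,
                              last_expansion, clock=time.monotonic):
    """One iteration of the modified accept loop (illustrative names).
    Returns the possibly-updated (pool_size, last_expansion)."""
    # Before accept: wait until a slot is free OR the cooldown has expired.
    while busy >= pool_size and clock() - last_expansion < cooldown:
        time.sleep(0.01)  # the real worker would yield via eventlet here
    conn = pending.popleft()  # stands in for sock.accept(), which may block
    # After accept: re-check, since a slot may have freed while we waited.
    if busy >= pool_size:
        pool_size += 1  # expand so the spawn below cannot block
        last_expansion = clock()
    # pool.spawn(handle, conn) would go here
    return pool_size, last_expansion
```

The re-check after accept is what prevents needless expansion when an in-flight request finishes during the wait.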

Keep in mind that we're accepting connections from a shared socket, so during this cooldown period the connection can happily be picked up by another process that hopefully already has a spare greenthread slot.

There's a bonus feature here, exposed by worker_connections_expansion_min_wait_seconds, which ensures all thread pool expansions only happen after a small wait, no matter how long it has been since the previous expansion. The idea is that this could be used to bias new connections towards being picked up by processes that don't need to expand their GreenPool to handle them, but I'm not really expecting to play with this knob in practice and will probably just leave it at 0s.

Oh, and this also emits a nice log message when expanding the thread pool to give us better observability over its behaviour.

The greatest ugliness comes from the need to monkeypatch gunicorn's geventlet._eventlet_serve module-level function, because that's the only way upstream exposes it. This might be a problem if someone were to try to run two different types of gunicorn worker at the same time, but I don't think that's even possible. Separately, I might propose an upstream patch that pulls _eventlet_serve in-class to make this less nasty to override.

Also included in this PR is a config variable that allows this worker class's sibling ContextRecyclingEventletWorker to have its behaviour enabled/disabled, so we can trivially use both classes together via NotifyEventletWorker and only "enable" the features we want.

@risicle force-pushed the ris-gunicorn-greenthread-creation-throttle branch 4 times, most recently from e2a62a0 to 2fd07ef on March 5, 2025 15:55
@risicle changed the title from "eventlet: add PoolExpansionCooldownEventletWorker" to "gunicorn: add ExpansionCooldownEventletWorker" on Mar 5, 2025
@risicle marked this pull request as ready for review on March 5, 2025 16:47
@risicle force-pushed the ris-gunicorn-greenthread-creation-throttle branch from 2fd07ef to 501ef53 on March 7, 2025 13:28
risicle added 4 commits March 7, 2025 13:28
config variables are pulled directly from env here because adding
custom gunicorn config variables is nontrivial if we're launching
via the `gunicorn` executable rather than handling our own app
launch process
for enabling an app to make use of all our custom gunicorn
worker class features together
@risicle force-pushed the ris-gunicorn-greenthread-creation-throttle branch from 501ef53 to cf99e34 on March 7, 2025 13:30