
Cannon: Drain websockets in a controlled fashion on SIGTERM or SIGINT #2416


Merged: 11 commits merged from akshaymankar/cannon-drain into develop on May 20, 2022

Conversation

@akshaymankar (Member) commented May 19, 2022

The plan to implement this is described internally here: https://wearezeta.atlassian.net/wiki/spaces/PS/pages/585564424/How+to+gracefully+drain+cannon+but+not+so+slowly

Related PR: https://github.com/zinfra/cailleach/pull/1092
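
For orientation, here is a minimal sketch of the general shape of such a drain (hypothetical names; the real implementation and its pacing are described in the linked plan, not here): on SIGTERM or SIGINT the process does not exit immediately, but instead closes the open websockets in small batches with pauses in between, so that clients reconnect to the remaining cannon instances gradually rather than all at once.

```haskell
-- Sketch only: drainWebsockets, currentWebsockets, and the batch/pause values
-- below are illustrative, not Cannon's actual API or configuration.
module Main (main) where

import Control.Concurrent (threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_)
import System.Posix.Signals (Handler (Catch), installHandler, sigINT, sigTERM)

-- Stand-in for a registered websocket connection.
type Websocket = ()

main :: IO ()
main = do
  shutdownRequested <- newEmptyMVar
  -- On SIGTERM or SIGINT, trigger a controlled drain instead of dying immediately.
  _ <- installHandler sigTERM (Catch (putMVar shutdownRequested ())) Nothing
  _ <- installHandler sigINT (Catch (putMVar shutdownRequested ())) Nothing
  -- ... start the websocket server here ...
  takeMVar shutdownRequested
  drainWebsockets =<< currentWebsockets

-- Close connections in batches, pausing between batches, so that client
-- reconnects are spread out over the drain rather than arriving as one
-- thundering herd.
drainWebsockets :: [Websocket] -> IO ()
drainWebsockets conns =
  forM_ (chunksOf batchSize conns) $ \batch -> do
    mapM_ closeWebsocket batch
    threadDelay pauseMicros
  where
    batchSize = 100       -- illustrative; in practice derived from the number
    pauseMicros = 500000  -- of connections and the time available to drain

closeWebsocket :: Websocket -> IO ()
closeWebsocket _ = pure () -- placeholder: send a close frame to the client

currentWebsockets :: IO [Websocket]
currentWebsockets = pure [] -- placeholder for the real connection registry

chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = take n xs : chunksOf n (drop n xs)
```

The actual pacing and the exact shutdown sequence are described in the linked plan.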

Checklist

  • The PR Title explains the impact of the change.
  • The PR description provides context as to why the change should occur and what the code contributes to that effect. This could also be a link to a JIRA ticket or a Github issue, if there is one.
  • changelog.d contains the following bits of information (details):
    • A file with the changelog entry in one or more suitable sub-sections. The sub-sections are marked by directories inside changelog.d.
    • If new config options introduced: added usage description under docs/reference/config-options.md
    • If new config options introduced: recommended measures to be taken by on-premise instance operators.

@akshaymankar changed the title from "Cannon: Implement draining, not hooked to signals yet." to "Cannon: Implement draining" on May 19, 2022
Do not time out the actual drain; a subsequent SIGKILL will do that anyway.
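
To make that reasoning explicit (continuing the hypothetical sketch above, and assuming the process runs under an orchestrator such as Kubernetes, which sends SIGKILL once the termination grace period expires): the drain itself carries no internal timeout, because a drain that overruns the grace period is cut short by that external SIGKILL anyway.

```haskell
-- Sketch only, reusing the hypothetical drainWebsockets/currentWebsockets
-- from the sketch above. Deliberately no 'System.Timeout.timeout' around the
-- drain: if it overruns the configured grace period, the external SIGKILL
-- ends the process regardless, so an internal timeout would be redundant.
gracefulShutdown :: IO ()
gracefulShutdown = drainWebsockets =<< currentWebsockets
```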
@akshaymankar changed the title from "Cannon: Implement draining" to "Cannon: Drain websockets in a controlled fashion on SIGTERM or SIGINT" on May 19, 2022
@jschaul marked this pull request as ready for review May 20, 2022
@jschaul merged commit 48cc7a6 into develop May 20, 2022
@jschaul deleted the akshaymankar/cannon-drain branch May 20, 2022
jschaul added a commit that referenced this pull request May 25, 2022
By default, incoming network traffic for websockets comes through these network
hops:

Internet -> LoadBalancer -> kube-proxy -> nginx-ingress-controller -> nginz -> cannon

In order to drain websockets gracefully when something gets restarted (as implemented in #2416), and since graceful draining is not easily implemented in nginx-ingress-controller or nginz by themselves, this PR adds a configuration option that changes the network hops to:

Internet -> separate LoadBalancer for cannon only -> kube-proxy -> [nginz->cannon (2 containers in the same pod)]

More context:
https://wearezeta.atlassian.net/wiki/spaces/PS/pages/585564424/How+to+gracefully+drain+cannon+but+not+so+slowly

FUTUREWORK: this introduces some nginz config duplication; refactoring (e.g. by moving charts/{cannon, nginz}/* to charts/wire-server/ in a backwards-compatible way) would allow reducing this duplication.

Co-authored-by: jschaul <[email protected]>