Description
While #79 described a situation that could happen commonly, there are some less likely but still possible race conditions that we might consider guarding against.
Concurrent downscale + rollout
If a downscale is in progress but hasn't updated the last-downscale annotation yet, it's possible that a rollout could start concurrently, bypassing the min-time-between-zones-downscale check in the controller.
Concurrent downscale + downscale
If downscales for two zones occur concurrently, they can both be accepted before either has updated the last-downscale annotation, bypassing the min-time-between-zones-downscale check in the prepare-downscale webhook.
Potential fix
One reliable way to serialize operations across different statefulsets is to lock on a common resource, for example an annotation on a shared custom resource, sending its resourceVersion on updates so that a concurrent writer gets a conflict instead of silently overwriting. Absent that, we could add a lock annotation to a statefulset and use a double check (check, lock, check) when adding it to guard against races, though I'm not positive this is completely safe.
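
To make the resourceVersion idea concrete, here is a minimal sketch in Python of the check-and-record step done as a compare-and-swap on one shared object. It is not the operator's code: `FakeAPIServer`, `Resource`, `Conflict`, `try_begin_downscale` and `race` are hypothetical names, the in-memory store only imitates how an update carrying a stale resourceVersion is rejected, and the annotation value is kept as a plain number for brevity (real annotations are strings).

```python
import threading
from dataclasses import dataclass, field

# Annotation name taken from the issue text; the value format is an assumption.
LAST_DOWNSCALE = "last-downscale"


class Conflict(Exception):
    """Raised when an update carries a stale resource_version."""


@dataclass
class Resource:
    annotations: dict = field(default_factory=dict)
    resource_version: int = 0


class FakeAPIServer:
    """Hypothetical in-memory stand-in for one shared, versioned object."""

    def __init__(self):
        self._mu = threading.Lock()
        self._obj = Resource()

    def get(self):
        # Return a copy so callers cannot mutate the stored object directly.
        with self._mu:
            return Resource(dict(self._obj.annotations), self._obj.resource_version)

    def update(self, obj):
        # Compare-and-swap: accept the write only if the caller read the latest version.
        with self._mu:
            if obj.resource_version != self._obj.resource_version:
                raise Conflict("stale resourceVersion")
            self._obj = Resource(dict(obj.annotations), obj.resource_version + 1)


def try_begin_downscale(api, now, min_gap):
    """Return True only if the caller won the right to downscale at time `now`."""
    obj = api.get()
    last = obj.annotations.get(LAST_DOWNSCALE)
    if last is not None and now - last < min_gap:
        return False  # a downscale was recorded too recently
    obj.annotations[LAST_DOWNSCALE] = now
    try:
        api.update(obj)  # fails if anyone else wrote since our get()
    except Conflict:
        return False  # lost the race; the caller can retry later
    return True


def race(n=2):
    """Start n concurrent attempts at the same instant and collect the outcomes."""
    api = FakeAPIServer()
    barrier = threading.Barrier(n)
    results = []

    def worker():
        barrier.wait()
        results.append(try_begin_downscale(api, now=100.0, min_gap=60.0))

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Whichever way the two attempts in `race()` interleave, only one can succeed: the loser either reads the freshly written annotation and fails the time check, or writes with a stale version and gets a `Conflict`. The same property would cover the rollout case only if the controller goes through the same shared object before starting a rollout, which is an assumption about how the fix would be wired in.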