-
Notifications
You must be signed in to change notification settings - Fork 750
Only enqueue workflows if flytepropeller is leader #6436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Jason Parraga <[email protected]>
Code Review Agent Run Status
|
Signed-off-by: Jason Parraga <[email protected]>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6436 +/- ##
=======================================
Coverage 58.47% 58.48%
=======================================
Files 940 940
Lines 71579 71585 +6
=======================================
+ Hits 41858 41864 +6
Misses 26538 26538
Partials 3183 3183
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Code Review Agent Run #99bf3dActionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
Changelist by BitoThis pull request implements the following key changes.
|
The end2end test is failing but I've tested this in a development environment without issues hmm. |
Signed-off-by: Jason Parraga <[email protected]>
Code Review Agent Run #a53844Actionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
return | ||
} | ||
key := wf.GetK8sWorkflowID() | ||
if !c.leader.Load() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the leader elector is not configured, it seems like c.leader
will never be set to true
, and this part will ignore all workflows and never enqueue them
|
||
<-backgroundCtx.Done() | ||
|
||
logger.Infof(ctx, "Lost leader lease.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to set c.leader
to False
here to prevent it from keep enqueuing workflow while it is not leader anymore?
@machichima this one wasn't ready for review yet since I was trying to work out the build failure. I'll circle back later. |
Tracking issue
Closes #6435
Why are the changes needed?
Without these changes followers/replicas in highly available deployments will enqueue workflows until they saturate the work queue. Once the work queue is saturated these workflows may become very stale. Once a follower does become a leader it will need to churn through these stale workflows that might not even exist anymore.
What changes were proposed in this pull request?
Added a thread safe bool for whether the propeller is leader and check it before enqueueing workflows.
How was this patch tested?
Not yet but will test in our development environment.
Check all the applicable boxes
Summary by Bito
This PR implements leader-only workflow enqueuing in HA flytepropeller deployments using an atomic boolean flag, preventing followers from adding stale workflows to the queue. It enhances system reliability and consistency during leadership transitions, and refines workflow handling. The changes also improve logging of workflow phases for failed executions to facilitate better monitoring and debugging.Unit tests added: False
Estimated effort to review (1-5, lower is better): 1