Graceful Termination: Cancelled Task is Failed (even with Retries)
#4136
`Finally Tasks` are executed only when all `Tasks` are _Done_, where _Done_ means a `Task` is _Skipped_, _Succeeded_ or _Failed_. But a `Task` is considered _Failed_ only when its `Retries` are all done.

Today, when a `Task` with `Retries` is `Cancelled`, it's not considered _Failed_ if its `Retries` are not yet done. This is problematic because if a `PipelineRun` is _Gracefully Cancelled_, where it should run `Finally Tasks`, and it has a running `Task` with `Retries`:
- the `Task` with `Retries` will be cancelled, but won't be considered _Failed_ because its `Retries` are not done
- this would block the execution of `Finally Tasks`
- the `PipelineRun` would hang (stay running) until it times out

In this change, we modify `Is_Failure` to treat a `Task` as _Failed_ when either all of its `Retries` are done or the `Reason` for failure is `Cancellation`. With this change, we can unblock execution of the `Finally Tasks` during _Graceful Termination_.

In addition, the documentation notes that "`Retries` - Specifies the number of times to retry the execution of a Task after a failure. _Does not apply to execution cancellations_."

Fixes #4125
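The rule described above can be sketched in a few lines of Go. This is a simplified model and not the code from this PR: the `taskState` type, its fields, the `reasonCancelled` constant and the `isFailure` helper are all hypothetical names chosen for illustration, and the real check operates on Tekton's own resolved `PipelineRun` task types.

```go
package main

import "fmt"

// reasonCancelled is an assumed reason string used only in this sketch.
const reasonCancelled = "Cancelled"

// taskState is a hypothetical, simplified stand-in for the state of a
// Task inside a PipelineRun. It is not a Tekton type.
type taskState struct {
	Failed      bool   // the latest attempt ended unsuccessfully
	Reason      string // reason recorded for that unsuccessful attempt
	RetriesDone int    // number of retries already performed
	Retries     int    // number of retries configured for the Task
}

// isFailure reports whether the Task should count as Failed (and
// therefore Done). A cancelled Task counts as Failed immediately,
// because retries do not apply to cancellations; any other failure
// counts only once all retries are used up.
func isFailure(t taskState) bool {
	if !t.Failed {
		return false
	}
	if t.Reason == reasonCancelled {
		return true
	}
	return t.RetriesDone >= t.Retries
}

func main() {
	cancelled := taskState{Failed: true, Reason: reasonCancelled, RetriesDone: 0, Retries: 3}
	retrying := taskState{Failed: true, Reason: "Failed", RetriesDone: 1, Retries: 3}
	exhausted := taskState{Failed: true, Reason: "Failed", RetriesDone: 3, Retries: 3}

	fmt.Println(isFailure(cancelled)) // true: cancellation ignores remaining retries
	fmt.Println(isFailure(retrying))  // false: retries remain, so the Task is not Done yet
	fmt.Println(isFailure(exhausted)) // true: all retries are used up
}
```

In this model, the cancelled `Task` is reported as failed even though no retries have been used, which is the condition that lets the `Finally Tasks` start during a graceful cancellation instead of waiting for a timeout.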
@tektoncd/core-maintainers added this bug fix to the milestone for 0.27, please take a look :)
      isDone := facts.checkTasksDone(d)
    - if d := cmp.Diff(isDone, tc.expected); d != "" {
    + if d := cmp.Diff(tc.expected, isDone); d != "" {
👍 nice catch
/test pull-tekton-pipeline-integration-tests
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbwsg, vdemeester

The full list of commands accepted by this bot can be found here. The pull request process is described here.
/kind bug