Skip to content

Conversation

@vdemeester
Copy link
Member

@vdemeester vdemeester commented May 12, 2021

Changes

Since we "dereference" fetched definition and store them into the
status, we can use this as a source of truth. This allows us to
fetch only once the definitions (Task, Pipeline) and has the benefit
to make the runs (PipelineRun, TaskRun) immune to change to what they
refer, while the are executing.

Without this change, if you run a PipelineRun that reference a
Pipeline that is deleted during execution, your PipelineRun will fail
at some point because it cannot fetch the definition anymore, even if
it stored them in its status. This is the case for TaskRun too.

Fixes #3916

Signed-off-by: Vincent Demeester [email protected]

/kind bug

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Docs included if any changes are user facing
  • Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including
    functionality, content, code)
  • Release notes block below has been filled in or deleted (only if no user facing changes)

Release Notes

Only fetch the definition once, and then used the spec stored in the status as source of truth. 
This reduce the probable race condition when a `PipelineRun` or a `TaskRun` refers to a `Pipeline` or `Task` that changes during its execution.

@tekton-robot tekton-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. labels May 12, 2021
@tekton-robot tekton-robot requested review from dibyom and jerop May 12, 2021 11:03
@tekton-robot tekton-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 12, 2021
@vdemeester
Copy link
Member Author

This probably needs some more discussions (in the API WG or somewhere) and some e2e tests.

@vdemeester
Copy link
Member Author

/cc @tektoncd/core-maintainers

@tekton-robot tekton-robot requested a review from a team May 12, 2021 11:05
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/pipelinerun/resources/pipelineref.go 66.7% 63.0% -3.7
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 92.1% 91.8% -0.3
pkg/reconciler/taskrun/resources/taskref.go 75.8% 62.5% -13.3

@vdemeester
Copy link
Member Author

One thing to discuss here too is related to this comment : should this be behind a configuration ? (and if yes, what should be the default behavior)

@ghost
Copy link

ghost commented May 12, 2021

This looks good to me! Agree it needs testing but otherwise I'm happy to see this move forward.

I don't think this specific PR should be behind a flag. From my pov this behaviour should simply be the way it works: if the specs are cached then we use them. A configuration option to turn off spec caching is then a separate feature request.

@vdemeester vdemeester force-pushed the 3916-fetch-only-when-needed branch from 2d7a276 to ac6ee04 Compare May 17, 2021 14:45
@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 17, 2021
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/pipelinerun/resources/pipelineref.go 66.7% 63.0% -3.7
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 92.1% 91.8% -0.3

@vdemeester vdemeester changed the title wip: Only fetch the definitions once 🧙 Only fetch the definitions once 🧙 May 17, 2021
@tekton-robot tekton-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 17, 2021
@vdemeester vdemeester force-pushed the 3916-fetch-only-when-needed branch from ac6ee04 to 8f9bce9 Compare May 17, 2021 15:20
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/pipelinerun/resources/pipelineref.go 66.7% 63.0% -3.7
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 92.1% 91.8% -0.3
pkg/reconciler/taskrun/resources/taskref.go 75.8% 61.0% -14.8

@ghost
Copy link

ghost commented May 17, 2021

/test pull-tekton-pipeline-integration-tests

@ghost
Copy link

ghost commented May 17, 2021

/test pull-tekton-pipeline-alpha-integration-tests


// TestPipelineRunRefDeleted tests that a running PipelineRun doesn't fail when the Pipeline
// it references is deleted.
func TestPipelineRunRefDeletede(t *testing.T) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Deletede -> Deleted

@tekton-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbwsg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 18, 2021
@pritidesai pritidesai added this to the Pipelines v0.25 milestone Jun 2, 2021
Spec: *pipelineRun.Status.PipelineSpec,
}, nil
}, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we please add a unit test for this? 🥺

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this function is implemented to fetch the definition from the cluster or bundle reference but we are checking if the pipeline is already running i.e. status is not nil, return the stored specifications irrespective of bundle or no bundle.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 😉
Yes, the pipeline doesn't really have to run though, just the pipelineSpec in status being populated

@pritidesai
Copy link
Member

pritidesai commented Jun 3, 2021

thanks @vdemeester the changes look great and if I can recollect the specifications are stored in the status after resolving params so they are ready to use as is after fetching. One request, it will be great to see a unit test for changes in pipelineRef.go

Also, we might have to apply similar changes to run in addition to taskRun, not necessarily in this PR, can be a separate effort.

Edit: Or rather the custom controller is responsible for the implementation 🤔

@vdemeester vdemeester force-pushed the 3916-fetch-only-when-needed branch from 8f9bce9 to 016c1f6 Compare June 4, 2021 12:24
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/pipelinerun/resources/pipelineref.go 66.7% 70.4% 3.7
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 92.2% 91.9% -0.3
pkg/reconciler/taskrun/resources/taskref.go 75.8% 61.0% -14.8

@vdemeester vdemeester force-pushed the 3916-fetch-only-when-needed branch from 016c1f6 to 0733392 Compare June 4, 2021 12:47
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/pipelinerun/resources/pipelineref.go 66.7% 70.4% 3.7
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 92.2% 91.9% -0.3
pkg/reconciler/taskrun/resources/taskref.go 75.8% 68.3% -7.5

Since we "dereference" fetched definition and store them into the
`status`, we can use this as a source of truth. This allows us to
fetch only once the definitions (Task, Pipeline) *and* has the benefit
to make the runs (PipelineRun, TaskRun) immune to change to what they
refer, while the are executing.

Without this change, if you run a PipelineRun that reference a
Pipeline that is deleted during execution, your PipelineRun will fail
at some point because it cannot fetch the definition anymore, even if
it stored them in its status. This is the case for TaskRun too.

Signed-off-by: Vincent Demeester <[email protected]>
@vdemeester vdemeester force-pushed the 3916-fetch-only-when-needed branch from 0733392 to 750cb2e Compare June 4, 2021 12:52
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/pipelinerun/resources/pipelineref.go 66.7% 70.4% 3.7
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 92.2% 91.9% -0.3
pkg/reconciler/taskrun/resources/taskref.go 75.8% 68.3% -7.5

@pritidesai
Copy link
Member

thanks @vdemeester

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 4, 2021
@pritidesai
Copy link
Member

/test pull-tekton-pipeline-integration-tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Run vs Definition "race" and "source of truth": PipelineRun errors out after Pipeline changed

3 participants