@mattmoor mattmoor commented Aug 31, 2021

It's been bugging me that the `NewController` methods for the `TaskRun` and `PipelineRun` controllers launch a goroutine to harvest metrics, and it occurred to me that there might be a better way.

Borrowing from the CloudEvent client's use of hand-crafted injection logic: https://github.com/tektoncd/pipeline/blob/7297c48d26da98552be4ee3c50d94a130bd8e79d/pkg/reconciler/events/cloudevent/cloudeventclient.go#L29

This change takes a similar approach, creating `pkg/{task,pipeline}runmetrics` packages, which surface their `*Recorder` via `.Get(ctx)` methods. Rather than the controller spinning off a goroutine, this piggybacks on informer injection to "Start()" that process after the dependent informers have been started.

/kind cleanup

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Docs included if any changes are user facing
  • Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including
    functionality, content, code)
  • Release notes block below has been filled in or deleted (only if no user facing changes)

Release Notes

NONE

@tekton-robot tekton-robot added release-note-none Denotes a PR that doesnt merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. labels Aug 31, 2021
@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 31, 2021
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|------|--------------|--------------|-------|
| pkg/pipelinerunmetrics/injection.go | Do not exist | 18.2% | |
| pkg/pipelinerunmetrics/metrics.go | Do not exist | 75.4% | |
| pkg/reconciler/taskrun/controller.go | 95.8% | 95.0% | -0.8 |
| pkg/taskrunmetrics/injection.go | Do not exist | 18.2% | |
| pkg/taskrunmetrics/metrics.go | Do not exist | 81.7% | |

@mattmoor mattmoor force-pushed the metrics-fake-injection branch from 8c5c126 to b774a42 Compare August 31, 2021 23:31
@mattmoor
Member Author

mattmoor commented Sep 1, 2021

/retest

the alpha tests are very flaky 😞

Member

@vdemeester vdemeester left a comment

/cc @khrm as it affects #4201

@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 1, 2021
@dlorenc
Contributor

dlorenc commented Sep 1, 2021

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 1, 2021
@mattmoor
Member Author

mattmoor commented Sep 1, 2021

Hmm, the race detector blew up, gonna take a quick look at where

@mattmoor
Member Author

mattmoor commented Sep 1, 2021

    testing.go:1092: race detected during execution of test
FAIL
FAIL	github.com/tektoncd/pipeline/pkg/reconciler/taskrun	6.991s

I don't see the usual stack trace in the blended `-v` output, so I'm going to try and repro this locally with `-race -count=10` 🤞

@mattmoor
Member Author

mattmoor commented Sep 1, 2021

Nothing at 10, going for 100 while I walk the dog :)

@mattmoor
Member Author

mattmoor commented Sep 1, 2021

Well all I got was:

race: limit on 8128 simultaneously alive goroutines is exceeded, dying

Let's see if prow can repro it.

/retest

@mattmoor
Member Author

mattmoor commented Sep 1, 2021

I guess not.

@tekton-robot tekton-robot merged commit b13fc13 into tektoncd:main Sep 1, 2021
@mattmoor mattmoor deleted the metrics-fake-injection branch September 1, 2021 15:33