metric tekton_pipelines_controller_pipelinerun_duration_seconds suddenly stops reporting #7902

@gerrnot

Description

Expected Behavior

We expect the metric tekton_pipelines_controller_pipelinerun_duration_seconds to report a value for every single PipelineRun on every single scrape request, for as long as that PipelineRun exists in k8s, when the lastvalue setting is used (see the config provided at the end of this post).
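
For clarity, here is a sketch of what we mean; the label names and values are made up for illustration and assume a port-forward to the controller's metrics port (see the commands under Actual Behavior):

    # With duration-type "lastvalue", every PipelineRun that still exists in the cluster
    # should keep its own gauge series on every single scrape, roughly like this
    # (label set is illustrative and may differ between Tekton Pipelines versions):
    curl -s http://localhost:9090/metrics | grep '^tekton_pipelines_controller_pipelinerun_duration_seconds'
    # tekton_pipelines_controller_pipelinerun_duration_seconds{namespace="ci",pipeline="build",pipelinerun="build-run-1",status="success"} 312.4
    # tekton_pipelines_controller_pipelinerun_duration_seconds{namespace="ci",pipeline="build",pipelinerun="build-run-2",status="success"} 287.9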

Actual Behavior

The values are present in the initial scrapes, but they disappear over time. For example, a PipelineRun started in the morning yields metrics for several hours, but after a certain point in time it yields no more metrics at all (verified by checking the /metrics endpoint of the pipelines-controller, default port 9090).
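
To check this yourself, something along these lines works (namespace and deployment name assume a default installation; adjust if yours differs):

    # Port-forward the pipelines-controller metrics port (default 9090) ...
    kubectl -n tekton-pipelines port-forward deployment/tekton-pipelines-controller 9090:9090 &

    # ... and count how many PipelineRun duration series are currently exposed.
    curl -s http://localhost:9090/metrics \
      | grep -c '^tekton_pipelines_controller_pipelinerun_duration_seconds{'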

A picture says more than a thousand words:
[Screenshot: Prometheus graph of tekton_pipelines_controller_pipelinerun_duration_seconds with a roughly 30-minute window in which no series are reported]

When the metrics are visualized in Prometheus (picture above), you would believe that during the gap in the middle, which lasts around 30 minutes, there was no PipelineRun in the cluster. This is not true! There were plenty; they are simply no longer contained in the metrics output.
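
One way to make the discrepancy visible is to compare the number of series Prometheus currently sees with the number of PipelineRuns that actually exist (PROMETHEUS_URL below is a placeholder for your Prometheus endpoint):

    # How many PipelineRun duration series does Prometheus currently see?
    curl -s "${PROMETHEUS_URL}/api/v1/query" \
      --data-urlencode 'query=count(tekton_pipelines_controller_pipelinerun_duration_seconds)'

    # How many PipelineRun objects actually exist in the cluster right now?
    kubectl get pipelineruns --all-namespaces --no-headers | wc -l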

Steps to Reproduce the Problem

  1. Configure metrics to use the lastvalue setting (like in the example provided at the bottom of this post).
  2. Recommended: also set up Prometheus to scrape the metrics; this makes the issue easier to visualize.
  3. Produce plenty of PipelineRuns throughout the day.
  4. Do some cleanups of PipelineRuns throughout the day, but never go down to zero. We do cleanups like this, but the issue is potentially also reproducible without cleanups.
  5. Find the ends of the time series.
    If you followed step 2, you essentially just need to look at the graph in Prometheus.
    If you find a gap like in the picture above, the issue is reproduced. (Clarification: it looks like a gap, but it is not an actual gap. An actual gap would mean the same time series continues later, which it does not; the series after the gap belong to new PipelineRuns and are therefore new time series.)
    Otherwise (not using Prometheus), the procedure is: for each PipelineRun in k8s, check whether it is also part of the latest scrape (see the sketch after this list).
    If an instance is found that exists in k8s but is not in the metrics output, the problem is reproduced.
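
A rough sketch of that check; the pipelinerun="..." label name is an assumption, so adjust it to whatever label your controller emits, and note the port-forward shown under Actual Behavior:

    # Fetch the latest scrape from the pipelines-controller.
    scrape=$(curl -s http://localhost:9090/metrics)

    # For each PipelineRun that exists in k8s, verify it is part of the latest scrape.
    kubectl get pipelineruns --all-namespaces -o custom-columns=NAME:.metadata.name --no-headers \
      | while read -r pr; do
          if ! printf '%s\n' "$scrape" | grep -q "pipelinerun=\"$pr\""; then
            echo "in k8s, but missing from metrics output: $pr"
          fi
        done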

Additional Info

  • Kubernetes version:

    Output of kubectl version:

    Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-14T09:53:42Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}
    Kustomize Version: v5.0.1
    Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.10", GitCommit:"0fa26aea1d5c21516b0d96fea95a77d8d429912e", GitTreeState:"clean", BuildDate:"2024-01-17T13:38:41Z", GoVersion:"go1.20.13", Compiler:"gc", Platform:"linux/amd64"}

  • Tekton Pipeline version:

    Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'

    Client version: 0.36.0
    Chains version: v0.20.0
    Pipeline version: v0.56.1
    Triggers version: v0.26.1
    Dashboard version: v0.43.1
    Operator version: v0.70.0
  • Config info:
    We used the following tekton operator settings on the pipeline:
    metrics.count.enable-reason: false
    metrics.pipelinerun.duration-type: lastvalue
    metrics.pipelinerun.level: pipelinerun
    metrics.taskrun.duration-type: lastvalue
    metrics.taskrun.level: taskrun
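
    For reference, a sketch of how these settings might be applied and verified; the TektonConfig instance name "config", the field layout under spec.pipeline, and the config-observability ConfigMap location are assumptions based on a default operator installation:

        # Apply the metrics settings through the operator's TektonConfig
        # (assumes the default cluster-scoped instance named "config").
        kubectl patch tektonconfig config --type merge -p '{
          "spec": {
            "pipeline": {
              "metrics.count.enable-reason": false,
              "metrics.pipelinerun.duration-type": "lastvalue",
              "metrics.pipelinerun.level": "pipelinerun",
              "metrics.taskrun.duration-type": "lastvalue",
              "metrics.taskrun.level": "taskrun"
            }
          }
        }'

        # Verify that the settings landed in the pipelines controller's observability config.
        kubectl -n tekton-pipelines get configmap config-observability -o yaml | grep 'metrics\.'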
    

Labels

area/metrics (Issues related to metrics), kind/bug (Categorizes issue or PR as related to a bug)
