Commit 76c4c0d

[opentelemetry-kube-stack]: Merge both collectors into a single one with leader election (#1735)
* opentelemetry-kube-stack: Merge both collectors into a single one with leader election
* update generated manifests
* update generated manifests
* update generated manifests
* bump operator
* extract leader election config
* extract leader election config
* extract leader election config
* extract leader election config
* extract leader election config
* Support a setup without leader election
* Support a setup without leader election
* remove unneeded code from example

Signed-off-by: Jorge Turrado <[email protected]>
1 parent 225389d commit 76c4c0d


47 files changed: +2322 −533 lines

charts/opentelemetry-kube-stack/Chart.lock

Lines changed: 1 addition & 1 deletion
@@ -15,4 +15,4 @@ dependencies:
   repository: https://prometheus-community.github.io/helm-charts
   version: 4.37.3
 digest: sha256:6bdd281bcc9df8f34dc4d553974a4ceeb9b967e10933648eecbed4f7b32570a1
-generated: "2025-07-14T12:56:09.098151354+02:00"
+generated: "2025-07-15T08:46:49.093988+02:00"

charts/opentelemetry-kube-stack/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 apiVersion: v2
 name: opentelemetry-kube-stack
-version: 0.6.3
+version: 0.7.0
 description: |
   OpenTelemetry Quickstart chart for Kubernetes.
   Installs an operator and collector for an easy way to get started with Kubernetes observability.

charts/opentelemetry-kube-stack/README.md

Lines changed: 14 additions & 9 deletions
@@ -11,15 +11,15 @@ This Helm chart serves as a quickstart for OpenTelemetry in a Kubernetes environ
 
 ## Features
 
-This chart installs the OpenTelemetry Operator and two collector pools with the following features:
-* Daemonset collector
-  * Kubernetes infrastructure metrics
-  * Applications logs
-  * OTLP trace receiver
-  * Kubernetes resource enrichment
-* Standalone collector
-  * Kubernetes events
-  * Cluster metrics
+This chart installs the OpenTelemetry Operator and a daemonset collector pool with the following features:
+* Kubernetes infrastructure metrics
+* Application logs
+* OTLP trace receiver
+* Kubernetes resource enrichment
+* Kubernetes events
+* Cluster metrics
+
+**Note**: This setup requires the leader election extension. If that isn't possible for any reason, the extension can be avoided with [this alternative setup](/charts/opentelemetry-kube-stack/examples/no-leader-election-extension/README.md)
 
 ## Usage
 
@@ -139,6 +139,11 @@ Consult also the [Helm Documentation on CRDs](https://helm.sh/docs/chart_best_pr
 
 _See [helm upgrade](https://helm.sh/docs/helm/helm_upgrade/) for command documentation._
 
+### Upgrade from 0.6.x to 0.7.x
+
+Version 0.7.0 unifies the previous collectors (daemonset and deployment) into a single one. If you are using a custom configuration for the `cluster` collector, you will need to merge it into your `daemon` collector configuration and remove the `collectors.cluster` section from your values file.
+If you are using Helm, the upgrade command is enough to prune the old resources, but GitOps approaches like ArgoCD may require selecting pruning options during the sync process to get rid of the removed resources.
+
 ## Configuration
 
 The following command will show all the configurable options with detailed comments.
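
To make the 0.6.x → 0.7.x migration concrete, the change amounts to a values-file merge. A minimal sketch of the idea; the receiver entries under `config` are purely illustrative placeholders, not the chart's defaults:

```yaml
# Before (0.6.x) -- two collector pools
collectors:
  daemon:
    config:
      receivers:
        filelog: {}        # illustrative
  cluster:
    config:
      receivers:
        k8s_cluster: {}    # illustrative

# After (0.7.x) -- cluster config merged into the daemon pool,
# and the collectors.cluster section removed entirely
collectors:
  daemon:
    config:
      receivers:
        filelog: {}
        k8s_cluster: {}
```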

charts/opentelemetry-kube-stack/examples/cloud-demo/rendered/bridge.yaml

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ kind: OpAMPBridge
 metadata:
   name: example
   labels:
-    helm.sh/chart: opentelemetry-kube-stack-0.6.3
+    helm.sh/chart: opentelemetry-kube-stack-0.7.0
     app.kubernetes.io/version: "0.127.0"
     app.kubernetes.io/managed-by: Helm
     release: "example"

charts/opentelemetry-kube-stack/examples/cloud-demo/rendered/clusterrole.yaml

Lines changed: 12 additions & 15 deletions
@@ -52,6 +52,18 @@ rules:
   - nonResourceURLs: ["/metrics", "/metrics/cadvisor"]
     verbs: ["get"]
 
+  - verbs:
+      - get
+      - list
+      - watch
+      - create
+      - update
+      - patch
+      - delete
+    apiGroups:
+      - coordination.k8s.io
+    resources:
+      - leases
   - apiGroups:
       - ""
     resources:
@@ -135,21 +147,6 @@ rules:
 # Source: opentelemetry-kube-stack/templates/clusterrole.yaml
 apiVersion: rbac.authorization.k8s.io/v1
 kind: ClusterRoleBinding
-metadata:
-  name: example-cluster-stats
-roleRef:
-  apiGroup: rbac.authorization.k8s.io
-  kind: ClusterRole
-  name: example-collector
-subjects:
-  - kind: ServiceAccount
-    # quirk of the Operator
-    name: "example-cluster-stats-collector"
-    namespace: default
----
-# Source: opentelemetry-kube-stack/templates/clusterrole.yaml
-apiVersion: rbac.authorization.k8s.io/v1
-kind: ClusterRoleBinding
 metadata:
   name: example-daemon
 roleRef:
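
The new rule above exists because the leader elector coordinates through Kubernetes `Lease` objects in the `coordination.k8s.io` API group: every collector pod must be able to create and update the lease it campaigns on. For reference, a minimal namespaced Role carrying just that permission might look as follows (a sketch; the Role name is illustrative, and the chart itself grants this through the ClusterRole shown above):

```yaml
# Sketch: standalone lease permissions for leader election.
# The Role name is hypothetical; the chart uses a ClusterRole instead.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: otel-leader-election
  namespace: default
rules:
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```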

charts/opentelemetry-kube-stack/examples/cloud-demo/rendered/collector.yaml

Lines changed: 36 additions & 180 deletions
@@ -2,190 +2,11 @@
 # Source: opentelemetry-kube-stack/templates/collector.yaml
 apiVersion: opentelemetry.io/v1beta1
 kind: OpenTelemetryCollector
-metadata:
-  name: example-cluster-stats
-  namespace: default
-  labels:
-    helm.sh/chart: opentelemetry-kube-stack-0.6.3
-    app.kubernetes.io/version: "0.127.0"
-    app.kubernetes.io/managed-by: Helm
-    release: "example"
-    opentelemetry.io/opamp-reporting: "true"
-spec:
-  managementState: managed
-  mode: deployment
-  config:
-    exporters:
-      debug: {}
-      otlp:
-        endpoint: ingest.example.com:443
-        headers:
-          access-token: ${ACCESS_TOKEN}
-    processors:
-      batch:
-        send_batch_max_size: 1500
-        send_batch_size: 1000
-        timeout: 1s
-      k8sattributes:
-        extract:
-          labels:
-            - from: pod
-              key: app.kubernetes.io/name
-              tag_name: service.name
-            - from: pod
-              key: k8s-app
-              tag_name: service.name
-            - from: pod
-              key: app.kubernetes.io/instance
-              tag_name: k8s.app.instance
-            - from: pod
-              key: app.kubernetes.io/version
-              tag_name: service.version
-            - from: pod
-              key: app.kubernetes.io/component
-              tag_name: k8s.app.component
-          metadata:
-            - k8s.namespace.name
-            - k8s.pod.name
-            - k8s.pod.uid
-            - k8s.node.name
-            - k8s.pod.start_time
-            - k8s.deployment.name
-            - k8s.replicaset.name
-            - k8s.replicaset.uid
-            - k8s.daemonset.name
-            - k8s.daemonset.uid
-            - k8s.job.name
-            - k8s.job.uid
-            - k8s.container.name
-            - k8s.cronjob.name
-            - k8s.statefulset.name
-            - k8s.statefulset.uid
-            - container.image.tag
-            - container.image.name
-            - k8s.cluster.uid
-        passthrough: false
-        pod_association:
-          - sources:
-              - from: resource_attribute
-                name: k8s.pod.uid
-          - sources:
-              - from: resource_attribute
-                name: k8s.pod.name
-              - from: resource_attribute
-                name: k8s.namespace.name
-              - from: resource_attribute
-                name: k8s.node.name
-          - sources:
-              - from: resource_attribute
-                name: k8s.pod.ip
-          - sources:
-              - from: resource_attribute
-                name: k8s.pod.name
-              - from: resource_attribute
-                name: k8s.namespace.name
-          - sources:
-              - from: connection
-      resourcedetection/env:
-        detectors:
-          - env
-        override: false
-        timeout: 2s
-    receivers:
-      k8s_cluster:
-        allocatable_types_to_report:
-          - cpu
-          - memory
-          - storage
-        auth_type: serviceAccount
-        collection_interval: 10s
-        node_conditions_to_report:
-          - Ready
-          - MemoryPressure
-          - DiskPressure
-          - NetworkUnavailable
-      k8sobjects:
-        objects:
-          - exclude_watch_type:
-              - DELETED
-            group: events.k8s.io
-            mode: watch
-            name: events
-    service:
-      pipelines:
-        logs:
-          exporters:
-            - debug
-          processors:
-            - k8sattributes
-            - resourcedetection/env
-            - batch
-          receivers:
-            - k8sobjects
-        metrics:
-          exporters:
-            - debug
-            - otlp
-          processors:
-            - k8sattributes
-            - resourcedetection/env
-            - batch
-          receivers:
-            - k8s_cluster
-  replicas: 1
-  imagePullPolicy: IfNotPresent
-  upgradeStrategy: automatic
-  terminationGracePeriodSeconds: 30
-  resources:
-    limits:
-      cpu: 200m
-      memory: 500Mi
-    requests:
-      cpu: 100m
-      memory: 250Mi
-  securityContext:
-    {}
-  env:
-    - name: OTEL_K8S_NODE_NAME
-      valueFrom:
-        fieldRef:
-          fieldPath: spec.nodeName
-    - name: OTEL_K8S_NODE_IP
-      valueFrom:
-        fieldRef:
-          fieldPath: status.hostIP
-    - name: OTEL_K8S_NAMESPACE
-      valueFrom:
-        fieldRef:
-          apiVersion: v1
-          fieldPath: metadata.namespace
-    - name: OTEL_K8S_POD_NAME
-      valueFrom:
-        fieldRef:
-          apiVersion: v1
-          fieldPath: metadata.name
-    - name: OTEL_K8S_POD_IP
-      valueFrom:
-        fieldRef:
-          apiVersion: v1
-          fieldPath: status.podIP
-    - name: OTEL_RESOURCE_ATTRIBUTES
-      value: "k8s.cluster.name=demo"
-
-    - name: ACCESS_TOKEN
-      valueFrom:
-        secretKeyRef:
-          key: access_token
-          name: otel-collector-secret
----
-# Source: opentelemetry-kube-stack/templates/collector.yaml
-apiVersion: opentelemetry.io/v1beta1
-kind: OpenTelemetryCollector
 metadata:
   name: example-daemon
   namespace: default
   labels:
-    helm.sh/chart: opentelemetry-kube-stack-0.6.3
+    helm.sh/chart: opentelemetry-kube-stack-0.7.0
     app.kubernetes.io/version: "0.127.0"
     app.kubernetes.io/managed-by: Helm
     release: "example"
@@ -200,6 +21,15 @@ spec:
         endpoint: ingest.example.com:443
         headers:
           access-token: ${ACCESS_TOKEN}
+    extensions:
+      k8s_leader_elector/k8s_cluster:
+        auth_type: serviceAccount
+        lease_name: k8s.cluster.receiver.opentelemetry.io
+        lease_namespace: default
+      k8s_leader_elector/k8s_objects:
+        auth_type: serviceAccount
+        lease_name: k8s.objects.receiver.opentelemetry.io
+        lease_namespace: default
     processors:
       batch:
         send_batch_max_size: 1500
@@ -341,6 +171,27 @@ spec:
               system.memory.utilization:
                 enabled: true
           network: {}
+      k8s_cluster:
+        allocatable_types_to_report:
+          - cpu
+          - memory
+          - storage
+        auth_type: serviceAccount
+        collection_interval: 10s
+        k8s_leader_elector: k8s_leader_elector/k8s_cluster
+        node_conditions_to_report:
+          - Ready
+          - MemoryPressure
+          - DiskPressure
+          - NetworkUnavailable
+      k8sobjects:
+        k8s_leader_elector: k8s_leader_elector/k8s_objects
+        objects:
+          - exclude_watch_type:
+              - DELETED
+            group: events.k8s.io
+            mode: watch
+            name: events
       kubeletstats:
         auth_type: serviceAccount
         collection_interval: 15s
@@ -548,6 +399,9 @@ spec:
         ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
         insecure_skip_verify: true
     service:
+      extensions:
+        - k8s_leader_elector/k8s_objects
+        - k8s_leader_elector/k8s_cluster
       pipelines:
         logs:
           exporters:
@@ -560,6 +414,7 @@ spec:
           receivers:
             - otlp
            - filelog
+            - k8sobjects
         metrics:
           exporters:
             - debug
@@ -573,6 +428,7 @@ spec:
            - otlp
            - hostmetrics
            - kubeletstats
+            - k8s_cluster
         traces:
           exporters:
             - debug
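
Condensed from the rendered manifest above, the leader-election wiring has three parts: declare a `k8s_leader_elector` extension per cluster-scoped receiver, point each receiver at its extension, and enable the extensions under `service`. Stripped down to one receiver (all values copied from the diff above):

```yaml
extensions:
  k8s_leader_elector/k8s_cluster:
    auth_type: serviceAccount
    lease_name: k8s.cluster.receiver.opentelemetry.io
    lease_namespace: default
receivers:
  k8s_cluster:
    # Only the pod currently holding the lease runs this receiver
    k8s_leader_elector: k8s_leader_elector/k8s_cluster
service:
  extensions:
    - k8s_leader_elector/k8s_cluster
  pipelines:
    metrics:
      receivers:
        - k8s_cluster
```

Every daemonset pod campaigns for the lease, but only the current holder starts the `k8s_cluster` (and, analogously, `k8sobjects`) receiver, which is what makes the separate single-replica deployment collector unnecessary.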

charts/opentelemetry-kube-stack/examples/cloud-demo/rendered/hooks.yaml

Lines changed: 1 addition & 1 deletion
@@ -62,4 +62,4 @@ spec:
           - -c
           - |
             kubectl delete instrumentations,opampbridges,opentelemetrycollectors \
-              -l helm.sh/chart=opentelemetry-kube-stack-0.6.3
+              -l helm.sh/chart=opentelemetry-kube-stack-0.7.0

charts/opentelemetry-kube-stack/examples/cloud-demo/rendered/instrumentation.yaml

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ kind: Instrumentation
 metadata:
   name: example
   labels:
-    helm.sh/chart: opentelemetry-kube-stack-0.6.3
+    helm.sh/chart: opentelemetry-kube-stack-0.7.0
     app.kubernetes.io/version: "0.127.0"
     app.kubernetes.io/managed-by: Helm
     release: "example"
charts/opentelemetry-kube-stack/examples/no-leader-election-extension/README.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# No leader election extension example
+This example contains files that let a user run two separate collectors to monitor cluster metrics and Kubernetes API info when the leader election extension is not an option (because the collector compilation doesn't include it, or because of RBAC limitations).
+This can be done by configuring a second collector to host these receivers and setting `disableLeaderElection: true` for the `kubernetesEvents` and `clusterMetrics` presets.
+
+**Disclaimer**: This setup is functional, but the k8s API metrics and events receivers are **NOT in a High Availability** configuration: both run as part of a second collector deployed as a single-replica `deployment`, which avoids any need for leader election.
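
A values sketch of that alternative, assuming the preset flags named in the README above; the nesting of the keys and the name of the second pool are illustrative, not verified against the chart:

```yaml
# Hypothetical values.yaml excerpt for the no-leader-election setup
collectors:
  daemon:
    presets:
      kubernetesEvents:
        enabled: false          # keep cluster-scoped receivers off the daemonset
      clusterMetrics:
        enabled: false
  cluster:                      # illustrative name for the second pool
    mode: deployment
    replicas: 1                 # single replica, so no leader election is needed
    presets:
      kubernetesEvents:
        enabled: true
        disableLeaderElection: true
      clusterMetrics:
        enabled: true
        disableLeaderElection: true
```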
