Component(s)
connector/spanmetrics
What happened?
Description
We are generating span metrics by running the otel-collector as a statefulset behind a load-balancing exporter with routing_key set to service. The span_metrics_calls_total metric reports correct values until the collector is restarted; after a restart, the metric shows a bump or a spike on the graph. This gives the misleading impression that something is wrong with the service and that calls to it have suddenly dropped or increased.
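For context, the load-balancing tier in front of the statefulset looks roughly like the sketch below. The resolver hostname, namespace, and ports are illustrative placeholders, not our exact values:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  # routing_key: service sends all spans of a given service to the same statefulset pod
  loadbalancing:
    routing_key: service
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      dns:
        # placeholder headless-service name for the statefulset collectors
        hostname: otel-collector-statefulset-headless.observability.svc.cluster.local
        port: 4317
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]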
Steps to Reproduce
Send traces to a load-balancing collector running as a deployment, and forward them from there to a second collector running as a statefulset, using routing_key: service (as in the sketch above; the statefulset configuration is given below). Restart the statefulset collector and watch the calls_total graph.
Expected Result
The calls_total metric should not show a bump or a spike when the otel-collector restarts.
Actual Result
After each restart of the statefulset collector, the calls_total graph shows a sudden spike or dip that does not correspond to any real change in traffic to the service.
Collector version
0.120.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
mode: "statefulset"
config:
  exporters:
    debug/spanmetrics:
      verbosity: basic
    prometheusremotewrite/spanmetrics:
      endpoint: http://victoria-metrics-cluster-vminsert.metrics.svc.cluster.local:8480/insert/10/prometheus
      resource_to_telemetry_conversion:
        enabled: true
      timeout: 60s
      compression: gzip
      tls:
        insecure_skip_verify: true
  extensions:
    health_check:
      endpoint: ${env:MY_POD_IP}:13133
  connectors:
    spanmetrics:
      histogram:
        explicit:
          buckets: [1ms, 10ms, 20ms, 50ms, 100ms, 250ms, 500ms, 800ms, 1s, 2s, 5s, 10s, 15s]
      aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"
      dimensions:
        - name: http.method
        - name: http.status_code
      dimensions_cache_size: 1000
      events:
        enabled: true
        dimensions:
          - name: exception.type
      exclude_dimensions: ['k8s.pod.uid', 'k8s.pod.name', 'k8s.container.name', 'k8s.deployment.name', 'k8s.deployment.uid', 'k8s.job.name', 'k8s.job.uid', 'k8s.namespace.name', 'k8s.node.name', 'k8s.pod.ip', 'k8s.pod.start_time', 'k8s.replicaset.name', 'k8s.replicaset.uid', 'azure.vm.scaleset.name', 'cloud.resource_id', 'host.id', 'host.type', 'instance', 'service.instance.id', 'host.name', 'job', 'dt.entity.host', 'dt.entity.process_group', 'dt.entity.process_group_instance', 'container.id']
      exemplars:
        enabled: true
        max_per_data_point: 5
      metrics_flush_interval: 1m
      metrics_expiration: 5m
      namespace: span.metrics
      resource_metrics_key_attributes:
        - service.name
        - telemetry.sdk.language
        - telemetry.sdk.name
  processors:
    batch: {}
    batch/spanmetrics:
      send_batch_max_size: 5000
      send_batch_size: 4500
      timeout: 10s
    memory_limiter:
      check_interval: 5s
      limit_percentage: 80
      spike_limit_percentage: 25
  receivers:
    otlp/traces:
      protocols:
        http:
          endpoint: ${env:MY_POD_IP}:4318
        grpc:
          endpoint: ${env:MY_POD_IP}:4317
          max_recv_msg_size_mib: 12
  service:
    extensions:
      - health_check
    pipelines:
      metrics/spanmetrics:
        exporters:
          - prometheusremotewrite/spanmetrics
        processors:
          - batch/spanmetrics
        receivers:
          - spanmetrics
      traces/connector-pipeline:
        exporters:
          - spanmetrics
        processors:
          - batch
        receivers:
          - otlp/traces
    telemetry:
      metrics:
        address: ${env:MY_POD_IP}:8888
Log output
Additional context
No response