Ballast processor seems to escape virtual memory and enter RSS #7512

@datsabk

Component(s)

No response

What happened?

Description

The OpenTelemetry agent uses over 6Gi of memory while scraping only 6-7 pods on the node.

Steps to Reproduce

Use the provided configuration.

Expected Result

In agent mode, the collector should ideally need less than 1Gi of memory to run smoothly.

Actual Result

Uses over 6Gi of memory at times. In some cases it even exceeds 64Gi when the node has that much memory available. It looks as if GC never runs.
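
One possible explanation for the "GC never runs" impression is the ballast arithmetic itself. The sketch below is only a back-of-the-envelope estimate, not a measurement from this cluster. It assumes that the ballast is sized from the total memory visible to the collector (which would be the node's memory if the pod has no memory limit), that GOGC is at its default of 100 so the next collection is triggered at roughly twice the live heap, and that the ballast counts as live heap. The function name and the 200 MiB of real live data are made up for illustration.

```python
# Rough estimate of the heap size at which Go's GC would next trigger
# when a memory ballast is present. All inputs are assumptions, see above.

GIB = 1024 ** 3
MIB = 1024 ** 2


def next_gc_target_bytes(total_memory_bytes, ballast_percent, live_heap_bytes, gogc=100):
    """Return (ballast, target): ballast size and approximate GC trigger point in bytes."""
    ballast = total_memory_bytes * ballast_percent // 100
    live = ballast + live_heap_bytes          # ballast is treated as live heap
    target = live + live * gogc // 100        # GOGC=100 means "grow by 100% before collecting"
    return ballast, target


if __name__ == "__main__":
    for total_gib in (4, 16, 64):
        ballast, target = next_gc_target_bytes(total_gib * GIB, 33, 200 * MIB)
        growth = target - ballast             # non-ballast heap allowed before a GC cycle
        print(
            f"{total_gib} GiB visible: ballast ~{ballast / GIB:.1f} GiB, "
            f"GC target ~{target / GIB:.1f} GiB, "
            f"real heap may grow to ~{growth / GIB:.1f} GiB first"
        )
```

If these assumptions hold, the amount of real heap the collector may accumulate before a collection scales with the size of the node rather than with the number of pods scraped, which would be consistent with the numbers reported here. It does not by itself explain the ballast showing up in RSS.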

Collector version

0.69.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

          exporters:
            logging: null
            prometheusremotewrite:
              endpoint: "https://aps-workspaces.us-west-2.amazonaws.com/workspaces/ws-xxx/api/v1/remote_write"
              external_labels:
                cluster_name: prodobs
              auth:
                authenticator: sigv4auth
            
          extensions:
            health_check: {}
            memory_ballast: 
              size_in_percentage: 33
            sigv4auth:
              assume_role:
                sts_region: "us-west-2"
          processors:
            batch: {}
          receivers:
            jaeger: null
            zipkin: null
            otlp:
              protocols:
                grpc:
                  endpoint: 0.0.0.0:4327
                http:
                  endpoint: 0.0.0.0:4328
            prometheus:
              config:
                scrape_configs:
                - job_name: pods-scraper
                  scrape_interval: 15s
                  kubernetes_sd_configs:
                    - role: pod
                  metric_relabel_configs:
                    - action: labeldrop
                      regex: prometheus_(.*)
                    - action: labeldrop
                      regex: beta_(.*)
                    - action: labeldrop
                      regex: (app_kubernetes_io_instance|url|pod_template_hash|prometheus|container_id)
                    - source_labels: [__name__]
                      regex: '(functions_.*|feature_.*|span_.*)'
                      action: drop
                  relabel_configs:
                    - source_labels: [__meta_kubernetes_pod_name]
                      regex: '(node-exporter|kube-state-metrics|yace|kube-dns-cache)-.*'
                      action: drop
                    - source_labels: [__meta_kubernetes_pod_node_name]
                      action: keep
                      regex: $NODE_NAME
                    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
                      separator: ;
                      regex: "true"
                      action: keep
                    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
                      separator: ;
                      regex: (.+)
                      target_label: __metrics_path__
                      replacement: $$1
                      action: replace
                    - action: replace
                      regex: ([^:]+)(?::\d+)?;(\d+)
                      replacement: $$1:$$2
                      source_labels:
                        - __address__
                        - __meta_kubernetes_pod_annotation_prometheus_io_port
                      target_label: __address__
                    - source_labels: [__meta_kubernetes_pod_container_name]
                      target_label: container
                    - action: labeldrop
                      regex: __meta_kubernetes_pod_label_k2_(.+)
                    - action: labelmap
                      regex: __meta_kubernetes_pod_label_(.+)
                    - source_labels: [__meta_kubernetes_namespace]
                      action: replace
                      target_label: kubernetes_namespace
                    - source_labels: [__meta_kubernetes_pod_name]
                      action: replace
                      target_label: pod
                    - action: drop
                      regex: '.*_kubernetes_io_.*'
                    - source_labels: [__meta_kubernetes_pod_node_name]
                      separator: ;
                      regex: (.*)
                      target_label: node
                      replacement: $$1
                      action: replace
                    - source_labels: [__meta_kubernetes_pod_label_cluster]
                      target_label: cluster
                    - source_labels: [quantile]
                      target_label: le
                    
          service:
            telemetry:
              metrics:
                address: 0.0.0.0:8888
            extensions:
              - sigv4auth
              - health_check
              - memory_ballast
            pipelines:
              logs: null
              traces: null
              metrics:
                exporters:
                  - prometheusremotewrite 
                processors:
                  - batch
                receivers:
                  - otlp
                  - prometheus

Log output

No response

Additional context

No response

Labels

bug (Something isn't working)
