[exporterhelper] num_consumers config is ignored when batch config is set #13672

@jddarby

Description


Component(s)

exporter/exporterhelper

What happened?

Describe the bug
When testing my collector build, I initiated a disconnect from my telemetry backend to verify that telemetry wasn't dropped. This resulted in a significant increase in the collector's memory utilization.

Using the pprof extension, I noticed that the number of goroutines grows well beyond the value of num_consumers set in my config. With the batch config removed, the number of goroutines is capped at the expected value.
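
For reference, this is the bounded worker-pool behavior I understood num_consumers to guarantee: a fixed set of consumer goroutines draining the queue. The sketch below is purely illustrative; the names (consume, queue, export) are mine, not exporterhelper's actual internals:

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    // consume starts exactly numConsumers goroutines to drain the queue.
    // The goroutine count stays fixed no matter how fast items arrive or
    // how long each export (including its retries) takes.
    func consume(queue <-chan string, numConsumers int, export func(string)) {
        var wg sync.WaitGroup
        for i := 0; i < numConsumers; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for item := range queue {
                    export(item) // blocks through retries; no extra goroutines spawned
                }
            }()
        }
        wg.Wait()
    }

    func main() {
        queue := make(chan string, 100)
        go func() {
            for i := 0; i < 50; i++ {
                queue <- fmt.Sprintf("batch-%d", i)
            }
            close(queue)
        }()
        // With numConsumers = 10, at most 10 export calls run concurrently.
        consume(queue, 10, func(string) { time.Sleep(10 * time.Millisecond) })
    }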

Steps to reproduce

  1. Start a collector process that uses an exporter based on exporterhelper (e.g. the otlpexporter, which is what I am using) together with the file storage extension
  2. Use the following configuration for the exporter:
exporters:
  otlp:
    compression: gzip
    endpoint: <your_endpoint>
    retry_on_failure:
      enabled: true
      max_elapsed_time: 0
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 10000
      storage: file_storage
      sizer: bytes
      batch:
        flush_timeout: 10s
    tls:
      insecure: true
  3. Initiate an event that causes retryable errors in the exporter (e.g. a disconnect from the backend)
  4. Use the pprof extension or other tooling to observe the goroutine count growing beyond the value set by num_consumers, along with the increase in the process's memory usage (a polling sketch follows these steps)
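
To make the growth easy to watch, here is a small sketch that polls the goroutine profile served by the pprof extension. It assumes the extension's default endpoint of localhost:1777; adjust the URL if you configure it differently:

    package main

    import (
        "bufio"
        "fmt"
        "net/http"
        "time"
    )

    // Polls the goroutine profile exposed by the collector's pprof extension
    // every 10 seconds and prints its first line, which reports the total
    // goroutine count (e.g. "goroutine profile: total 10").
    func main() {
        for {
            resp, err := http.Get("http://localhost:1777/debug/pprof/goroutine?debug=1")
            if err != nil {
                fmt.Println("fetch failed:", err)
            } else {
                line, _ := bufio.NewReader(resp.Body).ReadString('\n')
                resp.Body.Close()
                fmt.Printf("%s %s", time.Now().Format(time.TimeOnly), line)
            }
            time.Sleep(10 * time.Second)
        }
    }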

What did you expect to see?
I expected resource utilization for the collector to remain stable and the number of goroutines created by exporterhelper to be capped at 10, the configured num_consumers value. Here is a screenshot of the pprof output without the batch config:

[Image: pprof goroutine output without the batch config]

What did you see instead?
The number of goroutines grows rapidly, and memory usage grows with it. Here is the pprof output with the batch config:

[Image: pprof goroutine output with the batch config]

Collector version

v0.132.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler (if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

# Using this config will demonstrate the issue

extensions:
  pprof:
  file_storage:
    directory: ./file_storage

receivers:
  otlp:
    protocols:
      http:
        endpoint: localhost:4318

exporters:
  otlp:
    compression: gzip
    endpoint: localhost:4317
    retry_on_failure:
      enabled: true
      max_elapsed_time: 0
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 1e9 # 1GB
      storage: file_storage
      sizer: bytes
      batch:
        flush_timeout: 10s
    tls:
      insecure: true

service:
  extensions:
    - pprof
    - file_storage
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [otlp]
  telemetry:
    logs:
      disable_caller: false
      disable_stacktrace: false
      encoding: json
      level: info
    metrics:
      readers:
        - periodic:
            interval: 10
            exporter:
              otlp:
                protocol: http/protobuf
                endpoint: http://localhost:4318

Log output

Additional context

Let me know if there is any further information you would like me to provide.

