Description
Component(s)
exporter/exporterhelper
What happened?
Describe the bug
When testing my collector build, I disconnected it from my telemetry backend to verify that telemetry wasn't dropped. This resulted in a significant increase in the collector's memory utilization.
Using the pprof extension, I noticed that the number of goroutines proliferates well beyond the value of num_consumers I set in my config. After removing the batch config, the number of goroutines is capped at the expected value.
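To illustrate the bound described above, here is a minimal Go sketch of a fixed-size consumer pool in which a failed send is retried by the same goroutine, so retries never create extra goroutines. This is not the exporterhelper implementation; `runConsumers` and `flakySender` are hypothetical names introduced only for this example, and the one-millisecond sleep is a stand-in for a real backoff.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runConsumers drains queue with exactly numConsumers goroutines.
// A failed send is retried inline by the goroutine that picked up the item,
// so the number of goroutines never exceeds numConsumers.
// It returns the peak number of send calls that were in flight at once.
func runConsumers(numConsumers int, queue <-chan int, send func(int) error) int64 {
	var inFlight, peak int64
	var wg sync.WaitGroup
	for i := 0; i < numConsumers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range queue {
				for {
					cur := atomic.AddInt64(&inFlight, 1)
					// Record a new peak if this call raised the in-flight count.
					for {
						p := atomic.LoadInt64(&peak)
						if cur <= p || atomic.CompareAndSwapInt64(&peak, p, cur) {
							break
						}
					}
					err := send(item)
					atomic.AddInt64(&inFlight, -1)
					if err == nil {
						break
					}
					time.Sleep(time.Millisecond) // stand-in for retry backoff
				}
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

// flakySender returns a send function that fails the first `failures`
// attempts for each item, simulating a temporarily unreachable backend.
func flakySender(failures int) func(int) error {
	var mu sync.Mutex
	attempts := make(map[int]int)
	return func(item int) error {
		mu.Lock()
		defer mu.Unlock()
		attempts[item]++
		if attempts[item] <= failures {
			return errors.New("backend unavailable")
		}
		return nil
	}
}

func main() {
	queue := make(chan int, 100)
	for i := 0; i < 100; i++ {
		queue <- i
	}
	close(queue)

	peak := runConsumers(10, queue, flakySender(2))
	fmt.Println("peak concurrent sends:", peak)
}
```

In this sketch the reported peak can never exceed the pool size, which is the kind of cap the num_consumers setting is expected to provide while the backend is unreachable.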
Steps to reproduce
- Start a collector process that uses an exporter based on exporterhelper (e.g. otlpexporter, which is what I am using) and that also uses the file storage extension.
- Use the following configuration for the exporter:
exporters:
  otlp:
    compression: gzip
    endpoint: <your_endpoint>
    retry_on_failure:
      enabled: true
      max_elapsed_time: 0
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 10000
      storage: file_storage
      sizer: bytes
      batch:
        flush_timeout: 10s
    tls:
      insecure: true
- Initiate some event which will cause retriable errors in the exporter (e.g. a disconnect).
- Use the pprof extension or other tooling to observe the increase in goroutines beyond the value set by num_consumers and the increase in memory usage of the process.
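The last step can be scripted. The following is a minimal Go sketch that reads the goroutine count from the collector's pprof endpoint. It assumes the pprof extension is listening on its default endpoint, localhost:1777, and relies on the first line of Go's `debug=1` goroutine profile having the form `goroutine profile: total N`. The function names `parseGoroutineHeader` and `fetchGoroutineTotal` are hypothetical helpers introduced for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"os"
	"strings"
)

// parseGoroutineHeader extracts N from a header line of the form
// "goroutine profile: total N".
func parseGoroutineHeader(line string) (int, error) {
	var total int
	if _, err := fmt.Sscanf(strings.TrimSpace(line), "goroutine profile: total %d", &total); err != nil {
		return 0, fmt.Errorf("unexpected header %q: %w", line, err)
	}
	return total, nil
}

// fetchGoroutineTotal requests the text goroutine profile from a pprof
// HTTP endpoint and returns the total reported in its first line.
func fetchGoroutineTotal(baseURL string) (int, error) {
	resp, err := http.Get(baseURL + "/debug/pprof/goroutine?debug=1")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	// Only the first line is needed; the rest of the profile is ignored.
	line, err := bufio.NewReader(resp.Body).ReadString('\n')
	if err != nil && line == "" {
		return 0, err
	}
	return parseGoroutineHeader(line)
}

func main() {
	// Assumption: the pprof extension uses its default endpoint.
	total, err := fetchGoroutineTotal("http://localhost:1777")
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not read goroutine profile:", err)
		return
	}
	fmt.Println("goroutines:", total)
}
```

Running this periodically before and after the disconnect gives a simple way to compare the total against num_consumers. Note that the total covers every goroutine in the collector process, not only those created by exporterhelper.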
What did you expect to see?
I expected resource utilization for the collector to be stable and the number of goroutines created by exporterhelper to be capped at 10. Here is a screenshot of the pprof output without the batch config:

What did you see instead?
The number of goroutines proliferates rapidly and memory usage goes up with it. Here is the pprof output with the batch config:

Collector version
v0.132.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
# Using this config will demonstrate the issue
extensions:
  pprof:
  file_storage:
    directory: ./file_storage
receivers:
  otlp:
    protocols:
      http:
        endpoint: localhost:4318
exporters:
  otlp:
    compression: gzip
    endpoint: localhost:4317
    retry_on_failure:
      enabled: true
      max_elapsed_time: 0
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 1e9 # 1GB
      storage: file_storage
      sizer: bytes
      batch:
        flush_timeout: 10s
    tls:
      insecure: true
service:
  extensions:
    - pprof
    - file_storage
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [otlp]
  telemetry:
    logs:
      disable_caller: false
      disable_stacktrace: false
      encoding: json
      level: info
    metrics:
      readers:
        - periodic:
            interval: 10
            exporter:
              otlp:
                protocol: http/protobuf
                endpoint: http://localhost:4318
Log output
Additional context
Let me know if there is any further information you would like me to provide.