[Proposal] Move batching to exporterhelper #4646

@bogdandrutu

Description

Right now the batching capability is offered as a processor. This is extremely generic in our current pipeline model, which is a great thing, but it also comes with limitations:

  1. The batch processor uses a queue (implemented as a channel) that is not the same as the one used in the queued retry, and it can lose data when the binary crashes, even after the client was informed that the data were accepted. There is already work underway to offer a persistent queue mechanism, which will not be available in the processor.
  2. If users want to use batching with the routing processor, they will need the fix for "Change batch processor to take client info into consideration" #4544, AND this would batch data that is later split again, which is very unnecessary.
  3. The SignalFx/SplunkHec receiver/exporter uses a "hack" that adds a header from the incoming request to the resource, batches the data, and then has to split it again based on that header. Batching inside the exporter helper can be configured with custom logic much more easily, by making the batching library accept a custom "batcher" func.
  4. Allow customization of the "size" function. There have been requests to support batching by serialized size; we can offer this much more cleanly in the exporter helper, since a custom sizer can work on the exporter wire format, so batching can happen after data serialization (see the sketch after this list).
  5. When multiple exporters are configured, they may have different batching requirements. This can be achieved today with multiple pipelines, but that causes duplication between the pipelines.
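
To make points 3 and 4 concrete, here is a minimal, self-contained Go sketch of what a pluggable sizer/batcher could look like. All names (`SizerFunc`, `BatcherConfig`, `batcher`) are hypothetical and only illustrate the idea of batching already-serialized data with a custom size function; they are not an existing exporterhelper API.

```go
package main

import "fmt"

// SizerFunc computes the "size" of an item; for serialized-size batching it
// can measure the length of the exporter wire format rather than item counts.
// Hypothetical type, used only for this sketch.
type SizerFunc func(item []byte) int

// BatcherConfig sketches the knobs the exporter helper could expose.
type BatcherConfig struct {
	MaxBatchSize int       // flush once the accumulated size reaches this value
	Sizer        SizerFunc // custom sizer, e.g. serialized size
}

// batcher accumulates already-serialized items and flushes them in batches.
type batcher struct {
	cfg     BatcherConfig
	pending [][]byte
	size    int
	export  func(batch [][]byte) // the exporter's send function
}

// add appends one item and flushes when the size threshold is reached.
func (b *batcher) add(item []byte) {
	b.pending = append(b.pending, item)
	b.size += b.cfg.Sizer(item)
	if b.size >= b.cfg.MaxBatchSize {
		b.flush()
	}
}

// flush sends whatever is pending and resets the accumulator.
func (b *batcher) flush() {
	if len(b.pending) == 0 {
		return
	}
	b.export(b.pending)
	b.pending, b.size = nil, 0
}

func main() {
	b := &batcher{
		cfg: BatcherConfig{
			MaxBatchSize: 16, // bytes, just for the example
			Sizer:        func(item []byte) int { return len(item) },
		},
		export: func(batch [][]byte) { fmt.Printf("exporting batch of %d items\n", len(batch)) },
	}
	for _, item := range [][]byte{[]byte("span-1"), []byte("span-2"), []byte("span-3")} {
		b.add(item)
	}
	b.flush() // final flush, e.g. on timeout or shutdown
}
```

Because the batcher lives in the exporter helper and receives data in (or close to) the exporter's own format, exporter-specific grouping (such as the SignalFx/SplunkHec header case) could be expressed directly in the custom batcher/sizer instead of round-tripping through resource attributes.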

This will also fix some more problems:

The proposal is to look into offering the "batching" capability similarly to the timeout/queue/retry logic that we have in the exporterhelper.
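
As a rough illustration of how such an option could sit next to the existing timeout/queue/retry configuration, below is a hedged, self-contained sketch using the functional-option pattern. `WithBatcher`, `WithQueue`, `WithTimeout`, and `newExporter` are stand-ins defined in the sketch itself, not the actual exporterhelper signatures.

```go
package main

import (
	"fmt"
	"time"
)

// settings collects the behaviors an exporter built via the helper could get.
// All names here are hypothetical placeholders for this sketch.
type settings struct {
	timeout      time.Duration
	queueEnabled bool
	batchEnabled bool
	batchMaxSize int
}

// Option is the functional-option shape used to configure the exporter.
type Option func(*settings)

func WithTimeout(d time.Duration) Option { return func(s *settings) { s.timeout = d } }

func WithQueue() Option { return func(s *settings) { s.queueEnabled = true } }

// WithBatcher shows where a batching option could live, right next to the
// queue/retry/timeout options, so each exporter can have its own batching.
func WithBatcher(maxSize int) Option {
	return func(s *settings) {
		s.batchEnabled = true
		s.batchMaxSize = maxSize
	}
}

// newExporter stands in for the exporter helper's constructor.
func newExporter(opts ...Option) *settings {
	s := &settings{timeout: 5 * time.Second}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	exp := newExporter(
		WithTimeout(10*time.Second),
		WithQueue(),
		WithBatcher(512), // batching configured per exporter, alongside queue/timeout
	)
	fmt.Printf("%+v\n", exp)
}
```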
