ebpf: replace binary.Read with binary.Decode in sysenc.Unmarshal #1713
Hello team,
First off, thanks for the amazing work on this project! We use it extensively at Grafana, and we truly appreciate the effort that goes into maintaining it.
This PR removes unnecessary allocations in sysenc.Unmarshal caused by the binary.Read call. We propose using binary.Decode (introduced in Go 1.23) as an alternative. While the change may appear minor, its impact can be substantial depending on the specific use case.

In the flame graphs below, you can observe one of our services experiencing significant GC pressure, with around 30% of the CPU time spent in runtime.scanobject. The heap allocations profile further highlights the issue, tracing it to the sysenc.Unmarshal call within the ebpf.(*Map).Lookup -> ebpf.unmarshalPerCPUValue call chain.

CPU profile:

Heap allocs profile (alloc_objects):

The profiles were collected over the course of an hour from a fleet of observability agents using Pyroscope and the standard Go pprof endpoints (we also use cilium/ebpf for profiling, though not in this particular case).
Nearly half of all the allocations made by the program occur due to binary.Read in sysenc.Unmarshal. While sysenc.Unmarshal is generally well-optimized and only resorts to binary.Read as a last step, in our specific case, we miss out on most of these optimizations.

The exact locations where allocations occur:
sync.Pool
d := &decoder is stack-allocated in binary.Decode
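
For reviewers who want to see the difference outside of the library, here is a minimal standalone sketch. It is not the code from this PR: the sample struct and the decodeWithRead / decodeWithDecode helpers are hypothetical names made up for illustration, and it assumes Go 1.23 or newer. It decodes the same bytes through both APIs and reports allocations per call using testing.AllocsPerRun; the exact counts may vary with the Go version and escape analysis, so none are stated here.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"testing"
)

// sample is a hypothetical fixed-size value, standing in for whatever
// a caller might unmarshal from a map lookup.
type sample struct {
	A uint32
	B uint16
	C [2]uint8
}

// decodeWithRead goes through an io.Reader, which is the path that
// binary.Read requires.
func decodeWithRead(buf []byte, out *sample) error {
	return binary.Read(bytes.NewReader(buf), binary.LittleEndian, out)
}

// decodeWithDecode reads directly from the byte slice using
// binary.Decode (available since Go 1.23).
func decodeWithDecode(buf []byte, out *sample) error {
	_, err := binary.Decode(buf, binary.LittleEndian, out)
	return err
}

func main() {
	// Little-endian encoding of sample{A: 1, B: 2, C: {3, 4}}.
	buf := []byte{0x01, 0x00, 0x00, 0x00, 0x02, 0x00, 0x03, 0x04}

	var viaRead, viaDecode sample
	if err := decodeWithRead(buf, &viaRead); err != nil {
		panic(err)
	}
	if err := decodeWithDecode(buf, &viaDecode); err != nil {
		panic(err)
	}
	fmt.Println("same result:", viaRead == viaDecode)

	// Measure average allocations per call for each approach.
	readAllocs := testing.AllocsPerRun(1000, func() {
		_ = decodeWithRead(buf, &viaRead)
	})
	decodeAllocs := testing.AllocsPerRun(1000, func() {
		_ = decodeWithDecode(buf, &viaDecode)
	})
	fmt.Printf("binary.Read allocs/op:   %v\n", readAllocs)
	fmt.Printf("binary.Decode allocs/op: %v\n", decodeAllocs)
}
```

Both helpers should produce identical values; only the allocation behaviour is expected to differ.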
Benchmarks confirm the presence of an allocation in sysenc.Unmarshal. After applying the optimization from this PR, the benchmark results show zero allocations.

Benchmark Results
Thank you for considering this change. Please let me know if you need any additional information or clarifications.