[PROF-12990] Improve locking and prevent use-after-free in FlightRecorder #291

rkennke · 2025-11-07T12:16:30Z

What does this PR do?:
This change improves locking around accesses to the Recording* _rec field, and prevents possible use-after-free of that object.

I also added a small utility class, OptionalSharedLockGuard, that allows try-lock-style locking within a scope.

I also changed the exclusive lock in dump() and flush() to use a shared-lock instead. This should be sufficient and it's also consistent: we use exclusive&blocking lock whenever we modify the _rec field, and we use shared&usually-non-blocking lock whenever we only read that field.
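
As a rough illustration of that discipline (not the PR's actual code), the sketch below guards a pointer with an exclusive lock whenever it is modified and a shared try-lock whenever it is only read. `RecorderSketch`, its `Recording` stand-in, and the method bodies are hypothetical, and `std::shared_mutex` merely stands in for the project's own lock; the thread notes that signal-safety matters here, so the standard mutex is for illustration only.

```cpp
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for the real Recording class.
struct Recording {
    int events = 0;
    void write() { ++events; }
};

class RecorderSketch {
    std::shared_mutex _rec_lock;  // guards the _rec pointer itself
    Recording* _rec = nullptr;

public:
    ~RecorderSketch() { stop(); }

    // Modifies _rec: exclusive and blocking.
    void start() {
        std::unique_lock<std::shared_mutex> guard(_rec_lock);
        if (_rec == nullptr) _rec = new Recording();
    }

    // Modifies _rec: exclusive and blocking. No reader can hold the shared
    // lock while the object is deleted, which rules out use-after-free.
    void stop() {
        std::unique_lock<std::shared_mutex> guard(_rec_lock);
        delete _rec;
        _rec = nullptr;
    }

    // Only reads _rec: shared and non-blocking. Returns false when the lock
    // is not available or no recording exists.
    bool recordEvent() {
        std::shared_lock<std::shared_mutex> guard(_rec_lock, std::try_to_lock);
        if (!guard.owns_lock() || _rec == nullptr) return false;
        _rec->write();
        return true;
    }
};
```

A reader that loses the race against `stop()` simply drops its event instead of dereferencing a deleted object.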

Motivation:
Make accesses to _rec more reliable and prevent use-after-free. See PROF-12990 for details.

How to test the change?:
Existing tests should cover the changes.

For Datadog employees:

If this PR touches code that signs or publishes builds or packages, or handles
credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
This PR doesn't touch any of that.
JIRA: PROF-12990

ddprof-lib/src/main/cpp/spinLock.h

jbachorik

Looks good

jbachorik · 2025-11-11T14:37:20Z

Hm, maybe update the PR description, which still mentions the switch to std::atomic.

pr-commenter · 2025-11-11T15:07:56Z

Benchmarks [x86_64 wall]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	wall	wall
wall	on	on

Summary

Found 1 performance improvements and 1 performance regressions! Performance is the same for 14 metrics, 22 unstable metrics.

scenario	Δ mean execution_time	Δ mean rss
scenario:renaissance:finagle-chirper	better [-1.537s; -0.695s] or [-5.320%; -2.404%]	unstable [-315.733MB; +312.726MB] or [-22.993%; +22.774%]
scenario:renaissance:future-genetic	worse [+242.600ms; +617.400ms] or [+1.563%; +3.977%]	unstable [-353.933MB; +349.735MB] or [-36.384%; +35.952%]

pr-commenter · 2025-11-11T15:08:45Z

Benchmarks [x86_64 cpu]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	cpu	cpu
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics.

pr-commenter · 2025-11-11T15:08:47Z

Benchmarks [x86_64 alloc]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	alloc	alloc
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics.

pr-commenter · 2025-11-11T15:09:01Z

Benchmarks [x86_64 cpu,wall,alloc,memleak]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	cpu,wall,alloc,memleak	cpu,wall,alloc,memleak
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics.

pr-commenter · 2025-11-11T15:09:13Z

Benchmarks [x86_64 memleak]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	off	off
iterations	5	5
java
memleak	on	on
modes	memleak	memleak
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics.

pr-commenter · 2025-11-11T15:09:28Z

Benchmarks [x86_64 memleak,alloc]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	memleak,alloc	memleak,alloc
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics.

pr-commenter · 2025-11-11T15:09:38Z

Benchmarks [aarch64 cpu]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	cpu	cpu
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 16 metrics, 22 unstable metrics.

pr-commenter · 2025-11-11T15:09:40Z

Benchmarks [x86_64 cpu,wall]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	cpu,wall	cpu,wall
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics.

pr-commenter · 2025-11-11T15:10:04Z

Benchmarks [aarch64 alloc]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	alloc	alloc
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics.

pr-commenter · 2025-11-11T15:11:39Z

Benchmarks [aarch64 memleak,alloc]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	memleak,alloc	memleak,alloc
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics.

pr-commenter · 2025-11-11T15:11:40Z

Benchmarks [aarch64 wall]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	wall	wall
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics.

pr-commenter · 2025-11-11T15:11:47Z

Benchmarks [aarch64 cpu,wall,alloc,memleak]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	on	on
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	cpu,wall,alloc,memleak	cpu,wall,alloc,memleak
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics.

pr-commenter · 2025-11-11T15:11:54Z

Benchmarks [aarch64 memleak]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	off	off
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	on	on
modes	memleak	memleak
wall	off	off

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 16 metrics, 22 unstable metrics.

pr-commenter · 2025-11-11T15:12:02Z

Benchmarks [aarch64 cpu,wall]

Parameters

	Baseline	Candidate
config	baseline	candidate
ddprof	1.34.3	1.35.0-rkennke_improve-atomics-SNAPSHOT

See matching parameters

	Baseline	Candidate
alloc	off	off
cpu	on	on
iterations	5	5
java	"11.0.28"	"11.0.28"
memleak	off	off
modes	cpu,wall	cpu,wall
wall	on	on

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics.

zhengyu123 · 2025-11-11T15:41:08Z

ddprof-lib/src/main/cpp/flightRecorder.cpp

-      _rec->switchChunk(copy_fd);
-      close(copy_fd);
-    } else {
+  OptionalSharedLockGuard locker(&_rec_lock);


Why a shared lock? That means wallClockEpoch(), recordTraceRoot(), etc. can also write to the recorder. Is that safe?

This is all very 'convoluted' (to put it nicely).

The _rec_lock is supposed to guard only modifications to _rec. Concurrent access to the actual recording writes is handled one level up, in the Profiler class, where a striped lock is used to get access to the recording. Methods that write one event, like wallClockEpoch(), grab one stripe to perform the write. dump() and flush() lock all stripes, making those operations exclusive against the event-writing ones.

So, a shared lock here is fine. However, it should be a normal shared lock and not optional, since we do not want to skip the 'dump' operation just because an event write is in progress. The truth is, this never happens, thanks to the upper-level locking in Profiler, but using the shared lock better communicates the intention (and we don't need to check for ownership).

It reads _rec, so it only needs to acquire the shared lock (aka read-lock). Whatever happens in Recorder should be protected there.

only needs to acquire the shared lock

Correct. Just not the optional variant: we do not want to skip 'dump' because of an ongoing recording operation, but rather wait until it is done.
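
The two-level scheme described in this thread can be sketched roughly as follows. This is a simplified model under stated assumptions: `StripedRecorder`, the stripe count of 16, the per-stripe counters, and the use of `std::mutex` are all hypothetical and not taken from the Profiler class.

```cpp
#include <array>
#include <cstddef>
#include <mutex>

// Hypothetical model of striped locking; names and the stripe count are
// illustrative only.
class StripedRecorder {
    static constexpr std::size_t kStripes = 16;
    std::array<std::mutex, kStripes> _stripes;
    std::array<int, kStripes> _pending{};  // events buffered per stripe

public:
    // Event writers take a single stripe, so writers that map to different
    // stripes do not contend with each other.
    void writeEvent(std::size_t tid) {
        std::size_t i = tid % kStripes;
        std::lock_guard<std::mutex> guard(_stripes[i]);
        ++_pending[i];
    }

    // dump() takes every stripe, always in index order, which makes it
    // exclusive against all event writers. Returns the number of events
    // drained from the buffers.
    int dump() {
        for (auto& s : _stripes) s.lock();
        int total = 0;
        for (auto& n : _pending) {
            total += n;
            n = 0;
        }
        for (auto& s : _stripes) s.unlock();
        return total;
    }
};
```

Because `dump()` already excludes every writer at this level, the lock around `_rec` only has to protect the pointer itself.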

zhengyu123 · 2025-11-11T15:47:39Z

I am not sure that the following is correct.

I also changed the exclusive lock in dump() and flush() to use a shared-lock instead. This should be sufficient and it's also consistent: we use exclusive&blocking lock whenever we modify the _rec field, and we use shared&usually-non-blocking lock whenever we only read that field.

dump() and flush() are reads, but there are concurrent writes.

zhengyu123 · 2025-11-11T15:58:24Z

ddprof-lib/src/main/cpp/spinLock.h

    int value;
-    // we use relaxed as the compare already offers the guarantees we need
-    while ((value = __atomic_load_n(&_lock, __ATOMIC_RELAXED)) <= 0) {
+    while ((value = __atomic_load_n(&_lock, __ATOMIC_ACQUIRE)) <= 0) {


I don't think acquire ordering buys anything here. If the lock is successfully acquired, the ordering comes from __sync_bool_compare_and_swap, which is stronger. If it fails, relaxed should be fine, since you are not supposed to touch a shared object whose lock you failed to acquire.

True, but the first read comes before the CAS, and might save us from going into the CAS to begin with. I think it's a minor nuisance and mostly cosmetic change, and I've done it mostly for consistency.
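
For context, a shared-acquire loop of the shape discussed here can be sketched as below. Only the load-then-CAS loop mirrors the diff; the state encoding (0 unlocked, 1 exclusive, negative values counting shared holders) and all other methods are assumptions, not a copy of spinLock.h.

```cpp
// Sketch of a reader-writer spin lock built on GCC/Clang builtins.
// Assumed encoding: 0 = unlocked, 1 = exclusive, -N = N shared holders.
class SpinLock {
    volatile int _lock = 0;

public:
    bool tryLock() { return __sync_bool_compare_and_swap(&_lock, 0, 1); }
    void lock() { while (!tryLock()) { /* spin */ } }
    void unlock() { __sync_fetch_and_sub(&_lock, 1); }

    // Non-blocking: gives up as soon as an exclusive holder is observed.
    bool tryLockShared() {
        int value;
        // The load only filters out the "exclusively locked" case; the
        // full-barrier CAS below is what actually takes the lock.
        while ((value = __atomic_load_n(&_lock, __ATOMIC_ACQUIRE)) <= 0) {
            if (__sync_bool_compare_and_swap(&_lock, value, value - 1)) {
                return true;
            }
        }
        return false;
    }

    // Blocking: retries until no exclusive holder is present.
    void lockShared() { while (!tryLockShared()) { /* spin */ } }
    void unlockShared() { __sync_fetch_and_add(&_lock, 1); }
};
```

In this sketch the memory order on the load does not affect correctness of a successful acquisition, since the CAS is a full barrier either way, which matches both reviewers' reading.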

rkennke · 2025-11-11T16:00:32Z

I am not sure that the following is correct.

I also changed the exclusive lock in dump() and flush() to use a shared-lock instead. This should be sufficient and it's also consistent: we use exclusive&blocking lock whenever we modify the _rec field, and we use shared&usually-non-blocking lock whenever we only read that field.

dump() and flush() are reads, but there are concurrent writes.

Well yeah, but if there is a concurrent write, then no other read is allowed to happen, and if anybody is reading, then no concurrent write is allowed to happen. Right?

zhengyu123 · 2025-11-11T18:17:17Z

I am not sure that the following is correct.
I also changed the exclusive lock in dump() and flush() to use a shared-lock instead. This should be sufficient and it's also consistent: we use exclusive&blocking lock whenever we modify the _rec field, and we use shared&usually-non-blocking lock whenever we only read that field.
dump() and flush() are reads, but there are concurrent writes.

Well yeah, but if there is a concurrent write, then no other read is allowed to happen, and if anybody is reading, then no concurrent write is allowed to happen. Right?

It is a shared lock, so, e.g. recordEvent() can enter as well to perform writes.

jbachorik · 2025-11-12T08:39:34Z

Well yeah, but if there is a concurrent write, then no other read is allowed to happen, and if anybody is reading, then no concurrent write is allowed to happen. Right?

Let me take a look again. The reader/writer semantic is related to modifying the _rec variable and not operations on it.

Let's first clarify the shared/optional semantics for the flush() and dump() operations.

rkennke · 2025-11-12T12:34:11Z

Well yeah, but if there is a concurrent write, then no other read is allowed to happen, and if anybody is reading, then no concurrent write is allowed to happen. Right?

Let me take a look again. The reader/writer semantic is related to modifying the _rec variable and not operations on it.

The semantics of the shared lock are that it is safe to read the _rec field: once we successfully acquire the shared lock, it is guaranteed that no other thread modifies that field. The shared lock can block, which happens when a modification of the field is in progress. In the common case, though, it does not block.

The semantics of the optional lock are that it uses tryLockShared(). If this fails to acquire the lock, it simply returns without blocking. That means another thread is currently modifying the _rec field, which happens only during start() and stop(). In that case we can pessimistically assume that we either have not yet set up a Recorder or have already destroyed it. In both cases, it seems OK not to attempt any operations on the Recorder. If we need to guarantee a particular ordering of start()/stop() with e.g. dump() or flush(), then this needs to be done elsewhere, IMO. Optional locking is always non-blocking.
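
A scoped guard with these optional semantics might look like the sketch below. It is an approximation under stated assumptions: the class in the PR is not a template, `ownsLock()` is a hypothetical accessor name, and `FakeLock`, `Recording`, and `recordEvent` exist only to exercise the guard in a single thread.

```cpp
// Sketch of a try-lock-style scoped guard. Templated here only so the
// example is self-contained; ownsLock() is a hypothetical name.
template <typename Lock>
class OptionalSharedLockGuard {
    Lock* _lock;  // nullptr when the shared lock was not acquired

public:
    explicit OptionalSharedLockGuard(Lock* lock)
        : _lock(lock->tryLockShared() ? lock : nullptr) {}

    ~OptionalSharedLockGuard() {
        if (_lock != nullptr) _lock->unlockShared();
    }

    // Guards are scope-bound; a copy would release the lock twice.
    OptionalSharedLockGuard(const OptionalSharedLockGuard&) = delete;
    OptionalSharedLockGuard& operator=(const OptionalSharedLockGuard&) = delete;

    bool ownsLock() const { return _lock != nullptr; }
};

// Single-threaded stand-in lock, present only to exercise the guard.
struct FakeLock {
    bool exclusive = false;
    int readers = 0;
    bool tryLockShared() {
        if (exclusive) return false;
        ++readers;
        return true;
    }
    void unlockShared() { --readers; }
};

// Hypothetical stand-in for the real Recording class.
struct Recording {
    int events = 0;
};

// Reader path: skip the operation instead of blocking while start() or
// stop() holds the lock exclusively.
bool recordEvent(FakeLock* rec_lock, Recording* rec) {
    OptionalSharedLockGuard<FakeLock> locker(rec_lock);
    if (!locker.ownsLock() || rec == nullptr) return false;
    ++rec->events;
    return true;
}
```

The caller must check ownership before touching the guarded field, which is the extra step a plain blocking shared lock would not need.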

rkennke · 2025-11-12T17:39:40Z

/merge

dd-devflow-routing-codex · 2025-11-12T17:39:45Z

View all feedbacks in Devflow UI.

2025-11-12 17:39:44 UTC ℹ️ Start processing command /merge

2025-11-12 17:39:49 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in main is approximately 17m (p90).

2025-11-12 19:40:05 UTC ❌ MergeQueue: The build pipeline has timeout

The merge request has been interrupted because the build 0 took longer than expected. The current limit for the base branch 'main' is 120 minutes.

rkennke · 2025-11-14T19:09:29Z

/merge

dd-devflow-routing-codex · 2025-11-14T19:09:33Z

View all feedbacks in Devflow UI.

2025-11-14 19:09:33 UTC ℹ️ Start processing command /merge

2025-11-14 19:09:37 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in main is approximately 17m (p90).

2025-11-14 21:09:53 UTC ❌ MergeQueue: The build pipeline has timeout

The merge request has been interrupted because the build 0 took longer than expected. The current limit for the base branch 'main' is 120 minutes.

rkennke added 6 commits November 7, 2025 12:20

[PROF-12990] Improve locking and prevent use-after-free in FlightRecorder

dbb7673

Small compilation fixes.

272e65d

More compilation fixes.

eed1e89

Revert to using SpinLock.

237670a

Use std::atomic in SpinLock instead of GCC intrinsics.

d24b792

Merge remote-tracking branch 'origin/main' into rkennke/improve-atomics

895d612

rkennke marked this pull request as ready for review November 10, 2025 14:42

rkennke requested review from jbachorik and zhengyu123 and removed request for zhengyu123 November 10, 2025 16:44

jbachorik reviewed Nov 10, 2025

ddprof-lib/src/main/cpp/spinLock.h (outdated, resolved)

rkennke added 2 commits November 11, 2025 14:20

Don't execute code in the locker.

f7e4729

Don't use std::atomic - it is not guaranteed to be signal-safe.

b0c0531

jbachorik reviewed Nov 11, 2025

ddprof-lib/src/main/cpp/spinLock.h (resolved)

jbachorik previously approved these changes Nov 11, 2025

zhengyu123 reviewed Nov 11, 2025

Keep using explicit lock for ::dump() and ::flush() (for now)

592f4f9

jbachorik approved these changes Nov 12, 2025

dd-devflow bot added mergequeue-status: queued mergequeue-status: in_progress and removed mergequeue-status: queued labels Nov 12, 2025

zhengyu123 approved these changes Nov 12, 2025

dd-devflow bot added mergequeue-status: rejected and removed mergequeue-status: in_progress labels Nov 12, 2025

dd-devflow bot added mergequeue-status: queued mergequeue-status: in_progress mergequeue-status: rejected and removed mergequeue-status: rejected mergequeue-status: queued mergequeue-status: in_progress labels Nov 14, 2025
