-
Notifications
You must be signed in to change notification settings - Fork 10
[PROF-12990] Improve locking and prevent use-after-free in FlightRecorder #291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
jbachorik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
|
Hm, maybe update the PR description which is still mentioning switch to |
Benchmarks [x86_64 wall]Parameters
See matching parameters
SummaryFound 1 performance improvements and 1 performance regressions! Performance is the same for 14 metrics, 22 unstable metrics.
|
Benchmarks [x86_64 cpu]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics. |
Benchmarks [x86_64 alloc]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics. |
Benchmarks [x86_64 cpu,wall,alloc,memleak]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics. |
Benchmarks [x86_64 memleak]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics. |
Benchmarks [x86_64 memleak,alloc]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 23 unstable metrics. |
Benchmarks [aarch64 cpu]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 16 metrics, 22 unstable metrics. |
Benchmarks [x86_64 cpu,wall]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 14 metrics, 24 unstable metrics. |
Benchmarks [aarch64 alloc]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics. |
Benchmarks [aarch64 memleak,alloc]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics. |
Benchmarks [aarch64 wall]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics. |
Benchmarks [aarch64 cpu,wall,alloc,memleak]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics. |
Benchmarks [aarch64 memleak]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 16 metrics, 22 unstable metrics. |
Benchmarks [aarch64 cpu,wall]Parameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 21 unstable metrics. |
| _rec->switchChunk(copy_fd); | ||
| close(copy_fd); | ||
| } else { | ||
| OptionalSharedLockGuard locker(&_rec_lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why shared lock? which means wallClockEpoch(), recordTraceRoot() and etc. can also write to recorder, is it safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all very 'convoluted' (to put it nicely).
The _rec_locker is supposed to guard only modifications to _rec. The concurrent access to the actual recording writes is handled one level up, in Profiler class, where we have a striped lock used for getting access to the recording. The methods writing one event, like wallClockEpoch() will grab one stripe to perform the write. The dump() and flush() will lock all stripes, making those operations exclusive agains the event writing ones.
So, a shared lock here is fine. However, it should be a normal shared lock and optional, since we do not want to skip the 'dump' operation just because an event write is in progress. The truth is, this never happens, thanks to the upper level locking in Profiler, but using the shared lock better communicates the intention (and we don't need to check for ownership).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It reads _lock_rec, so it only needs to acquire the shared lock (aka read-lock). Whatever happens in Recorder should be protected there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only needs to acquire the shared lock
Correct. Just not the optional, because we do not want to skip 'dump' because of ongoing recording operation, but rather wait till it's done.
|
I am not sure that following is correct.
While |
| int value; | ||
| // we use relaxed as the compare already offers the guarantees we need | ||
| while ((value = __atomic_load_n(&_lock, __ATOMIC_RELAXED)) <= 0) { | ||
| while ((value = __atomic_load_n(&_lock, __ATOMIC_ACQUIRE)) <= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think acquire order provides anything, because if lock is successfully acquired, the order comes from __sync_bool_compare_and_swap, which is stronger. If fails, relaxed should be fine, since you are not supposed to touch the shared object that you failed to acquire.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but the first read comes before the CAS, and might save us from going into the CAS to begin with. I think it's a minor nuisance and mostly cosmetic change, and I've done it mostly for consistency.
Well yeah, but if there is a concurrent write, then no other read is allowed to happen, and if anybody is reading, then no concurrent write is allowed to happen. Right? |
It is shared lock, so, e.g.
It is a shared lock, so, e.g. |
Let me take a look again. The reader/writer semantic is related to modifying the |
Let's clarify the shared/optional semantics for flush() and dump() operations first
The semantics of the shared lock is that it is safe to read the _rec field, in other words, when we successfully acquire the shared lock, it is guaranteed that no other thread modifies that field. Shared-lock can be blocking: that happens when modification of the field is in progress. Usually, in the common scenario it is non-blocking, though. The semantics of optional lock is that it uses tryLockShared(). If this fails to acquire the lock, then it simply returns, without blocking. When that happens, it means another thread is currently modifying the _rec field. This happens only during start() and stop(). When that happens, we can pessimistically assume that we either have not yet set up a Recorder, or we have already destroyed it. In both cases, it seems ok to not attempt any operations on the Recorder. If we need to guarantee a particular ordering of start() stop() with e.g. dump() or flush(), then this needs to be done elsewhere, IMO. Optional locking is always non-blocking. |
|
/merge |
|
View all feedbacks in Devflow UI.
The expected merge time in
The merge request has been interrupted because the build 0 took longer than expected. The current limit for the base branch 'main' is 120 minutes. |
|
/merge |
|
View all feedbacks in Devflow UI.
The expected merge time in
The merge request has been interrupted because the build 0 took longer than expected. The current limit for the base branch 'main' is 120 minutes. |
What does this PR do?:
This change improves locking around accesses to the Recording* _rec field, and prevents possible use-after-free of that object.
I also made a small utility class OptionalSharedLockGuard, that allows to do try-lock-style locking in a scope.
I also changed the exclusive lock in dump() and flush() to use a shared-lock instead. This should be sufficient and it's also consistent: we use exclusive&blocking lock whenever we modify the _rec field, and we use shared&usually-non-blocking lock whenever we only read that field.
Motivation:
Make accesses to _rec more reliable and prevent use-after-free. See PROF-12990 for details.
How to test the change?:
Existing tests should cover the changes.
For Datadog employees:
credentials of any kind, I've requested a review from
@DataDog/security-design-and-guidance.