As of today, syzkaller is able to detect the ReadOnlyCoverage feature, but it is still not enabled by default. I think it should be enabled by setting:
diff --git a/pkg/fuzzer/fuzzer.go b/pkg/fuzzer/fuzzer.go
index 0c0119e71..ded7e9213 100644
--- a/pkg/fuzzer/fuzzer.go
+++ b/pkg/fuzzer/fuzzer.go
@@ -469,6 +469,7 @@ func DefaultExecOpts(cfg *mgrconfig.Config, features flatrpc.Feature, debug bool
panic(fmt.Sprintf("failed to parse sandbox: %v", err))
}
env |= sandbox
+ env |= flatrpc.ExecEnvReadOnlyCoverage
exec := flatrpc.ExecFlagThreaded
if !cfg.RawCover {
But there is an important invariant that is violated if we do so: namely, KCOV_ENABLE and KCOV_REMOTE_ENABLE pin the kcov instance to the current thread, making it impossible to change it from other threads.
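To illustrate why this matters, here is a minimal sketch (not syzkaller code) of the two reset paths. With a writable kcov mapping a reset is just a store into the shared buffer, which any thread can perform; with read-only coverage the reset has to go through an ioctl, and the kernel only accepts kcov ioctls from the thread that enabled the instance. The KCOV_RESET_TRACE ioctl and its number below are assumptions based on how a read-only coverage mode would have to expose resets; the real definitions live in the kernel/executor headers.

```c++
#include <stdint.h>
#include <sys/ioctl.h>

#ifndef KCOV_RESET_TRACE
#define KCOV_RESET_TRACE _IO('c', 104) // illustrative value only
#endif

struct cover_t {
	int fd;         // /sys/kernel/debug/kcov
	uint64_t* data; // mmap'ed coverage buffer; data[0] holds the PC count
};

// Writable mapping: a reset is a plain store into shared memory, so any
// thread can do it, no matter which thread called KCOV_ENABLE.
static void cover_reset_writable(cover_t* cov)
{
	__atomic_store_n(&cov->data[0], (uint64_t)0, __ATOMIC_RELAXED);
}

// Read-only mapping: the only way to reset is an ioctl, and the kernel ties
// kcov ioctls to the thread that enabled the instance, so calling this from
// any other thread fails. This is the invariant broken by some of the resets
// listed below.
static int cover_reset_readonly(cover_t* cov)
{
	return ioctl(cov->fd, KCOV_RESET_TRACE, 0);
}
```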
Currently there are three types of resets performed by the executor:
- Resetting the current thread's coverage in execute_call() (executor/executor.cc, line 1561 in 4bd24a3): cover_reset(&th->cov);
- Resetting another thread's coverage in schedule_call() (executor/executor.cc, line 1211 in 4bd24a3): cover_reset(&th->cov);
- Resetting the extra coverage in execute_one() (executor/executor.cc, line 963 in 4bd24a3) and in write_extra_output() (executor/executor.cc, line 1450 in 4bd24a3): cover_reset(&extra_cov);
The first case is perfectly safe. The second one can be easily fixed by asking the target thread to reset the coverage before executing the call:
@@ -1164,8 +1169,9 @@ thread_t* schedule_call(int call_index, int call_num, uint64 copyout_index, uint
// which overlaps with comparison type in kernel exposed records. As the result write_comparisons
// that will try to write out data from unfinished syscalls will see these rpc::ComparisonRaw records,
// mis-interpret PC as type, and fail as: SYZFAIL: invalid kcov comp type (type=ffffffff8100b4e0).
- if (flag_coverage)
- cover_reset(&th->cov);
+ if (flag_coverage) {
+ th->want_reset_cov = true;
+ }
th->executing = true;
th->call_index = call_index;
th->call_num = call_num;
@@ -1487,6 +1494,10 @@ void* worker_thread(void* arg)
// because in snapshot mode we don't know coverage mode for precreated threads.
if (first && cover_collection_required())
cover_enable(&th->cov, flag_comparisons, false);
+ if (!first && th->want_reset_cov) {
+ cover_reset(&th->cov);
+ th->want_reset_cov = false;
+ }
execute_call(th);
event_set(&th->done);
}
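The diff above also assumes a new per-thread flag; a minimal sketch of where it would live (the field name want_reset_cov is hypothetical, not an existing executor field):

```c++
// Sketch only: the flag would sit next to the other per-thread state in
// thread_t in executor/executor.cc.
struct thread_t {
	// ... existing fields (cov, executing, call_index, call_num, ...) ...
	bool want_reset_cov; // set by schedule_call(), consumed by worker_thread()
};
```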
Now we are down to a single case: the worker thread (?) requesting a reset of extra_cov, which was created by the main thread.
As discussed with @dvyukov at https://lwn.net/ml/all/CACT4Y+bmmYuCbV6g9yk8aFZdzhGhct3K78ii6voHR4KAa6oE0g@mail.gmail.com/, one way to deal with this would be to allow resetting the remote coverage under a lock. This is still somewhat wrong conceptually though, and may lead to subtle bugs if the owner thread also tries to modify its kcov instance at the same time.
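One way to read that option, as a sketch: if the kernel (or a new ioctl) were changed to accept resets of the remote coverage from non-owner threads, the executor side would still have to serialize them against the owner. None of this exists today; the mutex and function below are purely illustrative.

```c++
#include <pthread.h>

struct cover_t;                 // the executor's coverage descriptor
void cover_reset(cover_t* cov); // the executor's reset helper

static pthread_mutex_t extra_cov_mtx = PTHREAD_MUTEX_INITIALIZER;

static void extra_cov_reset_locked(cover_t* extra_cov)
{
	pthread_mutex_lock(&extra_cov_mtx);
	// Even with the lock, the owner thread may be concurrently using the same
	// kcov instance, which is where the subtle bugs would come from.
	cover_reset(extra_cov);
	pthread_mutex_unlock(&extra_cov_mtx);
}
```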
Another possibility is to refactor syzkaller in such a way that extra_cov is always pinned to a separate worker thread that is responsible for the extra coverage.
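A rough sketch of what that refactoring could look like, reusing the existing executor helpers (cover_t, cover_enable, cover_reset, event_t and the event_* functions) and therefore meant to live in executor/executor.cc; the structure and function below are hypothetical:

```c++
// A dedicated thread enables the extra/remote coverage and is the only one
// that ever resets it; other threads merely request a reset and wait.
struct extra_cov_thread_t {
	cover_t cov;        // the extra coverage, enabled from this thread
	event_t reset_req;  // signaled by threads that want extra_cov reset
	event_t reset_done; // signaled after the reset has been performed
};

static void* extra_cov_thread(void* arg)
{
	extra_cov_thread_t* ctx = (extra_cov_thread_t*)arg;
	cover_enable(&ctx->cov, flag_comparisons, /*extra=*/true);
	for (;;) {
		event_wait(&ctx->reset_req);
		event_reset(&ctx->reset_req);
		cover_reset(&ctx->cov); // legal: this thread owns the kcov instance
		event_set(&ctx->reset_done);
	}
	return nullptr;
}
```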
It still puzzles me how exactly the extra coverage is supposed to work. As far as I can tell, write_extra_output() is called upon completion of every worker thread, attributing portions of the extra coverage to individual syscalls. However, this coverage comes from different background threads, so that attribution may be incorrect. If we actually do not care about that attribution, collecting the extra coverage for the whole program might be an easier solution. Otherwise, if we want to preserve the attribution, we can still copy chunks of the extra coverage buffer in write_extra_output(), and reset the whole buffer only once per program execution.
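A sketch of the latter option (assuming, as the kcov interface does, that the first word of the buffer holds the number of recorded entries; the cursor variable and function name are hypothetical, and the code is meant to live next to write_extra_output() in executor/executor.cc):

```c++
// Remember how much of the extra coverage buffer has already been attributed
// and emit only the new tail; the single cover_reset(&extra_cov) then happens
// once per program execution, on the thread that owns the kcov instance.
static uint64 extra_cov_consumed; // entries already attributed to earlier calls

static void write_extra_output_chunk(cover_t* cov)
{
	uint64 total = *(cover_data_t*)cov->data; // kcov's count of recorded entries
	if (total <= extra_cov_consumed)
		return; // nothing new since the previous call completed
	cover_data_t* pcs = (cover_data_t*)cov->data + 1 + extra_cov_consumed;
	uint64 count = total - extra_cov_consumed;
	// ... write `count` PCs starting at `pcs` to the output region, attributed
	// to the syscall that just completed ...
	extra_cov_consumed = total;
}
```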