Skip to content

executor: use KCOV_RESET_TRACE where applicable #6262

@ramosian-glider

Description

@ramosian-glider

As of today, syzkaller is able to detect the ReadOnlyCoverage feature, but it is still not enabled by default.

I think this should be done by setting:

diff --git a/pkg/fuzzer/fuzzer.go b/pkg/fuzzer/fuzzer.go
index 0c0119e71..ded7e9213 100644
--- a/pkg/fuzzer/fuzzer.go
+++ b/pkg/fuzzer/fuzzer.go
@@ -469,6 +469,7 @@ func DefaultExecOpts(cfg *mgrconfig.Config, features flatrpc.Feature, debug bool
                panic(fmt.Sprintf("failed to parse sandbox: %v", err))
        }
        env |= sandbox
+       env |= flatrpc.ExecEnvReadOnlyCoverage
 
        exec := flatrpc.ExecFlagThreaded
        if !cfg.RawCover {

But there is an important invariant that is violated if we do so: namely, KCOV_ENABLE and KCOV_REMOTE_ENABLE pin the kcov instance to the current thread, making it impossible to change it from other threads.

Currently there are three types of resets performed by the executor:

  1. Resetting the current thread's coverage in execute_call() at
    cover_reset(&th->cov);
    .
  2. Resetting another thread's coverage in schedule_call() at
    cover_reset(&th->cov);
    before scheduling a call on it.
  3. Resetting the extra coverage in execute_one() at
    cover_reset(&extra_cov);
    and write_extra_output()
    cover_reset(&extra_cov);

The first case is perfectly safe. The second one can be easily fixed by asking the target thread to reset the coverage before executing the call:

@@ -1164,8 +1169,9 @@ thread_t* schedule_call(int call_index, int call_num, uint64 copyout_index, uint
        // which overlaps with comparison type in kernel exposed records. As the result write_comparisons
        // that will try to write out data from unfinished syscalls will see these rpc::ComparisonRaw records,
        // mis-interpret PC as type, and fail as: SYZFAIL: invalid kcov comp type (type=ffffffff8100b4e0).
-       if (flag_coverage)
-               cover_reset(&th->cov);
+       if (flag_coverage) {
+               th->want_reset_cov = true;
+       }
        th->executing = true;
        th->call_index = call_index;
        th->call_num = call_num;
@@ -1487,6 +1494,10 @@ void* worker_thread(void* arg)
                // because in snapshot mode we don't know coverage mode for precreated threads.
                if (first && cover_collection_required())
                        cover_enable(&th->cov, flag_comparisons, false);
+               if (!first && th->want_reset_cov) {
+                       cover_reset(&th->cov);
+                       th->want_reset_cov = false;
+               }
                execute_call(th);
                event_set(&th->done);
        }

Now we are down to a single case of the worker thread (?) requesting a reset of extra_cov created by the main thread.

As discussed with @dvyukov at https://lwn.net/ml/all/CACT4Y+bmmYuCbV6g9yk8aFZdzhGhct3K78ii6voHR4KAa6oE0g@mail.gmail.com/, one way to deal with this would be to allow resetting the remote coverage under a lock. This is still somewhat wrong conceptually though, and may lead to subtle bugs if the owner thread also tries to modify its kcov instance at the same time.

Another possibility is to refactor syzkaller in a way that extra_cov is always pinned to a separate worker thread which is responsible for that extra coverage.

It is still puzzling me how exactly extra coverage is supposed to work. As far as I can tell, write_extra_coverage() is called upon completion of every worker thread, attributing portions of the extra coverage to individual syscalls. However, this coverage is coming from different background threads, so that attribution may be incorrect. If we actually do not care about that attribution, collecting the extra coverage for the whole program might be an easier solution.

Otherwise, if we want to preserve the attribution, we can still copy chunks of the extra coverage buffer in write_extra_coverage(), and reset the whole buffer only once per program execution.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions