This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Add fuzzing coverage support
- From: Dmitry Vyukov <dvyukov at google dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>, Jakub Jelinek <jakub at redhat dot com>, Yuri Gribov <tetra2005 at gmail dot com>
- Cc: Kostya Serebryany <kcc at google dot com>, Alexander Potapenko <glider at google dot com>, Andrey Ryabinin <ryabinin dot a dot a at gmail dot com>, Jiri Slaby <jslaby at suse dot cz>, Quentin Casasnovas <quentin dot casasnovas at oracle dot com>, Sasha Levin <sasha dot levin at oracle dot com>, syzkaller <syzkaller at googlegroups dot com>
- Date: Wed, 2 Dec 2015 17:10:00 +0100
- Subject: Re: Add fuzzing coverage support
- Authentication-results: sourceware.org; auth=none
- References: <CACT4Y+ZW4oRSD0TpEubnZjHmLsgVy_A4MQvUxZHWYgxxj_CCcA at mail dot gmail dot com> <CACT4Y+YB5N-wi7U=HiLJgpBMt5fnZ1F5k4mKUy1TY-fd_ji81Q at mail dot gmail dot com>
ping
Number of bugs found with this coverage in kernel already crossed 40:
https://github.com/google/syzkaller/wiki/Found-Bugs
On Fri, Nov 27, 2015 at 3:30 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> +syzkaller group
>
> On Fri, Nov 27, 2015 at 3:28 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> Hello,
>>
>> This patch adds support for coverage-guided fuzzing:
>> https://codereview.appspot.com/280140043
>>
>> Coverage-guided fuzzing is a powerful randomized testing technique
>> that uses coverage feedback to determine new interesting inputs to a
>> program. Some examples of the systems that use it are AFL
>> (http://lcamtuf.coredump.cx/afl/) and LibFuzzer
>> (http://llvm.org/docs/LibFuzzer.html). Compiler coverage
>> instrumentation allows to make fuzzing more efficient as compared to
>> binary instrumentation.
>>
>> Flag that enables coverage is named -fsanitize-coverage=trace-pc to
>> make it consistent with similar functionality in clang, which already
>> supports a set of fuzzing coverage modes:
>> http://clang.llvm.org/docs/SanitizerCoverage.html
>> -fsanitize-coverage=trace-pc is not yet supported in clang, but we
>> plan to add it.
>>
>> This particular coverage mode simply inserts function calls into every
>> basic block.
>> I've built syzkaller, a Linux system call fuzzer
>> (https://github.com/google/syzkaller), using this functionality. The
>> fuzzer has found 30+ previously unknown bugs in kernel
>> (https://github.com/google/syzkaller/wiki/Found-Bugs) in slightly more
>> than a month (while kernel was extensively fuzzed with trinity --
>> non-guided fuzzer). Quentin also built some kernel fuzzer on top of
>> it.
>>
>> Why not gcov. Typical fuzzing loop looks as follows: (1) reset
>> coverage, (2) execute a bit of code, (3) collect coverage, repeat. A
>> typical coverage can be just a dozen of basic blocks (e.g. invalid
>> input). In such context gcov becomes prohibitively expensive as
>> reset/collect coverage steps depend on total number of basic
>> blocks/edges in program (in case of kernel it is about 2M). Cost of
>> this "tracing" mode depends on number of executed basic blocks/edges.
>> On top of that, kernel required per-thread coverage because there are
>> always background threads and unrelated processes that also produce
>> coverage. With inlined gcov instrumentation per-thread coverage is not
>> possible.
>> Inlined fast paths do not make lots of sense in fuzzing scenario,
>> because lots of code is executed just once in between resets. Also it
>> is not really possible to inline accesses to per-cpu and per-task data
>> structures for kernel.
>>
>> OK for trunk?