==================
american fuzzy lop
==================
Written and maintained by Michal Zalewski
Copyright 2013, 2014, 2015, 2016 Google Inc. All rights reserved.
Released under terms and conditions of Apache License, Version 2.0.
For new versions and additional information, check out:
http://lcamtuf.coredump.cx/afl/
To compare notes with other users or get notified about major new features,
send a mail to .
** See QuickStartGuide.txt if you don't have time to read this file. **
1) Challenges of guided fuzzing
-------------------------------
Fuzzing is one of the most powerful and proven strategies for identifying
security issues in real-world software; it is responsible for the vast
majority of remote code execution and privilege escalation bugs found to date
in security-critical software.
Unfortunately, fuzzing is also relatively shallow; blind, random mutations
make it very unlikely to reach certain code paths in the tested code, leaving
some vulnerabilities firmly outside the reach of this technique.
There have been numerous attempts to solve this problem. One of the early
approaches - pioneered by Tavis Ormandy - is corpus distillation. The method
relies on coverage signals to select a subset of interesting seeds from a
massive, high-quality corpus of candidate files (e.g., from a web crawl),
and then fuzz them by traditional means. The approach works exceptionally
well, but requires such a corpus to be readily available. In addition, block
coverage measurements provide only a very simplistic understanding of program
state, and are less useful for guiding the fuzzing effort in the long haul.
Other, more sophisticated research has focused on techniques such as program
flow analysis ("concolic execution"), symbolic execution, or static analysis.
All these methods are promising in experimental settings, but tend to suffer
from reliability and performance problems in practical uses - and currently
do not offer a viable alternative to "dumb" fuzzing techniques.
2) The afl-fuzz approach
------------------------
American Fuzzy Lop is a brute-force fuzzer coupled with an exceedingly simple
but rock-solid instrumentation-guided genetic algorithm. It uses a modified
form of edge coverage to effortlessly pick up subtle, local-scale changes to
program control flow as they occur in response to random modifications injected
into the input data. It then selects these new-coverage test cases as the seed
for subsequent fuzzing rounds.
Simplifying a bit, the overall algorithm can be summed up as:
1) Load user-supplied initial test cases into the queue,
2) Take next input file from the queue,
3) Attempt to trim the test case to the smallest size that doesn't alter
the measured behavior of the program,
4) Repeatedly mutate the file using a balanced and well-researched variety
of traditional fuzzing strategies,
5) If any of the generated mutations resulted in a new state transition
recorded by the instrumentation, add mutated output as a new entry in the
queue.
6) Go to 2.
The discovered test cases are also periodically culled to eliminate ones that
have been obsoleted by newer, higher-coverage finds; and undergo several other
instrumentation-driven effort minimization steps.
As a side result of the fuzzing process, the tool creates a small,
self-contained corpus of interesting test cases. These are extremely useful
for seeding other, labor- or resource-intensive testing regimes - for example,
for stress-testing browsers, office applications, graphics suites, or
closed-source tools.
The fuzzer is thoroughly tested to deliver out-of-the-box performance far
superior to blind fuzzing or coverage-only tools.
3) Instrumenting programs for use with AFL
------------------------------------------
When source code is available, instrumentation can be injected by a companion
tool that works as a drop-in replacement for gcc or clang in any standard build
process for third-party code.
The instrumentation has a fairly modest performance impact; in conjunction with
other optimizations implemented by afl-fuzz, most programs can be fuzzed as fast
or even faster than possible with traditional tools.
The correct way to recompile the target program may vary depending on the
specifics of the build process, but a nearly-universal approach would be:
$ CC=/path/to/afl/afl-gcc ./configure
$ make clean all
For C++ programs, you'd would also want to set CXX=/path/to/afl/afl-g++.
The clang wrappers (afl-clang and afl-clang++) can be used in the same way;
clang users may also opt to leverage a higher-performance instrumentation mode,
as described in llvm_mode/README.llvm.
When testing libraries, you need to find or write a simple program that reads
data from stdin or from a file and passes it to the tested library. In such a
case, it is essential to link this executable against a static version of the
instrumented library, or to make sure that the correct .so file is loaded at
runtime (usually by setting LD_LIBRARY_PATH). The simplest option is a static
build, usually possible via:
$ CC=/path/to/afl/afl-gcc ./configure --disable-shared
Setting AFL_HARDEN=1 when calling 'make' will cause the CC wrapper to
automatically enable code hardening options that make it easier to detect
simple memory bugs. Libdislocator, a helper library included with AFL (see
libdislocator/README.dislocator) can help uncover heap corruption issues, too.
PS. ASAN users are advised to review notes_for_asan.txt file for important
caveats.
4) Instrumenting binary-only apps
---------------------------------
When source code is *NOT* available, the fuzzer offers experimental support for
fast, on-the-fly instrumentation of black-box binaries. This is accomplished
with a version of QEMU running in the lesser-known "user space emulation" mode.
QEMU is a project separate from AFL, but you can conveniently build the
feature by doing:
$ cd qemu_mode
$ ./build_qemu_support.sh
For additional instructions and caveats, see qemu_mode/README.qemu.
The mode is approximately 2-5x slower than compile-time instrumentation, is
less conductive to parallelization, and may have some other quirks.
5) Choosing initial test cases
------------------------------
To operate correctly, the fuzzer requires one or more starting file that
contains a good example of the input data normally expected by the targeted
application. There are two basic rules:
- Keep the files small. Under 1 kB is ideal, although not strictly necessary.
For a discussion of why size matters, see perf_tips.txt.
- Use multiple test cases only if they are functionally different from
each other. There is no point in using fifty different vacation photos
to fuzz an image library.
You can find many good examples of starting files in the testcases/ subdirectory
that comes with this tool.
PS. If a large corpus of data is available for screening, you may want to use
the afl-cmin utility to identify a subset of functionally distinct files that
exercise different code paths in the target binary.
6) Fuzzing binaries
-------------------
The fuzzing process itself is carried out by the afl-fuzz utility. This program
requires a read-only directory with initial test cases, a separate place to
store its findings, plus a path to the binary to test.
For target binaries that accept input directly from stdin, the usual syntax is:
$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program [...params...]
For programs that take input from a file, use '@@' to mark the location in
the target's command line where the input file name should be placed. The
fuzzer will substitute this for you:
$ ./afl-fuzz -i testcase_dir -o findings_dir /path/to/program @@
You can also use the -f option to have the mutated data written to a specific
file. This is useful if the program expects a particular file extension or so.
Non-instrumented binaries can be fuzzed in the QEMU mode (add -Q in the command
line) or in a traditional, blind-fuzzer mode (specify -n).
You can use -t and -m to override the default timeout and memory limit for the
executed process; rare examples of targets that may need these settings touched
include compilers and video decoders.
Tips for optimizing fuzzing performance are discussed in perf_tips.txt.
Note that afl-fuzz starts by performing an array of deterministic fuzzing
steps, which can take several days, but tend to produce neat test cases. If you
want quick & dirty results right away - akin to zzuf and other traditional
fuzzers - add the -d option to the command line.
7) Interpreting output
----------------------
See the status_screen.txt file for information on how to interpret the
displayed stats and monitor the health of the process. Be sure to consult this
file especially if any UI elements are highlighted in red.
The fuzzing process will continue until you press Ctrl-C. At minimum, you want
to allow the fuzzer to complete one queue cycle, which may take anywhere from a
couple of hours to a week or so.
There are three subdirectories created within the output directory and updated
in real time:
- queue/ - test cases for every distinctive execution path, plus all the
starting files given by the user. This is the synthesized corpus
mentioned in section 2.
Before using this corpus for any other purposes, you can shrink
it to a smaller size using the afl-cmin tool. The tool will find
a smaller subset of files offering equivalent edge coverage.
- crashes/ - unique test cases that cause the tested program to receive a
fatal signal (e.g., SIGSEGV, SIGILL, SIGABRT). The entries are
grouped by the received signal.
- hangs/ - unique test cases that cause the tested program to time out. The
default time limit before something is classified as a hang is
the larger of 1 second and the value of the -t parameter.
The value can be fine-tuned by setting AFL_HANG_TMOUT, but this
is rarely necessary.
Crashes and hangs are considered "unique" if the associated execution paths
involve any state transitions not seen in previously-recorded faults. If a
single bug can be reached in multiple ways, there will be some count inflation
early in the process, but this should quickly taper off.
The file names for crashes and hangs are correlated with parent, non-faulting
queue entries. This should help with debugging.
When you can't reproduce a crash found by afl-fuzz, the most likely cause is
that you are not setting the same memory limit as used by the tool. Try:
$ LIMIT_MB=50
$ ( ulimit -Sv $[LIMIT_MB .
There is also a mailing list for the project; to join, send a mail to
. Or, if you prefer to browse
archives first, try:
https://groups.google.com/group/afl-users
PS. If you wish to submit raw code to be incorporated into the project, please
be aware that the copyright on most of AFL is claimed by Google. While you do
retain copyright on your contributions, they do ask people to agree to a simple
CLA first:
https://cla.developers.google.com/clas
Sorry about the hassle. Of course, no CLA is required for feature requests or
bug reports.