AFL++ Fuzzing Tutorial: Setup, Corpus Management, and Coverage-Guided Fuzzing
AFL++ (American Fuzzy Lop Plus Plus) is one of the most widely used coverage-guided fuzzers. It has discovered thousands of vulnerabilities in production software — from image parsers to network protocols to cryptographic libraries. This tutorial walks you through setting up AFL++, instrumenting a target, building an effective corpus, and analyzing the crashes it finds.
What Makes AFL++ Different
AFL++ evolved from the original AFL (developed by Michal Zalewski at Google). It adds:
- Multiple instrumentation backends: LLVM, GCC, QEMU, Unicorn, Frida
- Custom mutators API: Write domain-specific mutation strategies in Python or C
- Persistent mode: Run target in-process without fork overhead (10-20x faster)
- Cmplog: Tracks comparison operands to break through magic byte checks
- MOpt: Adaptive mutation-operator scheduling based on particle swarm optimization
- Parallel fuzzing: Scale across multiple cores with minimal configuration
AFL++ is the standard choice for fuzzing C and C++ code.
Installation
From Package Manager
# Ubuntu/Debian
apt-get install afl++

# macOS (Homebrew)
brew install afl++

# Arch Linux
pacman -S afl++

Building from Source (Recommended)
Building from source ensures the latest features and best performance:
git clone https://github.com/AFLplusplus/AFLplusplus
cd AFLplusplus
make distrib
sudo make install

Verify installation:

afl-fuzz --help 2>&1 | head -5

Instrumenting a Target
AFL++ needs to instrument your target program to observe code coverage. The simplest approach uses the AFL++ compiler wrappers.
Compiling with afl-clang-fast
# For a simple C program
CC=afl-clang-fast CXX=afl-clang-fast++ ./configure --prefix=/tmp/fuzz-install
make -j$(nproc)
make install

# Or directly
afl-clang-fast -o target target.c

afl-clang-fast is the recommended instrumentation method — it uses LLVM's pass-based instrumentation, which is more precise than the original AFL's assembly-level afl-gcc approach.
For programs using CMake:
cmake -DCMAKE_C_COMPILER=afl-clang-fast \
      -DCMAKE_CXX_COMPILER=afl-clang-fast++ \
      ..
make -j$(nproc)

A Simple Target for Learning
Let's create a target with a known vulnerability to demonstrate AFL++:
// target.c — intentionally buggy parser
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef struct {
    char header[4];
    int size;
    char data[64];
} Packet;

void parse_packet(const unsigned char *buf, size_t len) {
    if (len < 8) return;
    Packet pkt;
    memcpy(pkt.header, buf, 4);
    if (memcmp(pkt.header, "PKT!", 4) != 0) return;
    pkt.size = *(int *)(buf + 4);
    // BUG: pkt.size not bounds-checked before memcpy!
    memcpy(pkt.data, buf + 8, pkt.size); // Stack buffer overflow (pkt is a local)
    printf("Packet size: %d\n", pkt.size);
}
int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <input_file>\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;
    unsigned char buf[1024];
    size_t len = fread(buf, 1, sizeof(buf), f);
    fclose(f);
    parse_packet(buf, len);
    return 0;
}

Compile with AFL++ and AddressSanitizer (ASAN):

AFL_USE_ASAN=1 afl-clang-fast -fsanitize=address -o target_fuzz target.c

ASAN makes buffer overflows and use-after-free bugs crash immediately instead of silently corrupting memory. Always use it during fuzzing.
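For contrast, here is a hedged sketch of how the bug could be fixed. The names parse_packet_checked and PKT_DATA_MAX are illustrative, not part of the tutorial target; the size is read in native byte order, just as the buggy version does.

```c
#include <string.h>
#include <stdint.h>
#include <stddef.h>

#define PKT_DATA_MAX 64  /* matches the data[64] field of the example Packet */

/* Returns the number of payload bytes copied, or -1 if the packet is
 * rejected. Unlike the buggy version, pkt.size is validated against both
 * the input length and the destination buffer before the memcpy. */
int parse_packet_checked(const unsigned char *buf, size_t len) {
    if (len < 8) return -1;                      /* too short for magic + size */
    if (memcmp(buf, "PKT!", 4) != 0) return -1;  /* wrong magic */
    int32_t size;
    memcpy(&size, buf + 4, 4);                   /* avoids the unaligned cast */
    if (size < 0) return -1;                     /* reject negative sizes */
    if ((size_t)size > len - 8) return -1;       /* larger than the input */
    if ((size_t)size > PKT_DATA_MAX) return -1;  /* larger than the buffer */
    char data[PKT_DATA_MAX];
    memcpy(data, buf + 8, (size_t)size);
    (void)data;
    return (int)size;
}
```

Each rejected case corresponds to an input class the fuzzer will generate, so the checks double as documentation of the attack surface.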
Building a Seed Corpus
AFL++ starts with a corpus of seed inputs and mutates them. A good corpus:
- Contains valid inputs your parser accepts
- Covers different code paths
- Is small in size (AFL++ prefers many small files over few large ones)
Creating Initial Seeds
mkdir -p corpus_in

# Seed 1: minimal valid packet
printf 'PKT!\x04\x00\x00\x00data' > corpus_in/valid_small

# Seed 2: packet with different data
printf 'PKT!\x08\x00\x00\x00datadata' > corpus_in/valid_medium

# Seed 3: invalid header (will fail validation, but still a useful path)
printf 'NOPE\x04\x00\x00\x00data' > corpus_in/invalid_header

For real targets, collect actual input samples from:
- Protocol captures (Wireshark pcap → individual packets)
- File format samples (valid JPEG files for an image parser)
- Test fixtures from the project's test suite
- Publicly available corpus collections
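Seeds can also be generated programmatically, which avoids shell-escape pitfalls and makes the byte layout explicit. A sketch for the example packet format — write_seed is a hypothetical helper, and it writes the size in native byte order, matching how the tutorial target reads it (so it assumes a little-endian host produces little-endian seeds):

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Write one seed file: "PKT!" magic, 4-byte size, then the payload. */
static int write_seed(const char *path, const void *payload, uint32_t size) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    unsigned char hdr[8] = { 'P', 'K', 'T', '!' };
    memcpy(hdr + 4, &size, 4);  /* native byte order, like *(int *)(buf + 4) */
    int ok = fwrite(hdr, 1, 8, f) == 8 &&
             fwrite(payload, 1, size, f) == size;
    fclose(f);
    return ok ? 0 : -1;
}
```

Calling write_seed("corpus_in/valid_small", "data", 4) reproduces the first printf seed above.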
Minimizing the Corpus
Duplicate inputs waste fuzzer time. Use afl-cmin to reduce your corpus to unique coverage:
afl-cmin -i corpus_in/ -o corpus_min/ -- ./target_fuzz @@

@@ tells AFL++ where to pass the input file path. afl-cmin runs all inputs, discards duplicates with identical coverage, and outputs only the unique ones.
Running AFL++
Basic Single-Core Fuzzing
afl-fuzz -i corpus_min/ -o fuzz_output/ -- ./target_fuzz @@

You'll see the AFL++ TUI — a dashboard showing:
- cycles done: How many times the corpus has been fully processed
- corpus count: Number of test cases in the corpus
- map density: Percentage of the coverage bitmap filled
- saved crashes: Number of unique crash inputs found
- saved hangs: Number of timeout inputs
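The same counters are also written to fuzz_output/&lt;instance&gt;/fuzzer_stats as "key : value" lines. A minimal sketch for pulling one integer field out of that file — stats_field is a hypothetical helper, and field names such as saved_crashes vary between AFL++ versions:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the integer value of `key` in a "key : value" stats file,
 * or -1 if the key is missing or the file cannot be opened. */
static long stats_field(const char *path, const char *key) {
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    long val = -1;
    char line[256];
    while (fgets(line, sizeof line, f)) {
        char *colon = strchr(line, ':');
        if (!colon) continue;
        *colon = '\0';
        /* trim trailing whitespace from the key part */
        for (char *end = colon - 1; end >= line && (*end == ' ' || *end == '\t'); end--)
            *end = '\0';
        if (strcmp(line, key) == 0) {
            val = strtol(colon + 1, NULL, 10);
            break;
        }
    }
    fclose(f);
    return val;
}
```

Something like stats_field("fuzz_output/main/fuzzer_stats", "saved_crashes") could gate a CI job on crashes found.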
Important Flags
# -t: timeout per execution in milliseconds (AFL++ auto-calculates one by default)
afl-fuzz -i corpus/ -o out/ -t 500 -- ./target @@

# -m: memory limit in MB (AFL++ defaults to no limit)
afl-fuzz -i corpus/ -o out/ -m 200 -- ./target @@

# -x: dictionary file with interesting tokens
afl-fuzz -i corpus/ -o out/ -x /usr/share/aflplusplus/dictionaries/http.dict -- ./target @@

# -s: fixed random seed (for reproducibility)
afl-fuzz -i corpus/ -o out/ -s 42 -- ./target @@

Parallel Fuzzing (Multi-Core)
Parallel fuzzing scales throughput almost linearly with core count:
# Main instance (-M flag)
afl-fuzz -i corpus/ -o fuzz_output/ -M main -- ./target @@

# Secondary instances (-S flag, run in separate terminals)
afl-fuzz -i corpus/ -o fuzz_output/ -S worker1 -- ./target @@
afl-fuzz -i corpus/ -o fuzz_output/ -S worker2 -- ./target @@
afl-fuzz -i corpus/ -o fuzz_output/ -S worker3 -- ./target @@

Instances share discoveries through the common output directory. Run roughly one instance per CPU core.
A quick launch script for all cores:
#!/bin/bash
CORES=$(nproc)

# Launch main instance
afl-fuzz -i corpus/ -o fuzz_output/ -M main -- ./target @@ &

# Launch secondary instances
for i in $(seq 1 $((CORES - 1))); do
    afl-fuzz -i corpus/ -o fuzz_output/ -S "worker${i}" -- ./target @@ &
done
wait

Persistent Mode (10-20x Faster)
Persistent mode runs the fuzz target in a loop within a single process, eliminating fork() overhead:
// target_persistent.c — persistent-mode harness
// __AFL_LOOP is defined by afl-clang-fast at compile time; the fallback
// below lets the file also build with a plain compiler for debugging.
#include <unistd.h>
#include <stddef.h>

#ifndef __AFL_LOOP
static int run_once = 1;
#define __AFL_LOOP(x) (run_once--)
#endif

void parse_packet(const unsigned char *buf, size_t len); // from target.c

int main(void) {
    // Run the fuzz target in-process, up to 10000 iterations per process
    while (__AFL_LOOP(10000)) {
        // AFL++ provides input via stdin in this setup
        unsigned char buf[4096];
        ssize_t len = read(0, buf, sizeof(buf));
        if (len <= 0) continue;
        parse_packet(buf, (size_t)len);
    }
    return 0;
}

# Run in persistent mode (stdin input)
afl-fuzz -i corpus/ -o out/ -- ./target_persistent

Using Cmplog to Break Magic Bytes
Many programs have "magic byte" checks that fuzzers struggle with:
if (memcmp(buf, "MAGIC_HEADER\x42\x13", 14) != 0) return;

Randomly mutating inputs to match 14 specific bytes is nearly impossible. AFL++'s Cmplog tracks comparison operands and automatically feeds them back as mutations:

# Compile a second binary with Cmplog instrumentation
AFL_LLVM_CMPLOG=1 afl-clang-fast -o target_cmplog target.c

# Run with Cmplog: -c points at the Cmplog binary
afl-fuzz -i corpus/ -o out/ -c ./target_cmplog -- ./target_fuzz @@

Analyzing Crashes
AFL++ saves crash inputs in fuzz_output/main/crashes/. Each file is an input that caused a crash.
Reproducing a Crash
# Run the target directly with the crash input
./target_fuzz fuzz_output/main/crashes/id:000000,sig:11,src:000001,...
# With ASAN for detailed analysis
ASAN_OPTIONS=symbolize=1 ./target_fuzz fuzz_output/main/crashes/id:000000,...

ASAN output for our buffer overflow bug:
==12345==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x...
WRITE of size 1024 at 0x... thread T0
    #0 0x... in parse_packet target.c:18
    #1 0x... in main target.c:32
SUMMARY: AddressSanitizer: stack-buffer-overflow target.c:18 in parse_packet

Minimizing Crash Inputs
Crash inputs from AFL++ are often large with unnecessary bytes. afl-tmin minimizes them:
afl-tmin -i fuzz_output/main/crashes/id:000000,... -o crash_minimal.bin -- ./target_fuzz @@

The minimized input is easier to understand and turn into a regression test.
Deduplicating Crashes
Multiple crash inputs often represent the same underlying bug:
# afl-collect (from the separate afl-utils project) gathers and deduplicates
afl-collect -r fuzz_output/ crashes/ -- ./target_fuzz @@

Reading AFL++ Stats
Check campaign statistics without the TUI:
cat fuzz_output/main/fuzzer_stats

Key metrics:
run_time          : 3600      (seconds)
cycles_done       : 47        (corpus cycles completed)
corpus_count      : 1847      (unique test cases)
bitmap_cvg        : 23.45%    (% of edges covered)
saved_crashes     : 3         (unique crashes)
saved_hangs       : 0
execs_per_sec     : 2847.32   (target executions/second)

Using Dictionaries
Dictionaries tell AFL++ interesting byte sequences to use in mutations:
# AFL++ ships dictionaries for common formats
ls /usr/share/aflplusplus/dictionaries/
# http.dict html.dict json.dict xml.dict jpeg.dict zip.dict ...

afl-fuzz -i corpus/ -o out/ -x /usr/share/aflplusplus/dictionaries/json.dict -- ./target @@

Create custom dictionaries for domain-specific targets:
# custom.dict
"PKT!"
"MAGIC_BYTES"
"\x00\x01\x02\x03"
"\xff\xfe\xfd\xfc"

Continuous Fuzzing in CI
Running AFL++ in CI catches regressions and extends coverage over time:
# GitHub Actions
- name: Run AFL++ fuzz campaign
run: |
afl-fuzz -i corpus/ -o fuzz_out/ -t 1000 -m 200 \
-- ./target @@ &
AFL_PID=$!
sleep 300 # 5 minutes
kill $AFL_PID
# Fail CI if crashes found
if ls fuzz_out/main/crashes/id:* 2>/dev/null; then
echo "CRASHES FOUND"
exit 1
fi

From Fuzzing to Continuous Monitoring
AFL++ finds crashes in your code under controlled conditions. Production, however, has real users, real data, and runtime conditions a fuzzer can't simulate. A crash-free fuzzing run doesn't mean your application works correctly in production.
HelpMeTest complements AFL++ by running continuous end-to-end tests against your live production environment — verifying correct behavior 24/7. While AFL++ explores edge cases in unit-level code, HelpMeTest monitors whether the full application works for real users. No code required.
Summary
AFL++ is production-grade fuzzing infrastructure with a manageable learning curve. The key steps:
- Compile with afl-clang-fast and AFL_USE_ASAN=1
- Build a minimal seed corpus from real inputs
- Minimize the corpus with afl-cmin
- Run with parallel workers (one per CPU core)
- Use Cmplog (-c) to break through magic byte checks
- Analyze crashes with ASAN symbolization
- Minimize crash inputs with afl-tmin
- Add crash inputs as regression tests
Give AFL++ a few hours on any C/C++ parser that handles untrusted input. The bugs it finds will surprise you.