
AFL++ Fuzzing Tutorial: Setup, Corpus Management, and Coverage-Guided Fuzzing

HelpMeTest

16 May 2026 — 7 min read

AFL++ (American Fuzzy Lop Plus Plus) is one of the most widely used coverage-guided fuzzers. It has been used to discover thousands of vulnerabilities in production software, from image parsers to network protocols to cryptographic libraries. This tutorial walks you through setting up AFL++, instrumenting a target, building an effective corpus, and analyzing the crashes it finds.

What Makes AFL++ Different

AFL++ evolved from the original AFL (developed by Michal Zalewski at Google). It adds:

Multiple instrumentation backends: LLVM, GCC, QEMU, Unicorn, Frida
Custom mutators API: Write domain-specific mutation strategies in Python or C
Persistent mode: Run target in-process without fork overhead (10-20x faster)
Cmplog: Tracks comparison operands to break through magic byte checks
MOpt: Mutation scheduling driven by particle swarm optimization
Parallel fuzzing: Scale across multiple cores with minimal configuration

AFL++ is the standard choice for fuzzing C and C++ code.

Installation

From Package Manager

# Ubuntu/Debian
apt-get install afl++

# macOS (Homebrew)
brew install afl++

# Arch Linux
pacman -S afl++

Building from Source (Recommended)

Building from source ensures the latest features and best performance:

git clone https://github.com/AFLplusplus/AFLplusplus
cd AFLplusplus
make distrib
sudo make install

Verify installation:

afl-fuzz --help 2>&1 | head -5

Instrumenting a Target

AFL++ needs to instrument your target program to observe code coverage. The simplest approach uses the AFL++ compiler wrappers.

Compiling with afl-clang-fast

# For a project with a configure script (autotools)
CC=afl-clang-fast CXX=afl-clang-fast++ ./configure --prefix=/tmp/fuzz-install
make -j$(nproc)
make install

# Or directly, for a single source file
afl-clang-fast -o target target.c

afl-clang-fast is the recommended instrumentation method. It instruments the program with an LLVM compiler pass, which is more precise than the assembly-rewriting afl-gcc wrapper from the original AFL.

For programs using CMake:

cmake -DCMAKE_C_COMPILER=afl-clang-fast \
      -DCMAKE_CXX_COMPILER=afl-clang-fast++ \
      ..
make -j$(nproc)
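
To make "instrumenting for coverage" concrete, here is a minimal sketch of the classic AFL-style edge counting that this kind of instrumentation is built around. It is not the code afl-clang-fast actually emits; `MAP_SIZE`, `coverage_map`, `on_basic_block`, and the block IDs are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE 65536u  /* assumed map size for this sketch */

static uint8_t  coverage_map[MAP_SIZE];
static uint32_t prev_location;

/* Called at the start of every basic block; cur_location is a
   per-block ID assigned at compile time. */
static void on_basic_block(uint32_t cur_location) {
    /* XOR of previous and current block IDs identifies the edge taken. */
    coverage_map[(cur_location ^ prev_location) % MAP_SIZE]++;
    /* Shift so that A->B and B->A land in different slots. */
    prev_location = cur_location >> 1;
}

/* Reset before each execution of the target. */
static void reset_coverage(void) {
    memset(coverage_map, 0, sizeof(coverage_map));
    prev_location = 0;
}

/* Count how many distinct map slots were hit in the last execution. */
static size_t edges_hit(void) {
    size_t n = 0;
    for (size_t i = 0; i < MAP_SIZE; i++)
        if (coverage_map[i]) n++;
    return n;
}
```

The fuzzer compares this map across executions: an input that lights up a slot no earlier input reached is considered interesting and kept in the corpus.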

A Simple Target for Learning

Let's create a target with a known vulnerability to demonstrate AFL++:

// target.c — intentionally buggy parser
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef struct {
    char header[4];
    int size;
    char data[64];
} Packet;

void parse_packet(const unsigned char *buf, size_t len) {
    if (len < 8) return;
    
    Packet pkt;
    memcpy(pkt.header, buf, 4);
    
    if (memcmp(pkt.header, "PKT!", 4) != 0) return;
    
    pkt.size = *(int *)(buf + 4);
    
    // BUG: pkt.size is not bounds-checked before memcpy!
    memcpy(pkt.data, buf + 8, pkt.size);  // Stack buffer overflow (pkt is a local variable)
    
    printf("Packet size: %d\n", pkt.size);
}

int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <input_file>\n", argv[0]);
        return 1;
    }
    
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;
    
    unsigned char buf[1024];
    size_t len = fread(buf, 1, sizeof(buf), f);
    fclose(f);
    
    parse_packet(buf, len);
    return 0;
}

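Compile with AFL++ and AddressSanitizer (ASAN):

AFL_USE_ASAN=1 afl-clang-fast -g -o target_fuzz target.c

AFL_USE_ASAN=1 makes the wrapper enable AddressSanitizer itself, and -g keeps file and line information in ASAN reports. ASAN makes buffer overflows and use-after-free bugs crash immediately instead of silently corrupting memory. It slows execution noticeably, so use it for every small campaign and, in larger parallel campaigns, on at least one instance.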

Building a Seed Corpus

AFL++ starts with a corpus of seed inputs and mutates them. A good corpus:

Contains valid inputs your parser accepts
Covers different code paths
Is small in size (AFL++ prefers many small files over few large ones)

Creating Initial Seeds

mkdir -p corpus_in

# Seed 1: minimal valid packet
printf 'PKT!\x04\x00\x00\x00data' > corpus_in/valid_small

# Seed 2: packet with different data
printf 'PKT!\x08\x00\x00\x00datadata' > corpus_in/valid_medium

# Seed 3: invalid header (will fail validation, but still a useful path)
printf 'NOPE\x04\x00\x00\x00data' > corpus_in/invalid_header
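
The `\x04\x00\x00\x00` bytes in these seeds are the packet size as a 32-bit little-endian integer, which is how the example parser reads it on x86-64. As a hedged sketch of the same layout in C, the helper below builds such a packet in memory; `build_packet` is a hypothetical name, not part of AFL++ or the target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes "PKT!" + 32-bit little-endian size + payload into out.
   Returns the number of bytes written, or 0 if out is too small. */
static size_t build_packet(uint8_t *out, size_t out_cap,
                           const uint8_t *payload, uint32_t payload_len) {
    size_t total = 8u + (size_t)payload_len;
    if (out_cap < total) return 0;

    memcpy(out, "PKT!", 4);
    out[4] = (uint8_t)(payload_len & 0xff);          /* least significant byte first */
    out[5] = (uint8_t)((payload_len >> 8) & 0xff);
    out[6] = (uint8_t)((payload_len >> 16) & 0xff);
    out[7] = (uint8_t)((payload_len >> 24) & 0xff);
    if (payload_len) memcpy(out + 8, payload, payload_len);
    return total;
}
```

Writing the returned bytes to a file with fwrite() gives a seed equivalent to the printf commands above, which is handy when seeds need computed fields such as lengths or checksums.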

For real targets, collect actual input samples from:

Protocol captures (Wireshark pcap → individual packets)
File format samples (valid JPEG files for an image parser)
Test fixtures from the project's test suite
Publicly available corpus collections

Minimizing the Corpus

Duplicate inputs waste fuzzer time. Use afl-cmin to reduce your corpus to unique coverage:

afl-cmin -i corpus_in/ -o corpus_min/ -- ./target_fuzz @@

@@ tells AFL++ where to pass the input file path. afl-cmin runs all inputs, discards duplicates with identical coverage, and outputs only the unique ones.
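
The idea behind coverage-based minimization can be sketched in a few lines of C. This is a simplified greedy pass, not afl-cmin's actual algorithm (which also considers hit counts and prefers the smallest file for each coverage entry); `minimize_corpus` and the tiny `MAP_SIZE` are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE 64u  /* tiny map, for illustration only */

/* Sets keep[i] = true for each input whose coverage map touches at
   least one slot that no earlier kept input touched.
   Returns the number of inputs kept. */
static size_t minimize_corpus(const uint8_t maps[][MAP_SIZE], size_t n_inputs,
                              bool *keep) {
    uint8_t seen[MAP_SIZE] = {0};
    size_t kept = 0;

    for (size_t i = 0; i < n_inputs; i++) {
        bool adds_coverage = false;
        for (size_t j = 0; j < MAP_SIZE; j++) {
            if (maps[i][j] && !seen[j]) {
                adds_coverage = true;
                seen[j] = 1;
            }
        }
        keep[i] = adds_coverage;
        if (adds_coverage) kept++;
    }
    return kept;
}
```

Given three inputs where the second covers exactly the same slots as the first, only the first and third survive, which mirrors what happens to duplicate seeds in corpus_in/.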

Running AFL++

Basic Single-Core Fuzzing

afl-fuzz -i corpus_min/ -o fuzz_output/ -- ./target_fuzz @@

You'll see the AFL++ TUI — a dashboard showing:

cycles done: How many times the corpus has been fully processed
corpus count: Number of test cases in the corpus
map density: Share of the coverage bitmap that has been hit
saved crashes: Number of unique crash inputs found
saved hangs: Number of unique inputs that timed out

Important Flags

# -t: timeout per execution (milliseconds, default 1000ms)
afl-fuzz -i corpus/ -o out/ -t 500 -- ./target @@

# -m: memory limit in MB (unlike the original AFL, current AFL++ applies no limit by default)
afl-fuzz -i corpus/ -o out/ -m 200 -- ./target @@

# -x: dictionary file with interesting tokens (the dictionary directory depends on how AFL++ was installed)
afl-fuzz -i corpus/ -o out/ -x /usr/share/aflplusplus/dictionaries/http.dict -- ./target @@

# -s: fixed random seed (for reproducibility)
afl-fuzz -i corpus/ -o out/ -s 42 -- ./target @@

Parallel Fuzzing (Multi-Core)

Parallel fuzzing scales throughput roughly with the number of cores you give it:

# Main instance (-M flag)
afl-fuzz -i corpus/ -o fuzz_output/ -M main -- ./target @@

# Secondary instances (-S flag, run in separate terminals)
afl-fuzz -i corpus/ -o fuzz_output/ -S worker1 -- ./target @@
afl-fuzz -i corpus/ -o fuzz_output/ -S worker2 -- ./target @@
afl-fuzz -i corpus/ -o fuzz_output/ -S worker3 -- ./target @@

Secondary instances share discoveries with the main instance. Run as many as you have CPU cores.

A quick launch script for all cores:

#!/bin/bash
CORES=$(nproc)

# Launch main instance
afl-fuzz -i corpus/ -o fuzz_output/ -M main -- ./target @@ &

# Launch secondary instances
for i in $(seq 1 $((CORES - 1))); do
    afl-fuzz -i corpus/ -o fuzz_output/ -S "worker${i}" -- ./target @@ &
done

wait

Persistent Mode (10-20x Faster)

Persistent mode runs the fuzz target in a loop within a single process, eliminating fork() overhead:

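// target_persistent.c
// Build with afl-clang-fast, which provides __AFL_LOOP; no AFL++ header is needed.
// Example (parser.c is a hypothetical file holding parse_packet() without main()):
//   afl-clang-fast -o target_persistent target_persistent.c parser.c
#include <stddef.h>
#include <unistd.h>

void parse_packet(const unsigned char *buf, size_t len);

int main(int argc, char **argv) {
    (void)argc;
    (void)argv;

    // __AFL_LOOP: handle up to 10000 inputs in this process before it is restarted
    while (__AFL_LOOP(10000)) {
        // Read the test case from stdin (no @@ on the afl-fuzz command line)
        unsigned char buf[4096];
        ssize_t len = read(0, buf, sizeof(buf));
        if (len <= 0) continue;

        parse_packet(buf, (size_t)len);
    }
    return 0;
}

# Run in persistent mode (stdin input)
afl-fuzz -i corpus/ -o out/ -- ./target_persistent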
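
Because one process now handles many inputs, any state the target keeps between calls has to be reset at the start of each iteration, otherwise a crash may depend on earlier inputs and fail to reproduce. The sketch below illustrates the pattern with an invented piece of state; `parser_state`, `reset_parser_state`, and `fuzz_one_input` are hypothetical names, not part of the tutorial's target.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical parser state that survives across calls. */
static struct {
    size_t packets_seen;
    unsigned char last_header[4];
} parser_state;

static void reset_parser_state(void) {
    memset(&parser_state, 0, sizeof(parser_state));
}

/* One fuzz iteration: reset first, then parse. Returns 1 if the input
   was handled as the first packet of a fresh session, 0 otherwise. */
static int fuzz_one_input(const unsigned char *buf, size_t len) {
    reset_parser_state();
    if (len < 4) return 0;
    memcpy(parser_state.last_header, buf, 4);
    parser_state.packets_seen++;
    return parser_state.packets_seen == 1;
}
```

Calling fuzz_one_input() from inside the __AFL_LOOP body keeps every iteration independent: the same input produces the same result no matter how many inputs came before it.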

Using Cmplog to Break Magic Bytes

Many programs have "magic byte" checks that fuzzers struggle with:

if (memcmp(buf, "MAGIC_HEADER\x42\x13", 14) != 0) return;

Randomly mutating inputs to match 14 specific bytes is nearly impossible. AFL++'s Cmplog tracks comparison operands and automatically feeds them back as mutations:

# Compile with Cmplog instrumentation
AFL_LLVM_CMPLOG=1 afl-clang-fast -o target_cmplog target.c

# Run with Cmplog
afl-fuzz -i corpus/ -o out/ -c ./target_cmplog -- ./target_fuzz @@
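
The core idea can be sketched as follows: if the bytes a comparison saw on one side also appear in the input, overwrite them with the bytes the comparison wanted on the other side. This is a simplified illustration, not AFL++'s Cmplog implementation; `patch_comparison` and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* If the bytes the target compared (observed) occur in the input,
   overwrite the first occurrence with the bytes it compared against
   (expected). Returns 1 if a replacement was made, 0 otherwise. */
static int patch_comparison(uint8_t *input, size_t input_len,
                            const uint8_t *observed,
                            const uint8_t *expected, size_t cmp_len) {
    if (cmp_len == 0 || cmp_len > input_len) return 0;

    for (size_t i = 0; i + cmp_len <= input_len; i++) {
        if (memcmp(input + i, observed, cmp_len) == 0) {
            memcpy(input + i, expected, cmp_len);
            return 1;
        }
    }
    return 0;
}
```

Applied to the invalid_header seed, a logged comparison of "NOPE" against "PKT!" turns the input into one that passes the header check in a single step, instead of waiting for random mutation to guess four bytes.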

Analyzing Crashes

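AFL++ saves crash inputs in the crashes/ directory of each instance: fuzz_output/main/crashes/ for an instance started with -M main, or fuzz_output/default/crashes/ when afl-fuzz runs without -M or -S. Each file is an input that caused a crash.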

Reproducing a Crash

# Run the target directly with the crash input
./target_fuzz fuzz_output/main/crashes/id:000000,sig:11,src:000001,...

# With ASAN for detailed analysis
ASAN_OPTIONS=symbolize=1 ./target_fuzz fuzz_output/main/crashes/id:000000,...

ASAN output for our buffer overflow bug looks roughly like this (addresses elided; the access type, size, and exact frames depend on the crashing input):

==12345==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x...
READ of size 1024 at 0x... thread T0
    #0 0x... in parse_packet target.c:23
    #1 0x... in main target.c:41

SUMMARY: AddressSanitizer: stack-buffer-overflow target.c:23 in parse_packet

Minimizing Crash Inputs

Crash inputs from AFL++ are often large with unnecessary bytes. afl-tmin minimizes them:

afl-tmin -i fuzz_output/main/crashes/id:000000,... -o crash_minimal.bin -- ./target_fuzz @@

The minimized input is easier to understand and turn into a regression test.
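
As a sketch of what such a regression test might exercise, here is a hardened variant of the parser with the two missing bounds checks. `parse_packet_checked`, `DATA_CAP`, and the return-code convention are assumptions for illustration, not the tutorial's original code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DATA_CAP 64u  /* same capacity as Packet.data in target.c */

/* Hardened variant of parse_packet(): returns 0 on success,
   -1 if the packet is malformed. */
static int parse_packet_checked(const unsigned char *buf, size_t len,
                                unsigned char data_out[DATA_CAP]) {
    if (len < 8) return -1;
    if (memcmp(buf, "PKT!", 4) != 0) return -1;

    /* Decode the size byte by byte (little-endian) instead of casting,
       which also avoids an unaligned read. */
    uint32_t size = (uint32_t)buf[4] | ((uint32_t)buf[5] << 8) |
                    ((uint32_t)buf[6] << 16) | ((uint32_t)buf[7] << 24);

    /* The two checks the buggy version was missing. */
    if (size > DATA_CAP) return -1;  /* would overflow the data buffer */
    if (size > len - 8) return -1;   /* would read past the input */

    memcpy(data_out, buf + 8, size);
    return 0;
}
```

A regression test then feeds the bytes of crash_minimal.bin to the function and asserts that it returns -1, alongside a valid packet that must still return 0.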

Deduplicating Crashes

Multiple crash inputs often represent the same underlying bug:

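# Use afl-collect to gather and deduplicate
# (afl-collect ships with the separate afl-utils project, not with AFL++ itself;
#  check afl-collect --help for the flags your version accepts)
afl-collect -e -r fuzz_output/ crashes/ -- ./target_fuzz @@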

Reading AFL++ Stats

Check campaign statistics without the TUI:

cat fuzz_output/main/fuzzer_stats

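Key metrics (field names as written by recent AFL++ releases; the notes in parentheses are annotations, not part of the file):

run_time          : 3600         (seconds)
cycles_done       : 47           (corpus cycles completed)
corpus_count      : 1847         (test cases in the queue)
bitmap_cvg        : 23.45%       (share of the coverage map hit)
saved_crashes     : 3            (unique crashes)
saved_hangs       : 0
execs_per_sec     : 2847.32      (target executions/second)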
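
Since the file is plain "name : value" lines, it is easy to read from a monitoring tool. The helper below is a minimal sketch of that parsing in C; `stat_value` is a hypothetical name, and it assumes one stat per line with a colon separator as shown above.

```c
#include <stdio.h>
#include <string.h>

/* Looks for `key` in one "name : value" line. On a match, copies the
   value into out and returns 1; otherwise returns 0. */
static int stat_value(const char *line, const char *key,
                      char *out, size_t out_cap) {
    char name[64], value[128];

    /* name: everything up to the first space or colon;
       value: the rest of the line after the colon. */
    if (sscanf(line, " %63[^ :] : %127[^\n]", name, value) != 2) return 0;
    if (strcmp(name, key) != 0) return 0;
    if (strlen(value) >= out_cap) return 0;

    strcpy(out, value);
    return 1;
}
```

Looping over the file with fgets() and calling stat_value() on each line is enough to extract a single metric such as execs_per_sec for a dashboard or a CI threshold.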

Using Dictionaries

Dictionaries tell AFL++ interesting byte sequences to use in mutations:

# AFL++ ships dictionaries for common formats
ls /usr/share/aflplusplus/dictionaries/
# http.dict html.dict json.dict xml.dict jpeg.dict zip.dict ...

afl-fuzz -i corpus/ -o out/ -x /usr/share/aflplusplus/dictionaries/json.dict -- ./target @@

Create custom dictionaries for domain-specific targets:

# custom.dict
"PKT!"
"MAGIC_BYTES"
"\x00\x01\x02\x03"
"\xff\xfe\xfd\xfc"

Continuous Fuzzing in CI

Running AFL++ in CI catches regressions and extends coverage over time:

# GitHub Actions
- name: Run AFL++ fuzz campaign
  run: |
    # -V 300 stops the campaign after 5 minutes; AFL_NO_UI=1 disables the dashboard
    AFL_NO_UI=1 afl-fuzz -i corpus/ -o fuzz_out/ -t 1000 -m 200 -V 300 \
      -- ./target @@
    
    # Fail CI if crashes found (an instance started without -M/-S is named "default")
    if ls fuzz_out/default/crashes/id:* 2>/dev/null; then
      echo "CRASHES FOUND"
      exit 1
    fi

From Fuzzing to Continuous Monitoring

AFL++ finds crashes in your code under controlled conditions. Production, however, has real users, real data, and runtime conditions a fuzzer can't simulate. A crash-free fuzzing run doesn't mean your application works correctly in production.

HelpMeTest complements AFL++ by running continuous end-to-end tests against your live production environment — verifying correct behavior 24/7. While AFL++ explores edge cases in unit-level code, HelpMeTest monitors whether the full application works for real users. No code required.

Summary

AFL++ is production-grade fuzzing infrastructure with a manageable learning curve. The key steps:

Compile with afl-clang-fast and AFL_USE_ASAN=1
Build a minimal seed corpus from real inputs
Minimize corpus with afl-cmin
Run with parallel workers (one per CPU core)
Use Cmplog (-c) to break through magic byte checks
Analyze crashes with ASAN symbolization
Minimize crash inputs with afl-tmin
Add crash inputs as regression tests

Give AFL++ a few hours on any C/C++ parser that handles untrusted input. The bugs it finds will surprise you.