libFuzzer Guide: In-Process Fuzzing with LLVM
libFuzzer is LLVM's in-process, coverage-guided fuzzing engine. Unlike AFL, which forks a new process per input, libFuzzer runs inside the target process. This eliminates process spawn overhead and, for code with fast parsing paths, can yield orders of magnitude more executions per second.
libFuzzer is built into LLVM/Clang and requires no separate installation.
How libFuzzer Differs from AFL
| | libFuzzer | AFL |
|---|---|---|
| Execution model | In-process | New process per input |
| Speed | Faster (no fork) | Slower (fork overhead) |
| Stability | Less (state leaks) | More (clean process each time) |
| Setup | Requires harness function | Works with any binary |
| Integration | Library-based | External tool |
| Best for | Libraries, fast targets | Programs with complex state |
Use libFuzzer when your target is a library with a clean API. Use AFL when your target is a standalone binary or when state isolation is important.
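The "state leaks" row is the practical cost of running in-process: globals survive from one input to the next. Here is a minimal sketch of the problem and one way to contain it, assuming a hypothetical parser that keeps a global cache (`g_seen_keys` and `ParseRecord` are illustrative names, not part of any real library):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>

// Hypothetical target state: every key ever parsed is remembered globally.
static std::set<std::string> g_seen_keys;

// Returns true if the key was new. The result depends on earlier calls,
// which is exactly the kind of state that leaks between fuzz iterations.
bool ParseRecord(const uint8_t *data, size_t size) {
  std::string key;
  if (size > 0) key.assign(reinterpret_cast<const char *>(data), size);
  return g_seen_keys.insert(key).second;
}

// Harness: clears the global state first so each input is judged on its own,
// approximating the isolation a fresh process would give.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  g_seen_keys.clear();
  bool is_new = ParseRecord(data, size);
  assert(is_new);  // holds only because the state was reset above
  return 0;
}
```

Without the `clear()` call, replaying the same input twice would take a different path the second time, which makes crashes harder to reproduce.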
Writing a libFuzzer Harness
Every libFuzzer target implements one function:
// fuzz_target.c
#include <stdint.h>
#include <stdlib.h>
// Your library's header
#include "myparser.h"
// libFuzzer calls this once for each generated input
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  // Parse/process the data
  parse_input(data, size);
  // Return 0; other return values are reserved
  return 0;
}
Rules for the harness:
- Always return 0. Other values are reserved; recent libFuzzer versions treat -1 as a request not to add the input to the corpus.
- Do not call exit(): it terminates the fuzzer
- Do not print to stdout/stderr in the main loop (it slows things down significantly)
- If the function allocates memory, free it — libFuzzer runs millions of iterations in the same process
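The rules above can be sketched in one self-contained file. `CountLines` is a hypothetical stand-in for your library call, and `ReplayFile` is an illustrative helper (not part of libFuzzer) that feeds a saved input through the harness without the fuzzing engine:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <iterator>
#include <vector>

// Hypothetical function under test: counts '\n' bytes in a buffer.
static size_t CountLines(const char *text, size_t len) {
  size_t lines = 0;
  for (size_t i = 0; i < len; ++i) {
    if (text[i] == '\n') ++lines;
  }
  return lines;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  // Work on a heap copy to show per-iteration allocation and cleanup.
  char *copy = static_cast<char *>(std::malloc(size > 0 ? size : 1));
  if (copy == nullptr) return 0;  // skip this input; never call exit()
  if (size > 0) std::memcpy(copy, data, size);

  // Property check: there cannot be more lines than bytes.
  assert(CountLines(copy, size) <= size);

  std::free(copy);  // freed on every call, so nothing accumulates
  return 0;         // always 0, and no printing in the hot path
}

// Illustrative helper: replays one file through the harness.
// Returns -1 if the file cannot be opened, otherwise the harness result.
int ReplayFile(const char *path) {
  std::ifstream in(path, std::ios::binary);
  if (!in) return -1;
  std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                             std::istreambuf_iterator<char>());
  return LLVMFuzzerTestOneInput(bytes.data(), bytes.size());
}
```

A helper like `ReplayFile` is handy in regular unit tests, where you want to run saved corpus files through the same entry point on a build that is not linked against libFuzzer.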
Compiling and Running
# Compile with libFuzzer and AddressSanitizer
clang -fsanitize=fuzzer,address -o fuzz_target fuzz_target.c mylib.c
# Run the fuzzer
mkdir corpus
./fuzz_target corpus/
# Run with a seed corpus
mkdir seeds
echo "valid input" > seeds/seed1
./fuzz_target corpus/ seeds/
# Run for a time limit (seconds)
./fuzz_target corpus/ -max_total_time=3600
# Run with multiple jobs (parallel)
./fuzz_target corpus/ -jobs=4 -workers=4
Sanitizer Integration
Always compile with sanitizers. Without them, libFuzzer only notices bugs that crash, hang, or exhaust memory on their own, so many memory errors go undetected.
# AddressSanitizer (memory errors)
clang -fsanitize=fuzzer,address -o fuzz_target fuzz_target.c mylib.c
# UndefinedBehaviorSanitizer
clang -fsanitize=fuzzer,undefined -o fuzz_target fuzz_target.c mylib.c
# Both (recommended)
clang -fsanitize=fuzzer,address,undefined -o fuzz_target fuzz_target.c mylib.c
# MemorySanitizer (uninitialized reads; requires special build)
clang -fsanitize=fuzzer,memory -o fuzz_target fuzz_target.c mylib.c
What each sanitizer catches:
- ASan: Heap, stack, and global buffer overflows; use-after-free; double-free
- UBSan: Signed integer overflow, null pointer dereference, misaligned access, invalid enum values
- MSan: Use of uninitialized memory (can catch information leaks)
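To see why the sanitizer matters, consider a small overread. Here is a sketch with a hypothetical length-prefixed decoder (`DecodeRecord` is an illustrative name). The bounds check is what keeps it correct; the comment marks what would happen without it:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical format: the first byte is a length, followed by that many
// payload bytes. Returns the payload, or an empty vector if truncated.
std::vector<uint8_t> DecodeRecord(const uint8_t *data, size_t size) {
  if (size == 0) return {};
  size_t declared = data[0];
  // Without this check, the copy below would read past the end of `data`.
  // A small overread like that often does not crash by itself; under
  // libFuzzer with ASan it is typically reported as a heap-buffer-overflow.
  if (declared > size - 1) return {};
  return std::vector<uint8_t>(data + 1, data + 1 + declared);
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  std::vector<uint8_t> payload = DecodeRecord(data, size);
  // The payload can never be as large as the whole input.
  assert(size == 0 || payload.size() < size);
  return 0;
}
```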
FuzzedDataProvider: Structured Fuzzing
Raw bytes work for binary parsers. For structured input (multiple fields, integers, strings), use FuzzedDataProvider:
// fuzz_structured.cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>
#include <fuzzer/FuzzedDataProvider.h>
#include "myapi.h"
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  FuzzedDataProvider fdp(data, size);
  // Extract structured data from the fuzzer input
  int user_id = fdp.ConsumeIntegral<int>();
  bool is_admin = fdp.ConsumeBool();
  std::string username = fdp.ConsumeRandomLengthString(64);
  // Everything left over becomes the request payload
  std::vector<uint8_t> payload = fdp.ConsumeRemainingBytes<uint8_t>();
  // Use structured data in your API
  User user(user_id, username, is_admin);
  process_request(user, payload.data(), payload.size());
  return 0;
}
FuzzedDataProvider methods:
- ConsumeIntegral<T>(): consume an integer of type T
- ConsumeBool(): consume a boolean
- ConsumeFloatingPoint<T>(): consume a float/double
- ConsumeRandomLengthString(max_length): consume a string
- ConsumeBytes<T>(count): consume N bytes as a vector
- ConsumeRemainingBytes<T>(): consume all remaining bytes
- PickValueInArray(array): pick a value from an array
- ConsumeProbability<T>(): value in [0.0, 1.0]
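To make the idea of "consuming" typed fields concrete, here is a simplified stand-in written from scratch. `TinyProvider` is a hypothetical class for illustration only. It is not the real FuzzedDataProvider and does not reproduce its byte layout; it only shows the general technique of carving typed values off a byte buffer and returning defaults when the input runs out:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified illustration (NOT the real FuzzedDataProvider).
class TinyProvider {
 public:
  TinyProvider(const uint8_t *data, size_t size)
      : data_(data), remaining_(size) {}

  // Reads up to sizeof(T) bytes from the front; missing bytes stay zero,
  // so the call never fails even when the input is exhausted.
  template <typename T>
  T ConsumeIntegral() {
    T value = 0;
    size_t n = remaining_ < sizeof(T) ? remaining_ : sizeof(T);
    if (n > 0) std::memcpy(&value, data_, n);
    Advance(n);
    return value;
  }

  // Uses the low bit of one byte.
  bool ConsumeBool() { return (ConsumeIntegral<uint8_t>() & 1) != 0; }

  // Hands out whatever is left and leaves the provider empty.
  std::vector<uint8_t> ConsumeRemainingBytes() {
    std::vector<uint8_t> out(data_, data_ + remaining_);
    Advance(remaining_);
    return out;
  }

  size_t remaining_bytes() const { return remaining_; }

 private:
  void Advance(size_t n) {
    data_ += n;
    remaining_ -= n;
  }
  const uint8_t *data_;
  size_t remaining_;
};

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  TinyProvider provider(data, size);
  uint16_t port = provider.ConsumeIntegral<uint16_t>();
  bool secure = provider.ConsumeBool();
  std::vector<uint8_t> body = provider.ConsumeRemainingBytes();
  (void)port;
  (void)secure;
  (void)body;
  assert(provider.remaining_bytes() == 0);  // every byte was assigned a role
  return 0;
}
```

In a real harness, prefer the header shipped with Clang; it handles ranges, strings, and floating-point values that this sketch leaves out.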
Corpus Management
# Merge new inputs into an existing corpus
./fuzz_target -merge=1 corpus/ new_inputs/
# Minimize the corpus (remove redundant inputs)
./fuzz_target -merge=1 minimized_corpus/ corpus/
# Run a single input (for debugging)
./fuzz_target crash_input
# Print coverage information for a specific input
./fuzz_target -print_coverage=1 corpus/seed1
Understanding libFuzzer Output
INFO: Seed: 1234567890
INFO: Loaded 5 modules (40965 inline 8-bit counters): 5 [0x7f..., 0x7f...)
INFO: Loaded 5 PC tables (40965 PCs): 5 [0x7f..., 0x7f...),
INFO: 10 files found in corpus/
INFO: seed corpus: files: 10 min: 1b max: 1024b total: 4096b rss: 35Mb
#10 INITED cov: 428 ft: 1234 corp: 10/4096b exec/s: 0 rss: 36Mb
#512 NEW cov: 431 ft: 1246 corp: 11/4098b lim: 64 exec/s: 512 L: 2/1024 MS: 2 EraseBytes-InsertByte-
#1024 REDUCE cov: 431 ft: 1246 corp: 11/4096b lim: 64 exec/s: 1024 L: 2/1024 MS: 1 EraseBytes-
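...
Key fields:
- cov: Number of code blocks or edges covered so far
- ft: Number of unique features (more granular than edges)
- corp: Number of inputs in the corpus / total corpus size
- exec/s: Executions per second
- L: Size of the new input / size of the largest input in the corpus
- MS: Mutation sequence: the number and names of the mutations that produced the input
NEW lines mean the fuzzer found an input that reaches new coverage. REDUCE means it found a smaller input that covers the same features as an existing corpus entry. A crash is not reported as a status line: libFuzzer prints an error report (usually from a sanitizer) and writes the offending input to a crash-<hash> file.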
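If you want to track these numbers over time (for example, to plot coverage in CI), the status lines are easy to scrape. Here is a small sketch, assuming the `name: value` layout shown in the sample output above; `StatusField` is a hypothetical helper, not a libFuzzer API:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns the integer that follows " <name>: " in a status line,
// or -1 if the field is not present.
long StatusField(const std::string &line, const std::string &name) {
  const std::string key = " " + name + ": ";
  size_t pos = line.find(key);
  if (pos == std::string::npos) return -1;
  // strtol stops at the first non-digit, so "11/4098b" yields 11.
  return std::strtol(line.c_str() + pos + key.size(), nullptr, 10);
}

// Checks the helper against the NEW line from the sample output.
void CheckExample() {
  const std::string line =
      "#512 NEW cov: 431 ft: 1246 corp: 11/4098b lim: 64 exec/s: 512 "
      "L: 2/1024 MS: 2 EraseBytes-InsertByte-";
  assert(StatusField(line, "cov") == 431);
  assert(StatusField(line, "ft") == 1246);
  assert(StatusField(line, "corp") == 11);
  assert(StatusField(line, "exec/s") == 512);
  assert(StatusField(line, "rss") == -1);  // not printed on this line
}
```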
Reproducing and Minimizing Crashes
# The crashing input is printed on crash:
# artifact_prefix='./'; Test unit written to ./crash-abc123
# Reproduce
./fuzz_target crash-abc123
# Minimize (find smallest input that still crashes)
./fuzz_target -minimize_crash=1 -max_total_time=60 crash-abc123
Custom Mutators
For targets with complex input formats (protocols, structured configs), you can write a custom mutator:
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>
#include <string>
// MyFormat is a placeholder for your own parser/serializer class
extern "C" size_t LLVMFuzzerCustomMutator(
    uint8_t *data, size_t size, size_t max_size, unsigned int seed) {
  // Parse the input into your format
  MyFormat fmt;
  if (!fmt.Parse(data, size)) {
    // If unparseable, return a valid minimal input (when it fits)
    const std::string minimal = "{}";
    if (minimal.size() > max_size) return size;  // no room, return unchanged
    memcpy(data, minimal.data(), minimal.size());
    return minimal.size();
  }
  // Apply format-aware mutations
  std::mt19937 rng(seed);
  fmt.MutateField(rng);
  // Serialize back
  std::string serialized = fmt.Serialize();
  if (serialized.size() > max_size) return size;  // too large, return unchanged
  memcpy(data, serialized.data(), serialized.size());
  return serialized.size();
}
Custom mutators can substantially improve coverage for structured inputs by generating valid-but-unexpected inputs rather than random byte sequences.
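For a concrete, compilable version of the same pattern, here is a sketch for a hypothetical toy format, ASCII `key=value`. The format and the three mutations are assumptions chosen for illustration; only the function signature comes from libFuzzer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>
#include <string>

// Format-aware mutator for a toy "key=value" format (illustrative only).
// Only the value is changed, so the output always keeps its '=' separator.
extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *data, size_t size,
                                          size_t max_size, unsigned int seed) {
  std::string text;
  if (size > 0) text.assign(reinterpret_cast<const char *>(data), size);

  // Parse; fall back to a minimal valid record if there is no '='.
  std::string key = "k";
  std::string value;
  size_t eq = text.find('=');
  if (eq != std::string::npos) {
    key = text.substr(0, eq);
    value = text.substr(eq + 1);
  }

  // Pick one of three value mutations, driven by the seed libFuzzer passes.
  std::mt19937 rng(seed);
  switch (rng() % 3) {
    case 0:  // append a digit
      value.push_back(static_cast<char>('0' + rng() % 10));
      break;
    case 1:  // drop the last character, if any
      if (!value.empty()) value.pop_back();
      break;
    default:  // prepend a minus sign
      value.insert(value.begin(), '-');
      break;
  }

  // Serialize back, respecting the buffer limit.
  std::string out = key + "=" + value;
  if (out.size() > max_size) return size;  // does not fit, leave unchanged
  assert(out.find('=') != std::string::npos);
  std::memcpy(data, out.data(), out.size());
  return out.size();
}
```

Because every output still parses as `key=value`, the fuzzer spends its time on the code behind the parser rather than on the parser's rejection path.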
OSS-Fuzz Integration
If your project is open source, OSS-Fuzz will run your libFuzzer harnesses continuously for free.
Requirements:
- libFuzzer harnesses
- Docker-based build configuration
- Public repository
Minimal integration:
# projects/mylib/Dockerfile
FROM gcr.io/oss-fuzz-base/base-builder
COPY . $SRC/mylib
WORKDIR $SRC/mylib
COPY build.sh $SRC/
# projects/mylib/build.sh
#!/bin/bash -eu
cd $SRC/mylib
cmake . -DCMAKE_CXX_COMPILER=$CXX -DCMAKE_C_COMPILER=$CC \
  -DCMAKE_CXX_FLAGS="$CXXFLAGS" -DCMAKE_C_FLAGS="$CFLAGS"
make -j$(nproc)
# Copy harness binaries
for fuzzer in fuzz_parser fuzz_decoder; do
  cp $fuzzer $OUT/
done
# Copy seed corpus
zip -r $OUT/fuzz_parser_seed_corpus.zip corpus/
# projects/mylib/project.yaml
homepage: "https://github.com/org/mylib"
language: c++
primary_contact: "maintainer@example.com"
auto_ccs:
  - "security@example.com"
Submit as a PR to the OSS-Fuzz repository.
Performance Optimization
Track expensive allocations: If your harness allocates a large object on every call, move it outside the fuzzing loop using a static or persistent initialization:
// Slow: allocates every call
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  MyExpensiveObject obj;  // expensive construction
  obj.Parse(data, size);
  return 0;
}
// Fast: initialize once
static MyExpensiveObject *obj = nullptr;
// libFuzzer calls this once, before the first input
extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
  obj = new MyExpensiveObject();
  return 0;
}
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  obj->Reset();  // cheap reset
  obj->Parse(data, size);
  return 0;
}
Reduce logging: Disable logging in test builds. Even stderr writes slow down fuzzing substantially.
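One way to switch logging off is at compile time. The sketch below assumes a hypothetical `LOG_DEBUG` macro and a toy `Checksum` function. `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` is a macro name suggested in the libFuzzer documentation for fuzzing-only builds; you define it yourself, for example with `-DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical logging macro: a no-op in fuzzing builds, stderr otherwise.
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
#define LOG_DEBUG(...) ((void)0)
#else
#define LOG_DEBUG(...) std::fprintf(stderr, __VA_ARGS__)
#endif

// Hypothetical target: sums the input bytes and logs the result.
uint64_t Checksum(const uint8_t *data, size_t size) {
  uint64_t sum = 0;
  for (size_t i = 0; i < size; ++i) sum += data[i];
  LOG_DEBUG("checksum=%llu over %zu bytes\n",
            static_cast<unsigned long long>(sum), size);
  return sum;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  uint64_t sum = Checksum(data, size);
  // Each byte contributes at most 255 to the sum.
  assert(sum <= 255ull * size);
  return 0;
}
```

With the macro defined, the format call disappears entirely from the fuzzing binary, so the hot loop does no I/O.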
Focus the harness: A harness that exercises 500 lines of critical parsing code will outperform one that exercises 10,000 lines of application logic. Narrow the attack surface to what you care about most.