Schemathesis vs Dredd: When to Use Which API Testing Tool
Schemathesis and Dredd both test APIs against their OpenAPI specifications. They're often mentioned in the same breath, which leads teams to think they're interchangeable. They're not. They find different bugs, in different ways, with different tradeoffs. Understanding the distinction helps you deploy the right tool — or both tools — for your situation.
The Core Difference
Dredd asks: "Does your implementation match your documentation?"
It reads your spec, makes the example requests from it, and checks whether responses match what the spec says. If your spec shows an example response with {"id": 1, "name": "Alice"}, Dredd calls the endpoint and verifies the response structure matches. It's conformance testing — verifying implementation against specification.
Schemathesis asks: "Does your implementation handle everything your spec says it can?"
It reads your spec and generates novel requests from it — using your schemas as blueprints for test case generation. It sends empty strings, maximum integers, unicode edge cases, and random-but-valid combinations, then checks whether your server handles each one correctly. It's property-based fuzzing — exploring the space of valid inputs to find unexpected failures.
Same input (your OpenAPI spec). Completely different outputs. Neither replaces the other.
What Dredd Finds
Spec drift: Your code changed but your docs didn't. The endpoint now returns status_code but the spec says status. Dredd catches this immediately.
Missing required fields: Your spec says email is required in responses. Your code sometimes omits it. Dredd fails on that inconsistency.
Wrong status codes: Your spec says GET /users/{id} returns 404 when the user doesn't exist. Your code returns 200 with {"user": null}. Dredd fails.
Type mismatches: Your spec says count is an integer. Your code returns it as a string. Dredd catches this.
Breaking changes in disguise: A well-intentioned refactor makes the response format "cleaner" but breaks the documented structure. Dredd fails before it ships.
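All of these failures reduce to the same operation: compare a live response against the shape the spec documents. Here is a minimal stdlib sketch of that idea — an illustration of what Dredd automates, not Dredd's actual internals (the DOCUMENTED dict stands in for a parsed OpenAPI response schema):

```python
# Minimal conformance check: does an actual response match the documented shape?
# Illustrative only — Dredd derives this information from your OpenAPI file.

DOCUMENTED = {
    "required": ["id", "name", "email"],
    "types": {"id": int, "name": str, "email": str},
}

def conformance_errors(response: dict) -> list[str]:
    errors = []
    for field in DOCUMENTED["required"]:
        if field not in response:
            errors.append(f"missing required field: {field}")
    for field, expected in DOCUMENTED["types"].items():
        if field in response and not isinstance(response[field], expected):
            errors.append(f"{field}: expected {expected.__name__}, "
                          f"got {type(response[field]).__name__}")
    return errors

# A conforming response passes; spec drift (string id, dropped email) fails.
print(conformance_errors({"id": 1, "name": "Alice", "email": "a@example.com"}))  # []
print(conformance_errors({"id": "1", "name": "Alice"}))
```

One fixed comparison per documented example is also why Dredd runs so fast: the work is a dictionary walk, not input generation.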
What Schemathesis Finds
Server errors from valid edge cases: Your API returns 200 for normal strings but crashes (500) when the string is empty, or very long, or contains a null byte. Normal test suites don't send these. Schemathesis does.
Unhandled valid inputs: Your spec says a field accepts integers. When you send the maximum integer value (2,147,483,647), the code throws an overflow exception. Valid input, unexpected behavior.
Schema violations from edge cases: The API returns valid responses normally but violates the schema when inputs are unusual. Only discovered when you actually send unusual inputs.
Silent data corruption: For certain inputs, the API returns 200 but the response data is malformed in a way that matches the schema structurally but contains wrong values. Schemathesis with custom checks can catch this.
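The maximum-integer failure mode is easy to reproduce in miniature: the input is schema-valid, but a 32-bit assumption downstream breaks. A stdlib sketch (the handler is hypothetical — it stands in for any code path with a fixed-width integer buried inside):

```python
import struct

INT32_MAX = 2_147_483_647

def handle_count(count: int) -> bytes:
    """Hypothetical handler: packs count + 1 (a "next page" cursor) into a
    signed 32-bit field, as a legacy binary protocol or DB column might."""
    return struct.pack("<i", count + 1)

# Ordinary inputs work fine — a hand-written test suite would pass.
handle_count(42)

# The schema-valid maximum overflows the 32-bit field: valid input, 500-style crash.
try:
    handle_count(INT32_MAX)
except struct.error as exc:
    print(f"server error on valid input: {exc}")
```

Schemathesis finds this class of bug because Hypothesis deliberately generates boundary values like INT32_MAX, not just "nice" numbers.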
Where They're Blind
Dredd's blind spots:
- It only tests what's in your spec's examples. If your examples show id: 1 and that user exists in your test database, Dredd passes. It doesn't try id: 99999 or id: -1.
- It doesn't find state management bugs or multi-step failures.
- It can't catch security issues — it follows your spec exactly.
Schemathesis's blind spots:
- It can't tell you when your implementation has drifted from your spec — it tests the implementation against the spec, but it uses generated values, not example values. If your response always matches the schema, Schemathesis passes even if the actual values are wrong.
- It doesn't understand business logic. Your spec says a discount code field accepts strings — Schemathesis tests whether strings work, but won't know if the discount codes it generates are semantically invalid.
- Limited stateful reasoning — by default it tests each endpoint independently.
A Concrete Example
Consider this spec:
paths:
  /users/{id}:
    get:
      parameters:
        - name: id
          in: path
          required: true
          schema:
            type: integer
      responses:
        "200":
          description: User
          content:
            application/json:
              schema:
                type: object
                required: [id, name, email]
                properties:
                  id:
                    type: integer
                  name:
                    type: string
                  email:
                    type: string
              examples:
                user:
                  value:
                    id: 1
                    name: "Alice"
                    email: "alice@example.com"
        "404":
          description: Not found

And your implementation has two bugs:
Bug A: The response sometimes includes a password_hash field when certain conditions are met. The spec doesn't document password_hash.
Bug B: When id is a very large integer (like 9,007,199,254,740,991), the database query fails with a 500 error.
Dredd result: Finds Bug A if the example user (id=1) triggers the condition. Misses Bug B entirely — it only tests with id=1.
Schemathesis result: Finds Bug B — it generates large integers. May or may not find Bug A depending on whether it checks for undocumented response fields (it does with --checks response_schema_conformance if extra fields are prohibited by additionalProperties: false).
Neither tool alone finds both bugs. Together, they do.
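Whether Bug A is catchable hinges entirely on additionalProperties: false. With it, an undocumented field is a schema violation; without it, extra fields pass silently. A stdlib sketch of that distinction (a real run would delegate to a JSON Schema validator rather than this hand-rolled check):

```python
# Fields documented in the response schema from the spec above.
DOCUMENTED_PROPERTIES = {"id", "name", "email"}

def undocumented_fields(response: dict, additional_properties: bool) -> set[str]:
    """Return schema-violating fields. With additionalProperties: true (the
    JSON Schema default), anything extra is allowed; with false, it isn't."""
    if additional_properties:
        return set()
    return set(response) - DOCUMENTED_PROPERTIES

leaky = {"id": 1, "name": "Alice", "email": "a@example.com",
         "password_hash": "$2b$12$..."}

print(undocumented_fields(leaky, additional_properties=True))   # set() — Bug A slips through
print(undocumented_fields(leaky, additional_properties=False))  # {'password_hash'}
```

This is a good argument for adding additionalProperties: false to response schemas: it turns "leaking an internal field" from an invisible bug into a failing check.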
Setup and Usage Comparison
Dredd setup:
npm install -g dredd
# Simple run
dredd openapi.yaml http://localhost:3000
# With authentication via a hooks file
# hooks.js:
#   const hooks = require('hooks');
#   hooks.beforeAll(function(transactions, done) {
#     transactions.forEach(tx => tx.request.headers['Authorization'] = 'Bearer token');
#     done();
#   });
dredd openapi.yaml http://localhost:3000 --hookfiles=hooks.js

Schemathesis setup:
pip install schemathesis
# Simple run
st run http://localhost:3000/openapi.yaml
# With authentication
st run http://localhost:3000/openapi.yaml \
  --header "Authorization: Bearer token" \
  --checks all

Verdict on setup: Both are simple for basic cases. Dredd's hooks system (for complex auth) is in JavaScript regardless of your stack. Schemathesis's Python API is more flexible for custom scenarios but requires Python.
Performance Comparison
Dredd: Fast. It makes a fixed number of requests (one per documented example per endpoint). For a 50-endpoint API, it might make 100-200 requests. A run takes seconds to a few minutes.
Schemathesis: Variable. The number of requests depends on --hypothesis-max-examples. At 100 examples per endpoint, a 50-endpoint API makes 5,000 requests minimum. A thorough run takes minutes to tens of minutes.
For CI on every PR:
- Dredd: runs everywhere, always (it's that fast)
- Schemathesis: run with a low example count (25-50) on PR, higher count (200-500) nightly
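The request budgets above are simple arithmetic, which makes CI planning easy to reason about (the endpoint and example counts are the illustrative numbers from this section, and the product is a lower bound — shrinking failing cases adds requests):

```python
ENDPOINTS = 50  # illustrative API size from this section

def schemathesis_requests(max_examples: int) -> int:
    # Lower bound: one generated case per example budget, per endpoint.
    return ENDPOINTS * max_examples

print(schemathesis_requests(100))  # 5000 — the "thorough run" baseline
print(schemathesis_requests(25))   # 1250 — a cheap per-PR budget
```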
Language and Framework Support
Dredd: Node.js hooks only. If you're testing a Python or Go API, your hooks are still in JavaScript. This is a real friction point for non-JavaScript teams.
Schemathesis: Python API, Python hooks, pytest integration. More friction for non-Python teams, but the CLI works with any API regardless of language.
Workflow Integration
Dredd fits naturally into API-first workflows:
- Write the OpenAPI spec
- Run Dredd against the stub (it fails)
- Implement the API
- Dredd passes
It's a living specification validator — every CI run checks that reality matches the docs.
Schemathesis fits naturally as a quality gate:
- API is implemented
- Schemathesis runs to find edge case failures
- Failures are fixed or documented
It's most valuable after initial implementation and as an ongoing regression check for new inputs the spec enables.
When to Use Only Dredd
- You're doing API-first development and want spec-code synchronization as the primary concern
- Your team is all JavaScript/Node.js
- You have simple APIs without complex authentication
- You want the fastest possible feedback loop on spec conformance
- You already have comprehensive integration tests for edge cases
When to Use Only Schemathesis
- You want to find server errors from unexpected inputs
- Your API uses GraphQL (Dredd doesn't support it)
- You're a Python shop
- You want pytest integration
- You have security concerns and want to find injection-related server errors
When to Use Both
Most mature API testing pipelines use both:
Dredd → every PR (conformance, spec drift) — under 2 minutes
Schemathesis → every PR at low count (edge cases) — under 10 minutes
Schemathesis → nightly at high count (thorough exploration) — 30-60 minutes

This combination gives you:

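As a CI sketch, that split might look like the following GitHub Actions fragment. Job names, ports, spec paths, and triggers are placeholders to adapt; the dredd and st commands mirror the setup section above:

```yaml
# Illustrative CI fragment — adjust paths, ports, service startup, and triggers.
jobs:
  conformance:      # every PR: Dredd, spec drift, fast
    steps:
      - run: dredd openapi.yaml http://localhost:3000

  fuzz-quick:       # every PR: Schemathesis at a low example budget
    steps:
      - run: st run http://localhost:3000/openapi.yaml --hypothesis-max-examples 25

  fuzz-nightly:     # scheduled: Schemathesis, thorough exploration
    if: github.event_name == 'schedule'
    steps:
      - run: st run http://localhost:3000/openapi.yaml --hypothesis-max-examples 500 --checks all
```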
- Immediate feedback on spec drift (Dredd)
- Immediate feedback on edge case crashes (Schemathesis low count)
- Thorough exploration overnight (Schemathesis high count)
The two tools are genuinely complementary. Dredd finds "the code doesn't match the docs." Schemathesis finds "the docs allow inputs the code can't handle." Together they cover the full API quality surface.
Recommendation by Team Type
Small team, Python shop: Start with Schemathesis. It's the easier investment and finds higher-severity bugs (server crashes). Add Dredd if spec drift becomes a problem.
Large team, multiple services, API contracts between teams: Start with Dredd. Spec conformance is critical for coordination, and the speed makes it viable everywhere.
Security-conscious team: Start with Schemathesis with security checks enabled. It's your best automated first line against injection and overflow bugs.
API-first design culture: Start with Dredd. It enforces the discipline of spec-first development.
Both tools are free, both are well-maintained, and both add real value. The question is which problem is more urgent for your team right now. You can always add the other one later.