Property-Based Testing: Find Bugs Faster with Generated Inputs
Traditional unit tests are example-based: you write specific inputs and assert specific outputs. Property-based testing flips this approach — you describe what should always be true, and the framework generates hundreds of test cases to try to break it. This often finds bugs that hand-written examples miss.
The Core Idea
Instead of:
def test_sort():
assert sort([3, 1, 2]) == [1, 2, 3] # one specific caseYou write a property:
def test_sort_properties(lst):
result = sort(lst)
assert len(result) == len(lst) # same length
assert all(result[i] <= result[i+1] for i in range(len(result)-1)) # sorted
assert sorted(result) == sorted(lst) # same elementsThe framework then generates thousands of inputs — empty lists, single elements, duplicates, very large lists, negative numbers — and verifies each property holds. When it finds a failing case, it shrinks it to the smallest reproducing example.
Why Property-Based Testing Finds Real Bugs
Example-based tests reflect what you already thought of. Property-based tests systematically explore input space, including:
- Empty inputs (
"",[],{}) - Boundary values (
0,-1,Integer.MAX_VALUE) - Unicode and special characters in strings
- NaN and Infinity in floats
- Deeply nested or very large structures
A common result: the first time you run property-based tests on existing code, you find bugs you didn't know existed.
What Makes a Good Property?
Properties describe invariants — things that must always be true regardless of input. Common patterns:
Round-trip properties
Encode and decode should return the original:
decoded(encoded(x)) == xCommutativity
Order shouldn't matter:
add(a, b) == add(b, a)Associativity
Grouping shouldn't matter:
add(add(a, b), c) == add(a, add(b, c))Idempotence
Running twice equals running once:
sort(sort(x)) == sort(x)
normalize(normalize(s)) == normalize(s)Preservation
Something should be unchanged by an operation:
# Sort preserves elements
set(sort(x)) == set(x)
# Filter reduces length
len(filter(pred, x)) <= len(x)Consistency with a simpler model
# Fast path should match slow path
fast_compute(x) == slow_compute(x)Oracle-based
Compare against a trusted reference implementation:
our_sort(x) == python_builtin_sort(x)Property-Based Testing Tools
| Language | Library | Notes |
|---|---|---|
| Python | Hypothesis | Most mature, excellent shrinking |
| JavaScript/TypeScript | fast-check | Full-featured, TypeScript-native |
| Haskell | QuickCheck | The original |
| Java | jqwik | Modern, annotation-based |
| Scala | ScalaCheck | Integrated with Scala ecosystem |
| Go | gopter | Property testing for Go |
| Rust | proptest | Widely used in Rust |
Python: Hypothesis
from hypothesis import given, strategies as st
@given(st.lists(st.integers()))
def test_sort_is_idempotent(lst):
assert sorted(sorted(lst)) == sorted(lst)
@given(st.integers(), st.integers())
def test_add_is_commutative(a, b):
assert add(a, b) == add(b, a)
@given(st.text())
def test_normalize_is_idempotent(s):
assert normalize(normalize(s)) == normalize(s)Hypothesis automatically:
- Generates varied inputs across runs
- Remembers previously failing inputs (database)
- Shrinks failures to minimal examples
JavaScript: fast-check
import fc from 'fast-check';
test('sort idempotent', () => {
fc.assert(
fc.property(fc.array(fc.integer()), (arr) => {
const once = [...arr].sort((a, b) => a - b);
const twice = [...once].sort((a, b) => a - b);
expect(twice).toEqual(once);
})
);
});
test('string encoding round-trip', () => {
fc.assert(
fc.property(fc.string(), (s) => {
expect(decode(encode(s))).toBe(s);
})
);
});Shrinking: The Secret Weapon
When property-based testing finds a failure, it shrinks the input to the smallest case that still fails. Instead of:
Failing input: [847, -23, 0, 1000000, -1, 42, 7, -999]You get:
Minimized input: [1, -1]This makes property-based test failures as easy to debug as hand-written test failures. Shrinking is why property-based testing is practical — without it, you'd get inscrutable 10,000-element failure cases.
A Real Example: URL Parser
from hypothesis import given, strategies as st
from urllib.parse import urlparse, urlunparse
@given(st.from_regex(r'https?://[a-z]{3,10}\.[a-z]{2,4}(/[a-z/]*)?', fullmatch=True))
def test_url_round_trip(url):
parsed = urlparse(url)
reconstructed = urlunparse(parsed)
# Parsing and reconstructing should give equivalent URLs
assert urlparse(reconstructed) == parsedOr testing an API client:
@given(
st.text(min_size=1, max_size=100), # username
st.emails(), # email
st.integers(min_value=13, max_value=120), # age
)
def test_user_creation(username, email, age):
user = create_user(username=username, email=email, age=age)
assert user.id is not None
assert user.username == username.strip()
assert user.email == email.lower()
assert user.age == ageStateful Property Testing
Property-based testing isn't limited to pure functions. You can model stateful systems:
from hypothesis.stateful import RuleBasedStateMachine, rule, initialize
class CartMachine(RuleBasedStateMachine):
@initialize()
def create_cart(self):
self.cart = Cart()
self.expected_count = 0
@rule(item=st.builds(Item, price=st.decimals(min_value=0.01, max_value=999)))
def add_item(self, item):
self.cart.add(item)
self.expected_count += 1
assert len(self.cart.items) == self.expected_count
@rule()
def clear(self):
self.cart.clear()
self.expected_count = 0
assert len(self.cart.items) == 0
TestCart = CartMachine.TestCaseHypothesis generates sequences of operations and finds cases where your state machine violates invariants.
Integrating with Existing Tests
Property-based tests complement, not replace, example-based tests:
- Example tests: document known behavior, specific requirements, regression cases
- Property tests: explore unknown edge cases, validate invariants at scale
Use both. Start with example tests for specification, add property tests to validate invariants.
CI Integration
Property-based tests run like any other test. They take longer (more cases per test), so consider:
- Running the full suite (1,000+ examples) only on main branch merges
- Using hypothesis database to reproduce past failures locally
Setting explicit max_examples for CI (lower number, faster runs):
@settings(max_examples=100)
@given(st.integers())
def test_something(n): ...Beyond Unit Testing
Property-based testing finds bugs in your algorithms and business logic. For continuous verification that your application works end-to-end in production — user journeys, integrations, UI behavior — HelpMeTest provides AI-powered functional testing with 24/7 monitoring.
Start free with HelpMeTest — 10 tests, no code required.