Property-Based Testing: Find Bugs Faster with Generated Inputs

Property-Based Testing: Find Bugs Faster with Generated Inputs

Traditional unit tests are example-based: you write specific inputs and assert specific outputs. Property-based testing flips this approach — you describe what should always be true, and the framework generates hundreds of test cases to try to break it. This often finds bugs that hand-written examples miss.

The Core Idea

Instead of:

def test_sort():
    assert sort([3, 1, 2]) == [1, 2, 3]  # one specific case

You write a property:

def test_sort_properties(lst):
    result = sort(lst)
    assert len(result) == len(lst)             # same length
    assert all(result[i] <= result[i+1] for i in range(len(result)-1))  # sorted
    assert sorted(result) == sorted(lst)       # same elements

The framework then generates thousands of inputs — empty lists, single elements, duplicates, very large lists, negative numbers — and verifies each property holds. When it finds a failing case, it shrinks it to the smallest reproducing example.

Why Property-Based Testing Finds Real Bugs

Example-based tests reflect what you already thought of. Property-based tests systematically explore input space, including:

  • Empty inputs ("", [], {})
  • Boundary values (0, -1, Integer.MAX_VALUE)
  • Unicode and special characters in strings
  • NaN and Infinity in floats
  • Deeply nested or very large structures

A common result: the first time you run property-based tests on existing code, you find bugs you didn't know existed.

What Makes a Good Property?

Properties describe invariants — things that must always be true regardless of input. Common patterns:

Round-trip properties

Encode and decode should return the original:

decoded(encoded(x)) == x

Commutativity

Order shouldn't matter:

add(a, b) == add(b, a)

Associativity

Grouping shouldn't matter:

add(add(a, b), c) == add(a, add(b, c))

Idempotence

Running twice equals running once:

sort(sort(x)) == sort(x)
normalize(normalize(s)) == normalize(s)

Preservation

Something should be unchanged by an operation:

# Sort preserves elements
set(sort(x)) == set(x)

# Filter reduces length
len(filter(pred, x)) <= len(x)

Consistency with a simpler model

# Fast path should match slow path
fast_compute(x) == slow_compute(x)

Oracle-based

Compare against a trusted reference implementation:

our_sort(x) == python_builtin_sort(x)

Property-Based Testing Tools

Language Library Notes
Python Hypothesis Most mature, excellent shrinking
JavaScript/TypeScript fast-check Full-featured, TypeScript-native
Haskell QuickCheck The original
Java jqwik Modern, annotation-based
Scala ScalaCheck Integrated with Scala ecosystem
Go gopter Property testing for Go
Rust proptest Widely used in Rust

Python: Hypothesis

from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_sort_is_idempotent(lst):
    assert sorted(sorted(lst)) == sorted(lst)

@given(st.integers(), st.integers())
def test_add_is_commutative(a, b):
    assert add(a, b) == add(b, a)

@given(st.text())
def test_normalize_is_idempotent(s):
    assert normalize(normalize(s)) == normalize(s)

Hypothesis automatically:

  • Generates varied inputs across runs
  • Remembers previously failing inputs (database)
  • Shrinks failures to minimal examples

JavaScript: fast-check

import fc from 'fast-check';

test('sort idempotent', () => {
  fc.assert(
    fc.property(fc.array(fc.integer()), (arr) => {
      const once = [...arr].sort((a, b) => a - b);
      const twice = [...once].sort((a, b) => a - b);
      expect(twice).toEqual(once);
    })
  );
});

test('string encoding round-trip', () => {
  fc.assert(
    fc.property(fc.string(), (s) => {
      expect(decode(encode(s))).toBe(s);
    })
  );
});

Shrinking: The Secret Weapon

When property-based testing finds a failure, it shrinks the input to the smallest case that still fails. Instead of:

Failing input: [847, -23, 0, 1000000, -1, 42, 7, -999]

You get:

Minimized input: [1, -1]

This makes property-based test failures as easy to debug as hand-written test failures. Shrinking is why property-based testing is practical — without it, you'd get inscrutable 10,000-element failure cases.

A Real Example: URL Parser

from hypothesis import given, strategies as st
from urllib.parse import urlparse, urlunparse

@given(st.from_regex(r'https?://[a-z]{3,10}\.[a-z]{2,4}(/[a-z/]*)?', fullmatch=True))
def test_url_round_trip(url):
    parsed = urlparse(url)
    reconstructed = urlunparse(parsed)
    # Parsing and reconstructing should give equivalent URLs
    assert urlparse(reconstructed) == parsed

Or testing an API client:

@given(
    st.text(min_size=1, max_size=100),  # username
    st.emails(),                         # email
    st.integers(min_value=13, max_value=120),  # age
)
def test_user_creation(username, email, age):
    user = create_user(username=username, email=email, age=age)
    assert user.id is not None
    assert user.username == username.strip()
    assert user.email == email.lower()
    assert user.age == age

Stateful Property Testing

Property-based testing isn't limited to pure functions. You can model stateful systems:

from hypothesis.stateful import RuleBasedStateMachine, rule, initialize

class CartMachine(RuleBasedStateMachine):
    @initialize()
    def create_cart(self):
        self.cart = Cart()
        self.expected_count = 0

    @rule(item=st.builds(Item, price=st.decimals(min_value=0.01, max_value=999)))
    def add_item(self, item):
        self.cart.add(item)
        self.expected_count += 1
        assert len(self.cart.items) == self.expected_count

    @rule()
    def clear(self):
        self.cart.clear()
        self.expected_count = 0
        assert len(self.cart.items) == 0

TestCart = CartMachine.TestCase

Hypothesis generates sequences of operations and finds cases where your state machine violates invariants.

Integrating with Existing Tests

Property-based tests complement, not replace, example-based tests:

  • Example tests: document known behavior, specific requirements, regression cases
  • Property tests: explore unknown edge cases, validate invariants at scale

Use both. Start with example tests for specification, add property tests to validate invariants.

CI Integration

Property-based tests run like any other test. They take longer (more cases per test), so consider:

  • Running the full suite (1,000+ examples) only on main branch merges
  • Using hypothesis database to reproduce past failures locally

Setting explicit max_examples for CI (lower number, faster runs):

@settings(max_examples=100)
@given(st.integers())
def test_something(n): ...

Beyond Unit Testing

Property-based testing finds bugs in your algorithms and business logic. For continuous verification that your application works end-to-end in production — user journeys, integrations, UI behavior — HelpMeTest provides AI-powered functional testing with 24/7 monitoring.

Start free with HelpMeTest — 10 tests, no code required.

Read more