Running pytest Tests in Parallel with pytest-xdist
pytest-xdist lets you run tests across multiple CPUs — or even multiple machines — without changing a single test. A 10-minute suite can drop to 2-3 minutes with 4 workers. Here's how to set it up correctly, including the database isolation patterns that trip up most teams.
Installation
pip install pytest-xdist
Basic Parallel Execution
# Use number of available CPUs
pytest -n auto
# Use exactly 4 workers
pytest -n 4
# Use 2 workers on 2-core CI machines
pytest -n 2
-n auto detects available CPUs and spawns that many worker processes. Each worker is a separate Python process: full isolation, but higher startup overhead than threads.
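On shared CI runners, -n auto can start more workers than the machine comfortably sustains. One option is to cap the count yourself and pass the result to -n. The helper below is a hypothetical sketch: pick_worker_count, its max_workers default, and the pick_workers.py filename are illustrative names, not part of pytest-xdist.

```python
import os


def pick_worker_count(max_workers=4, cpu_count=None):
    """Return a worker count between 1 and max_workers.

    cpu_count can be passed explicitly (handy for testing); by default it
    is read from os.cpu_count(), which may return None on some platforms.
    """
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, min(cpus, max_workers))


if __name__ == "__main__":
    # Hypothetical shell usage: pytest -n "$(python pick_workers.py)"
    print(pick_worker_count())
```

Capping the count is a judgment call: a lower number wastes cores on large machines, so it is worth measuring before settling on a value.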
How xdist Distributes Tests
xdist supports several distribution modes (--dist). These four cover most needs:
load (default with -n): Workers pull tests from a queue. Each worker picks the next available test when it finishes its current one. Good for mixed-duration tests, because fast workers pick up more tests automatically.
pytest -n 4 --dist=load
loadscope: Tests in the same module (for test functions) or the same class (for test methods) run on the same worker. Useful when class-level fixtures (scope="class") should be set up on a single worker.
pytest -n 4 --dist=loadscope
loadfile: All tests in the same file run on the same worker. Best when you have file-level fixtures (scope="module") or when test files share state.
pytest -n 4 --dist=loadfile
no: Disable parallelism. Useful for debugging.
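To make the grouping concrete, here is a toy model of how loadfile keeps tests from one file together. It is a simplification under stated assumptions: assign_loadfile is a hypothetical helper, and it hands files out round-robin, whereas the real scheduler hands out work as workers become free.

```python
from collections import defaultdict


def assign_loadfile(test_ids, num_workers):
    """Toy model of --dist=loadfile: every test from a file shares a worker.

    test_ids are pytest node IDs such as "tests/test_a.py::test_one".
    Files are assigned round-robin here purely for illustration.
    """
    # Collect file paths in first-seen order
    files = []
    for test_id in test_ids:
        path = test_id.split("::", 1)[0]
        if path not in files:
            files.append(path)

    # Pin each file to one worker index
    file_to_worker = {path: i % num_workers for i, path in enumerate(files)}

    # Route each test to the worker that owns its file
    assignment = defaultdict(list)
    for test_id in test_ids:
        path = test_id.split("::", 1)[0]
        assignment[file_to_worker[path]].append(test_id)
    return dict(assignment)


tests = ["a.py::t1", "a.py::t2", "b.py::t1", "c.py::t1"]
plan = assign_loadfile(tests, num_workers=2)
# Both tests from a.py land on the same worker, whatever else that worker runs
```

The point of the model is the invariant, not the schedule: a module-scoped fixture in a.py is set up on exactly one worker because no other worker ever receives a test from that file.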
Database Isolation Per Worker
The most common xdist failure: parallel workers share a database and corrupt each other's state.
Pattern 1: Per-worker database
# conftest.py
import os
import subprocess

import pytest


@pytest.fixture(scope="session")
def worker_id():
    """Return a unique ID per xdist worker (gw0, gw1, ...) or 'master' without xdist."""
    # pytest-xdist already provides a worker_id fixture; this override also works
    # when the plugin is not installed.
    return os.environ.get("PYTEST_XDIST_WORKER", "master")


@pytest.fixture(scope="session")
def database_url(worker_id):
    """Each worker gets its own database."""
    db_name = f"test_db_{worker_id}"
    url = f"postgresql://localhost/{db_name}"
    # Create the worker's database; check=False tolerates one left over from an earlier run
    subprocess.run(["createdb", db_name], check=False)
    # Django's --database option takes a settings alias, so this assumes an alias
    # named after db_name is configured
    subprocess.run(["python", "manage.py", "migrate", f"--database={db_name}"], check=True)
    yield url
    # Cleanup (optional: useful in CI, annoying locally)
    subprocess.run(["dropdb", db_name], check=False)
Pattern 2: Transaction rollback
The cleanest option — each test runs in a transaction that's rolled back after the test:
# conftest.py
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker


@pytest.fixture(scope="session")
def engine():
    return create_engine("postgresql://localhost/test_db")


@pytest.fixture(scope="function")
def db_session(engine):
    """Each test gets a rolled-back transaction, so no cleanup is needed."""
    connection = engine.connect()
    transaction = connection.begin()
    Session = sessionmaker(bind=connection)
    session = Session()
    yield session
    session.close()
    transaction.rollback()
    connection.close()
Transaction rollback is fast because nothing is ever committed, and it works well with xdist because each worker process creates its own engine and connection. Code under test that commits or rolls back on the session itself needs extra care; SQLAlchemy 2.0 offers the join_transaction_mode="create_savepoint" session option for that case.
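The rollback idea can be tried without PostgreSQL or SQLAlchemy. The sketch below uses the standard-library sqlite3 module as a stand-in, so it illustrates the pattern rather than the fixture above; run_in_rolled_back_transaction and insert_and_count are hypothetical helper names.

```python
import sqlite3


def make_connection():
    # isolation_level=None puts sqlite3 in autocommit mode, so the
    # explicit BEGIN/ROLLBACK statements below control the transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE users (email TEXT UNIQUE)")
    return conn


def run_in_rolled_back_transaction(conn, test_body):
    """Run test_body inside a transaction that is always rolled back."""
    conn.execute("BEGIN")
    try:
        return test_body(conn)
    finally:
        conn.execute("ROLLBACK")


def insert_and_count(conn):
    # Stand-in for a test that writes a row and inspects the table
    conn.execute("INSERT INTO users VALUES ('a@example.com')")
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


conn = make_connection()
first = run_in_rolled_back_transaction(conn, insert_and_count)
second = run_in_rolled_back_transaction(conn, insert_and_count)
remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Both runs see exactly one row and the second insert does not hit the UNIQUE constraint, because the first insert was rolled back before the second began.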
Pattern 3: Shared database with unique data
When you can't use per-worker DBs or transactions:
import uuid

import pytest


@pytest.fixture
def unique_user(db_session):
    """Create a user with a unique email to avoid conflicts across workers."""
    # User is assumed to be your application's ORM model, imported from your own code
    user = User(
        email=f"test-{uuid.uuid4()}@example.com",
        name="Test User",
    )
    db_session.add(user)
    db_session.commit()
    return user
Fixtures and xdist
xdist workers can't share fixture state — each worker runs its own fixture setup. This matters for expensive fixtures:
# BAD: session-scoped fixtures that need coordination
@pytest.fixture(scope="session")
def loaded_cache():
    # This runs once PER WORKER, not once total
    return load_expensive_data()  # called 4 times with 4 workers
For truly shared setup (once per test run, not per worker), keep scope="session" and guard the setup with a file lock from the filelock package:
import json

import pytest
from filelock import FileLock


@pytest.fixture(scope="session")
def shared_data(tmp_path_factory):
    """Shared across all workers; the expensive setup runs only once."""
    # Under xdist each worker has its own base temp directory; the parent is common to all workers
    fn = tmp_path_factory.getbasetemp().parent / "shared_data.json"
    lock = FileLock(str(fn) + ".lock")
    with lock:
        if not fn.exists():
            # Only the first worker to acquire the lock runs this
            data = expensive_setup()
            fn.write_text(json.dumps(data))
        return json.loads(fn.read_text())
Worker Detection
Check at runtime whether xdist is active:
import os

import pytest


def is_xdist_worker():
    return "PYTEST_XDIST_WORKER" in os.environ


def worker_id():
    return os.environ.get("PYTEST_XDIST_WORKER", "master")


# Skip certain tests when running in parallel
@pytest.mark.skipif(is_xdist_worker(), reason="Cannot run in parallel")
def test_requires_exclusive_resource():
    pass
Or register a serial marker in pytest.ini, apply it to tests with @pytest.mark.serial, and run the two groups separately:
[pytest]
markers =
    serial: mark test as requiring serial execution
# Run serial tests first, then parallel
pytest -m serial
pytest -n 4 -m "not serial"
GitHub Actions with xdist
# Step (goes under the job's steps)
- name: Run pytest in parallel
  run: pytest -n auto --dist=loadfile --junitxml=test-results/junit.xml
  env:
    DATABASE_URL: postgresql://postgres:postgres@localhost/test_db
# Service container (goes at the job level)
services:
  postgres:
    image: postgres:16
    env:
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: test_db
    # Map the port so the job can reach the service on localhost
    ports:
      - 5432:5432
    options: >-
      --health-cmd pg_isready
      --health-interval 10s
      --health-timeout 5s
      --health-retries 5
For very large suites, combine xdist parallelism with GitHub Actions matrix sharding:
strategy:
  matrix:
    group: [1, 2, 3, 4]
steps:
  - name: Run pytest shard
    run: |
      # Split test files manually by group
      ALL_FILES=$(pytest --collect-only -q 2>/dev/null | grep "\.py::" | awk -F'::' '{print $1}' | sort -u)
      TOTAL=$(echo "$ALL_FILES" | wc -l)
      PER_GROUP=$(( (TOTAL + 3) / 4 ))
      GROUP_FILES=$(echo "$ALL_FILES" | sed -n "$(( (${{ matrix.group }} - 1) * PER_GROUP + 1 )),$(( ${{ matrix.group }} * PER_GROUP ))p")
      pytest -n auto $GROUP_FILES --junitxml=test-results/junit-${{ matrix.group }}.xml
Debugging Parallel Failures
Tests that pass serially but fail in parallel usually have one of these causes:
Shared mutable state:
# Confirm it's a parallelism issue (the --randomly-seed option comes from the pytest-randomly plugin)
pytest -n 4 --randomly-seed=12345  # reproduce with specific seed
pytest -n 1 --randomly-seed=12345  # same seed, one worker, so no parallelism
Import-level side effects:
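# BAD: module-level state exists once per process, so each worker has its own copy;
# tests that rely on another test having filled the cache fail under xdist
_cache = {}


def get_cached(key):
    if key not in _cache:
        _cache[key] = expensive_lookup(key)
    return _cache[key]
Temporary file collisions: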
# BAD: same path across workers
with open("/tmp/test_output.txt", "w") as f:
    f.write(result)

# GOOD: unique per test
def test_something(tmp_path):
    output = tmp_path / "output.txt"  # pytest tmp_path is unique per test
    output.write_text(result)
Expected Performance
Rough timings with pytest -n auto on a 4-core machine (actual results depend on the suite):
| Test count | Serial | 4 workers |
|---|---|---|
| 100 | 2 min | 35s |
| 500 | 10 min | 2.5 min |
| 2000 | 45 min | 12 min |
Database setup overhead reduces gains for test suites with expensive fixtures.
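To read the table as speedup and per-worker efficiency, the arithmetic can be sketched as follows. The timings are the table's own figures converted to seconds; speedup and efficiency are illustrative helper names.

```python
def speedup(serial_seconds, parallel_seconds):
    """How many times faster the parallel run is than the serial run."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, workers):
    """Fraction of ideal linear scaling achieved (1.0 means perfect)."""
    return speedup(serial_seconds, parallel_seconds) / workers


# (test count, serial seconds, seconds with 4 workers), taken from the table above
rows = [(100, 120, 35), (500, 600, 150), (2000, 2700, 720)]

for count, serial, parallel in rows:
    s = speedup(serial, parallel)
    e = efficiency(serial, parallel, workers=4)
    print(f"{count} tests: {s:.2f}x speedup, {e:.0%} efficiency")
```

Running the same calculation on your own serial and parallel timings shows how much of the ideal scaling fixture setup and worker startup are costing you.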
Summary
pytest-xdist parallelizes by spawning worker processes: pytest -n 4 or pytest -n auto. Choose --dist=loadfile when you have module-scoped fixtures, and the default --dist=load for independent tests of mixed duration. Isolate database access per worker using separate DBs or transaction rollback. Use the PYTEST_XDIST_WORKER environment variable to detect worker context in fixtures. Start with -n auto and measure: the speedup on CPU-bound tests is substantial.