Running pytest Tests in Parallel with pytest-xdist

pytest-xdist lets you run tests across multiple CPUs — or even multiple machines — without changing a single test. A 10-minute suite can drop to 2-3 minutes with 4 workers. Here's how to set it up correctly, including the database isolation patterns that trip up most teams.

Installation

pip install pytest-xdist

Basic Parallel Execution

# Use number of available CPUs
pytest -n auto

# Use exactly 4 workers
pytest -n 4

# Use 2 workers on 2-core CI machines
pytest -n 2

-n auto detects available CPUs and spawns that many worker processes. Each worker is a separate Python process — full isolation, but higher startup overhead than threads.

How xdist Distributes Tests

xdist supports several distribution modes (--dist). These four cover most needs:

load (default): Workers pull tests from a queue. Each worker picks the next available test when it finishes its current one. Good for mixed-duration tests — fast workers pick up more tests automatically.

pytest -n 4 --dist=load

loadscope: Tests in the same class or module run on the same worker. Useful when class-level fixtures (scope="class") need to run on one worker.

pytest -n 4 --dist=loadscope

loadfile: All tests in the same file run on the same worker. Best when you have file-level fixtures (scope="module") or when test files share state.

pytest -n 4 --dist=loadfile

no: Disable parallelism. Useful for debugging.
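To see why a pull-based queue suits mixed-duration tests, here is a toy simulation. It is a simplified model, not xdist's actual scheduler, and the durations are made-up numbers. It compares workers pulling from a shared queue against a fixed split decided before the run.

```python
import heapq

def simulate_pull_queue(durations, workers):
    """Greedy model: whichever worker is free first takes the next test."""
    finish_times = [0.0] * workers
    heapq.heapify(finish_times)
    for duration in durations:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + duration)
    return max(finish_times)

def simulate_static_split(durations, workers):
    """Naive model: tests are dealt out round-robin before the run starts."""
    totals = [0.0] * workers
    for index, duration in enumerate(durations):
        totals[index % workers] += duration
    return max(totals)

# Hypothetical durations in seconds: two slow tests among fast ones
durations = [30, 1, 1, 1, 30, 1, 1, 1]

pull_wall_time = simulate_pull_queue(durations, workers=2)
static_wall_time = simulate_static_split(durations, workers=2)
print(f"pull queue: {pull_wall_time}s, static split: {static_wall_time}s")
```

In this model the static split puts both slow tests on the same worker, while the queue lets the other worker absorb the short tests. Grouping modes such as loadfile trade some of that balance for fixture locality.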

Database Isolation Per Worker

The most common xdist failure: parallel workers share a database and corrupt each other's state.

Pattern 1: Per-worker database

# conftest.py
import os
import subprocess

import pytest

@pytest.fixture(scope="session")
def worker_id():
    """Unique ID per xdist worker (gw0, gw1, ...), or 'master' without xdist.

    pytest-xdist already ships a worker_id fixture; this override only keeps
    the suite working when the plugin is not installed.
    """
    return os.environ.get("PYTEST_XDIST_WORKER", "master")

@pytest.fixture(scope="session")
def database_url(worker_id):
    """Each worker gets its own database."""
    db_name = f"test_db_{worker_id}"
    url = f"postgresql://localhost/{db_name}"

    # Create the worker's database; check=False tolerates one left over from an earlier run
    subprocess.run(["createdb", db_name], check=False)
    # Django's --database expects an alias from settings.DATABASES,
    # so this assumes an alias named after each worker database
    subprocess.run(["python", "manage.py", "migrate", f"--database={db_name}"], check=True)

    yield url

    # Cleanup (optional: useful in CI, annoying locally)
    subprocess.run(["dropdb", db_name], check=False)
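If the base connection string comes from configuration rather than being hard-coded, the per-worker name can be derived from it. A minimal sketch using only the standard library; worker_database_url is a hypothetical helper name, and the environ parameter exists only to make it easy to try out:

```python
import os
from urllib.parse import urlsplit, urlunsplit

def worker_database_url(base_url, environ=None):
    """Append the xdist worker ID to the database name in a connection URL."""
    environ = os.environ if environ is None else environ
    worker = environ.get("PYTEST_XDIST_WORKER", "master")
    parts = urlsplit(base_url)
    db_name = parts.path.lstrip("/")
    # Keep scheme, credentials, host and port; change only the database name
    return urlunsplit(parts._replace(path=f"/{db_name}_{worker}"))

example = worker_database_url(
    "postgresql://postgres:postgres@localhost/test_db",
    environ={"PYTEST_XDIST_WORKER": "gw1"},
)
print(example)
```

Without xdist the suffix falls back to master, matching the fixture above.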

Pattern 2: Transaction rollback

The cleanest option — each test runs in a transaction that's rolled back after the test:

# conftest.py
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

@pytest.fixture(scope="session")
def engine():
    return create_engine("postgresql://localhost/test_db")

@pytest.fixture(scope="function")
def db_session(engine):
    """Each test gets a rolled-back transaction — zero cleanup needed."""
    connection = engine.connect()
    transaction = connection.begin()

    Session = sessionmaker(bind=connection)
    session = Session()

    yield session

    session.close()
    transaction.rollback()
    connection.close()

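Transaction rollback is fast (nothing is ever committed, so there is nothing to clean up) and works well with xdist because each worker opens its own connection, and uncommitted changes in one connection are invisible to the others. Concurrent tests can still block each other on row locks or unique constraints if they touch the same rows.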
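The rollback idea does not depend on SQLAlchemy. As an illustration only, here is the same pattern with the standard-library sqlite3 module and an in-memory database; rolled_back is a hypothetical helper, not part of any library:

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def rolled_back(connection):
    """Run the body inside a transaction that is always rolled back."""
    connection.execute("BEGIN")
    try:
        yield connection
    finally:
        connection.execute("ROLLBACK")

# isolation_level=None puts sqlite3 in autocommit mode, so BEGIN/ROLLBACK are explicit
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (email TEXT)")

with rolled_back(conn) as tx:
    tx.execute("INSERT INTO users VALUES ('a@example.com')")
    rows_inside = tx.execute("SELECT COUNT(*) FROM users").fetchone()[0]

rows_after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(rows_inside, rows_after)
conn.close()
```

The row is visible inside the transaction and gone afterwards, which is what the db_session fixture relies on.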

Pattern 3: Shared database with unique data

When you can't use per-worker DBs or transactions:

import uuid

@pytest.fixture
def unique_user(db_session):
    """Creates a user with a unique email to avoid conflicts across workers."""
    user = User(
        email=f"test-{uuid.uuid4()}@example.com",
        name="Test User"
    )
    db_session.add(user)
    db_session.commit()
    return user

Fixtures and xdist

xdist workers can't share fixture state — each worker runs its own fixture setup. This matters for expensive fixtures:

# BAD: session-scoped fixtures that need coordination
@pytest.fixture(scope="session")
def loaded_cache():
    # This runs once PER WORKER, not once total
    return load_expensive_data()  # called 4 times with 4 workers

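For truly shared setup (once per test run, not per worker), combine a session-scoped fixture with a file lock from the third-party filelock package. The lock is not part of xdist itself, but this is the pattern the xdist documentation suggests: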

import json

import pytest
from filelock import FileLock

@pytest.fixture(scope="session")
def shared_data(tmp_path_factory):
    """Shared across all workers — only executed once."""
    fn = tmp_path_factory.getbasetemp().parent / "shared_data.json"
    lock = FileLock(str(fn) + ".lock")

    with lock:
        if not fn.exists():
            # Only the first worker to acquire the lock runs this
            data = expensive_setup()
            fn.write_text(json.dumps(data))

    return json.loads(fn.read_text())
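The same compute-once logic can be tried outside pytest. The sketch below swaps filelock for the standard library's fcntl.flock, which is POSIX-only, so treat it as an illustration of the locking pattern rather than a drop-in replacement. compute_once and producer are hypothetical names.

```python
import fcntl
import json
import os
import tempfile
from pathlib import Path

def compute_once(path, producer):
    """Return cached JSON from path, calling producer() only if the file is missing."""
    path = Path(path)
    lock_path = path.with_name(path.name + ".lock")
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks while another process holds the lock
        try:
            if not path.exists():
                path.write_text(json.dumps(producer()))
            return json.loads(path.read_text())
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)

calls = []

def producer():
    """Stand-in for an expensive setup step; records each invocation."""
    calls.append(1)
    return {"token": "abc"}

with tempfile.TemporaryDirectory() as directory:
    target = os.path.join(directory, "shared_data.json")
    first = compute_once(target, producer)
    second = compute_once(target, producer)  # served from the file

print(first, second, len(calls))
```

The second call finds the file already written and skips the producer, which is the behavior each later worker sees in the fixture above.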

Worker Detection

Check at runtime whether xdist is active:

import os

import pytest

def is_xdist_worker():
    return "PYTEST_XDIST_WORKER" in os.environ

def worker_id():
    return os.environ.get("PYTEST_XDIST_WORKER", "master")

# Skip certain tests when running in parallel
@pytest.mark.skipif(is_xdist_worker(), reason="Cannot run in parallel")
def test_requires_exclusive_resource():
    pass

Or use pytest.ini to mark tests:

[pytest]
markers =
    serial: mark test as requiring serial execution

# Run serial tests first, then parallel
pytest -m serial
pytest -n 4 -m "not serial"
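The PYTEST_XDIST_WORKER value used for worker detection can also be turned into a number, which helps when each worker needs its own port or similar numeric resource. A small sketch: worker_index and worker_port are hypothetical helpers, and the base port of 8000 is an arbitrary assumption.

```python
import os
import re

def worker_index(environ=None):
    """Return N for worker 'gwN', or 0 when xdist is not active."""
    environ = os.environ if environ is None else environ
    name = environ.get("PYTEST_XDIST_WORKER", "master")
    match = re.fullmatch(r"gw(\d+)", name)
    return int(match.group(1)) if match else 0

def worker_port(base_port=8000, environ=None):
    """Give each worker a distinct port by offsetting from base_port."""
    return base_port + worker_index(environ)

print(worker_port(environ={"PYTEST_XDIST_WORKER": "gw2"}))
```

A session-scoped fixture could call worker_port() once and hand the result to whatever test server it starts.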

GitHub Actions with xdist

# Step (goes under the job's steps)
- name: Run pytest in parallel
  run: pytest -n auto --dist=loadfile --junitxml=test-results/junit.xml
  env:
    DATABASE_URL: postgresql://postgres:postgres@localhost/test_db

# Service container (goes at job level, next to steps)
services:
  postgres:
    image: postgres:16
    env:
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: test_db
    ports:
      - 5432:5432  # lets tests on the runner reach the service at localhost
    options: >-
      --health-cmd pg_isready
      --health-interval 10s
      --health-timeout 5s
      --health-retries 5

For very large suites, combine xdist parallelism with GitHub Actions matrix sharding:

strategy:
  matrix:
    group: [1, 2, 3, 4]

steps:
  - name: Run pytest shard
    run: |
      # Split test files manually by group
      ALL_FILES=$(pytest --collect-only -q 2>/dev/null | grep "\.py::" | awk -F'::' '{print $1}' | sort -u)
      TOTAL=$(echo "$ALL_FILES" | wc -l)
      PER_GROUP=$(( (TOTAL + 3) / 4 ))
      GROUP_FILES=$(echo "$ALL_FILES" | sed -n "$(( (${{ matrix.group }} - 1) * PER_GROUP + 1 )),$(( ${{ matrix.group }} * PER_GROUP ))p")
      pytest -n auto $GROUP_FILES --junitxml=test-results/junit-${{ matrix.group }}.xml
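The shell arithmetic above is easier to check in Python. This sketch mirrors the same slicing, with ceiling division and 1-based group numbers. The shard function and the file names are hypothetical.

```python
def shard(files, group, total_groups):
    """Return the slice of test files assigned to a 1-based group number."""
    ordered = sorted(set(files))
    per_group = -(-len(ordered) // total_groups)  # ceiling division
    start = (group - 1) * per_group
    return ordered[start:start + per_group]

# Hypothetical test files
files = [f"tests/test_{i}.py" for i in range(10)]
groups = [shard(files, group, 4) for group in range(1, 5)]

for number, group_files in enumerate(groups, start=1):
    print(number, len(group_files))
```

Splitting by file count ignores how long each file takes, so shards can finish at quite different times; treat it as a starting point.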

Debugging Parallel Failures

Tests that pass serially but fail in parallel usually have one of these causes:

Shared mutable state:

# Confirm it's a parallelism issue (--randomly-seed requires the pytest-randomly plugin)
pytest -n 4 --randomly-seed=12345  # reproduce with a specific seed
pytest -n 1 --randomly-seed=12345  # same seed, single worker

Import-level side effects:

# BAD: module-level cache. Each worker process imports its own copy,
# so a test that relies on another test having filled it breaks under xdist
_cache = {}

def get_cached(key):
    if key not in _cache:
        _cache[key] = expensive_lookup(key)
    return _cache[key]

Temporary file collisions:

# BAD: same path across workers
with open("/tmp/test_output.txt", "w") as f:
    f.write(result)

# GOOD: unique per test
def test_something(tmp_path):
    output = tmp_path / "output.txt"  # pytest tmp_path is unique per test
    output.write_text(result)

Expected Performance

With pytest-xdist -n auto on a 4-core machine:

Test count    Serial    4 workers
100           2 min     35 s
500           10 min    2.5 min
2000          45 min    12 min

Database setup overhead reduces gains for test suites with expensive fixtures.
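To put the table in perspective, speedup is serial time divided by parallel time, and efficiency is speedup divided by the worker count. A quick calculation on the figures above, converted to seconds:

```python
WORKERS = 4

# (test count, serial seconds, parallel seconds) from the table above
ROWS = [(100, 120, 35), (500, 600, 150), (2000, 2700, 720)]

def speedup(serial_s, parallel_s):
    """How many times faster the parallel run is."""
    return serial_s / parallel_s

def efficiency(serial_s, parallel_s, workers=WORKERS):
    """Fraction of the ideal linear speedup actually achieved."""
    return speedup(serial_s, parallel_s) / workers

for count, serial_s, parallel_s in ROWS:
    print(
        f"{count:>5} tests: {speedup(serial_s, parallel_s):.2f}x speedup, "
        f"{efficiency(serial_s, parallel_s):.0%} efficiency"
    )
```

Running the same calculation on your own timings shows how much per-worker setup is costing you.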

Summary

pytest-xdist parallelizes by spawning worker processes: pytest -n 4 or pytest -n auto. Choose --dist=loadfile when you have module-scoped fixtures and --dist=load for independent tests of mixed duration. Isolate database access per worker using separate databases or transaction rollback. Use the PYTEST_XDIST_WORKER environment variable to detect worker context in fixtures. Start with -n auto and measure: the speedup on CPU-bound tests is substantial.
