Testing

Database Testing Guide: Pytest, SQLAlchemy, and Beyond

HelpMeTest

13 May 2026 — 7 min read

Database bugs are among the hardest to catch and the most expensive to fix. Data corruption, failed migrations, and broken queries don't always manifest in application tests — they often surface in production, after the fact.

This guide covers database testing from the ground up: writing unit tests for database operations, setting up proper test isolation, testing schema migrations, and monitoring production database health.

Why Database Testing Deserves Its Own Strategy

Application code tests verify behavior. Database tests verify data integrity — a fundamentally different concern.

What application tests miss:

Constraint violations that only appear with real data
Index performance degradation under load
Migration failures on databases with existing data
Cascading deletions that corrupt related records
Race conditions in concurrent writes
Null handling in aggregate queries

What can go wrong without database testing:

A migration runs fine on an empty staging database, fails on production with 10 million rows
A query returns correct results in unit tests (SQLite) but fails silently in production (PostgreSQL) due to different NULL semantics
A foreign key constraint that was never enforced in tests gets violated in production

The solution is to test databases the way you test code: with isolation, repeatability, and real assertions.

Setting Up Pytest for Database Testing

Project Structure

myapp/
├── src/
│   ├── models.py       # SQLAlchemy models
│   ├── database.py     # Engine and session setup
│   └── repositories/
│       ├── user_repo.py
│       └── order_repo.py
├── tests/
│   ├── conftest.py     # Shared fixtures
│   ├── test_user_repo.py
│   ├── test_order_repo.py
│   └── test_migrations.py
└── alembic/
    └── versions/       # Migration files

Core Fixtures in conftest.py

The most important design decision: use rollback-based isolation instead of creating and dropping databases per test. This is 5–10x faster and more reliable.

# tests/conftest.py
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from src.models import Base
from src.database import get_session

# Use an in-memory SQLite database for unit tests
TEST_DATABASE_URL = "sqlite:///:memory:"
# For integration tests that need PostgreSQL behavior:
# TEST_DATABASE_URL = "postgresql://user:pass@localhost:5432/testdb"

@pytest.fixture(scope="session")
def engine():
    """Create the test database engine once per test session."""
    engine = create_engine(TEST_DATABASE_URL)
    Base.metadata.create_all(engine)
    yield engine
    Base.metadata.drop_all(engine)
    engine.dispose()


@pytest.fixture(scope="function")
def db_session(engine):
    """
    Create a fresh database session for each test, with automatic rollback.
    
    This wraps each test in a transaction that gets rolled back at the end,
    keeping tests isolated without the overhead of recreating the schema.
    """
    connection = engine.connect()
    transaction = connection.begin()
    Session = sessionmaker(bind=connection)
    session = Session()
    
    yield session
    
    session.close()
    transaction.rollback()
    connection.close()

Every test gets a clean database state. No cleanup code needed. No test pollution.

Why Not Drop and Recreate?

# ❌ Slow — recreates schema for every test
@pytest.fixture
def db_session():
    Base.metadata.create_all(engine)
    session = Session()
    yield session
    session.close()
    Base.metadata.drop_all(engine)

# ✅ Fast — wraps in transaction, rolls back
@pytest.fixture
def db_session(engine):
    connection = engine.connect()
    transaction = connection.begin()
    session = Session(bind=connection)
    yield session
    session.close()
    transaction.rollback()
    connection.close()

A test suite that takes 112 minutes with drop/recreate takes 14 minutes with rollback fixtures — an 8x speedup from one architecture change.

Testing Database Models

CRUD Operations

# tests/test_user_repo.py
from src.repositories.user_repo import UserRepository
from src.models import User

def test_create_user(db_session):
    repo = UserRepository(db_session)
    user = repo.create(email="test@example.com", name="Test User")
    
    assert user.id is not None
    assert user.email == "test@example.com"
    assert user.created_at is not None


def test_get_user_by_email(db_session):
    repo = UserRepository(db_session)
    repo.create(email="findme@example.com", name="Find Me")
    
    found = repo.get_by_email("findme@example.com")
    
    assert found is not None
    assert found.name == "Find Me"


def test_get_nonexistent_user_returns_none(db_session):
    repo = UserRepository(db_session)
    
    result = repo.get_by_email("nobody@example.com")
    
    assert result is None


def test_update_user(db_session):
    repo = UserRepository(db_session)
    user = repo.create(email="update@example.com", name="Old Name")
    
    updated = repo.update(user.id, name="New Name")
    
    assert updated.name == "New Name"
    # Verify the original object was also updated (same session)
    assert user.name == "New Name"


def test_delete_user(db_session):
    repo = UserRepository(db_session)
    user = repo.create(email="delete@example.com", name="Delete Me")
    user_id = user.id
    
    repo.delete(user_id)
    
    assert repo.get_by_id(user_id) is None

Testing Constraints

def test_duplicate_email_raises_integrity_error(db_session):
    repo = UserRepository(db_session)
    repo.create(email="duplicate@example.com", name="First")
    
    with pytest.raises(IntegrityError):
        repo.create(email="duplicate@example.com", name="Second")


def test_email_cannot_be_null(db_session):
    with pytest.raises((IntegrityError, ValidationError)):
        User(email=None, name="No Email")
        db_session.flush()


def test_cascade_delete_removes_orders(db_session):
    # Create user with orders
    user = User(email="cascade@example.com", name="Cascade Test")
    db_session.add(user)
    db_session.flush()
    
    order = Order(user_id=user.id, total=99.99)
    db_session.add(order)
    db_session.flush()
    
    # Delete user
    db_session.delete(user)
    db_session.flush()
    
    # Orders should be gone too
    remaining_orders = db_session.query(Order).filter_by(user_id=user.id).all()
    assert len(remaining_orders) == 0

Testing Queries and Aggregates

def test_count_active_users(db_session):
    repo = UserRepository(db_session)
    repo.create(email="active1@example.com", name="Active 1", active=True)
    repo.create(email="active2@example.com", name="Active 2", active=True)
    repo.create(email="inactive@example.com", name="Inactive", active=False)
    
    count = repo.count_active()
    
    assert count == 2


def test_get_orders_in_date_range(db_session):
    # Create orders across different dates
    create_order(db_session, date=date(2026, 1, 1), total=10.00)
    create_order(db_session, date=date(2026, 3, 15), total=25.00)
    create_order(db_session, date=date(2026, 6, 30), total=50.00)
    
    orders = OrderRepository(db_session).get_in_range(
        start=date(2026, 1, 1),
        end=date(2026, 4, 1)
    )
    
    assert len(orders) == 2
    assert sum(o.total for o in orders) == 35.00

Testing Database Migrations

Migration testing is where most teams have gaps. A migration that works on a clean database often fails on a database with years of production data.

Testing Migration Up/Down

# tests/test_migrations.py
from alembic.config import Config
from alembic import command
from alembic.runtime.migration import MigrationContext
from alembic.script import ScriptDirectory

@pytest.fixture
def migration_engine():
    """Separate engine for migration tests — needs DDL operations."""
    engine = create_engine("postgresql://user:pass@localhost:5432/migration_test")
    yield engine
    engine.dispose()


def test_migration_up_and_down(migration_engine):
    """Verify the latest migration can be applied and rolled back."""
    alembic_cfg = Config("alembic.ini")
    alembic_cfg.set_main_option("sqlalchemy.url", str(migration_engine.url))
    
    # Apply migration
    command.upgrade(alembic_cfg, "head")
    
    # Verify schema is as expected
    with migration_engine.connect() as conn:
        context = MigrationContext.configure(conn)
        current_rev = context.get_current_revision()
        assert current_rev is not None
    
    # Roll back
    command.downgrade(alembic_cfg, "-1")
    
    # Verify rollback worked
    with migration_engine.connect() as conn:
        context = MigrationContext.configure(conn)
        rolled_back_rev = context.get_current_revision()
        assert rolled_back_rev != current_rev


def test_migration_with_existing_data(migration_engine):
    """Verify migration handles existing data correctly."""
    # Create state before migration
    with migration_engine.connect() as conn:
        conn.execute(text("""
            INSERT INTO users (email, name) 
            VALUES ('existing@example.com', 'Existing User')
        """))
        conn.commit()
    
    # Apply migration
    alembic_cfg = Config("alembic.ini")
    command.upgrade(alembic_cfg, "head")
    
    # Verify existing data survived
    with migration_engine.connect() as conn:
        result = conn.execute(text("SELECT COUNT(*) FROM users")).scalar()
        assert result == 1


def test_no_pending_migrations():
    """CI check: fail if there are unapplied migrations."""
    alembic_cfg = Config("alembic.ini")
    script = ScriptDirectory.from_config(alembic_cfg)
    
    with engine.connect() as conn:
        context = MigrationContext.configure(conn)
        current_heads = set(context.get_current_heads())
        script_heads = set(script.get_heads())
        
        assert current_heads == script_heads, (
            f"Unapplied migrations: {script_heads - current_heads}"
        )

Testing Migrations in CI

Add migration testing to your CI pipeline:

# .github/workflows/migrations.yml
name: Migration Tests
on: [push, pull_request]

jobs:
  test-migrations:
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: testpass
          POSTGRES_DB: migration_test
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v3
      - name: Test migrations apply cleanly
        run: |
          alembic upgrade head
          python -m pytest tests/test_migrations.py -v

SQLite vs PostgreSQL in Tests

SQLite is fast and convenient for unit tests. PostgreSQL is what runs in production. Use both appropriately:

Use case	Database	Why
Unit tests (repository logic)	SQLite in-memory	Fast, isolated
Integration tests (constraints, transactions)	PostgreSQL	Real behavior
Migration tests	PostgreSQL	DDL differences matter
Performance tests	PostgreSQL	Explain plans, indexes

SQLite differences that bite teams:

AUTOINCREMENT behavior differs
STRICT mode not default
Full-text search works differently
NOW() has different precision
No RETURNING clause support (before SQLite 3.35)
NULL comparisons behave differently in some edge cases

For tests that validate business logic: SQLite is fine. For tests that validate data integrity and constraints: use PostgreSQL.

Testing Async Database Code

FastAPI and modern Python apps increasingly use async database access. Testing async code requires async fixtures:

# For SQLAlchemy async
import pytest
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
from sqlalchemy.orm import sessionmaker

@pytest.fixture
async def async_session():
    engine = create_async_engine("sqlite+aiosqlite:///:memory:")
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)
    
    AsyncSessionLocal = sessionmaker(
        engine, class_=AsyncSession, expire_on_commit=False
    )
    
    async with AsyncSessionLocal() as session:
        yield session
    
    await engine.dispose()


@pytest.mark.asyncio
async def test_async_create_user(async_session):
    repo = AsyncUserRepository(async_session)
    user = await repo.create(email="async@example.com", name="Async User")
    
    assert user.id is not None

Install pytest-asyncio and configure it:

# pytest.ini
[pytest]
asyncio_mode = auto

Monitoring Database Health in Production

Automated tests tell you your database logic is correct. Monitoring tells you your database is healthy.

Key database health signals to monitor:

Query performance

# Monitor slow query percentage
helpmetest health db-slow-queries 5m

Connection pool Monitor connection pool exhaustion — it manifests as application timeouts, not database errors.

Replication lag For read replicas, monitor replication lag. Stale reads cause subtle bugs that are hard to reproduce.

Disk usage Database storage fills up suddenly when data volumes spike. Monitor disk usage trends, not just current state.

Failed jobs Background jobs that write to the database often fail silently. Monitor your job queue for failed runs.

Property-Based Testing for Databases

For complex queries, use property-based testing to verify invariants:

from hypothesis import given, strategies as st
from hypothesis.extra.django import TestCase

class TestOrderCalculations(TestCase):
    @given(
        quantities=st.lists(st.integers(min_value=1, max_value=100), min_size=1, max_size=10),
        prices=st.lists(st.decimals(min_value="0.01", max_value="999.99"), min_size=1, max_size=10)
    )
    def test_order_total_equals_sum_of_line_items(self, quantities, prices):
        # Pair quantities with prices (use shortest list)
        line_items = list(zip(quantities, prices))
        
        order = Order.create(line_items=line_items)
        
        expected_total = sum(q * p for q, p in line_items)
        assert order.total == pytest.approx(float(expected_total), rel=1e-5)

Hypothesis generates hundreds of combinations automatically, finding edge cases that hand-written tests miss.

Summary: Database Testing Checklist

Unit tests (run in CI on every PR):

CRUD operations for each repository
Constraint violation tests
Edge cases (empty results, null values, boundary conditions)
Business logic in aggregates and queries

Integration tests (run in CI on every PR):

Tests use PostgreSQL (not just SQLite)
Concurrent write scenarios
Transaction rollback behavior
Foreign key constraint enforcement

Migration tests (run in CI on every PR):

Latest migration applies cleanly
Latest migration rolls back cleanly
No pending migrations check
Migration with existing data test

Production monitoring:

Slow query monitoring
Connection pool health
Replication lag (if applicable)
Disk usage trending

The database is the one part of your system that doesn't reset between deployments. Bugs in database logic compound over time. Testing it thoroughly is an investment that pays back every time a migration works correctly and every time a query returns the right result.

Monitor your database-backed application in production →