Log-Based Testing: Asserting Log Output in Integration Tests
Logs are observable outputs of your system. Like return values and HTTP responses, log output can be tested—and should be. This guide covers structured log assertions in integration tests, capturing logs in Python and Java, testing log level and context fields, asserting no unexpected errors appear during a test run, and enforcing log schema contracts with JSON Schema.
Key Takeaways
Logs are outputs. Test them. If your code emits a structured log event under certain conditions, that event is part of the contract. Test it like any other output.
The "no unexpected errors" gate is the highest-value log test. A single assertion that checks for ERROR or CRITICAL level logs during an otherwise-clean integration test run catches entire classes of regressions.
JSON Schema log contracts prevent silent schema drift. Downstream log processors, dashboards, and alerts depend on specific field names. Schema tests catch breaking changes before they silently corrupt your log pipeline.
Logs as First-Class Test Assertions
Most teams treat logs as a debugging aid—something you look at after a test fails. But logs are outputs of your system. When code runs a payment, it might emit:
```json
{
  "level": "info",
  "event": "payment.processed",
  "user_id": "u_123",
  "amount": 4900,
  "currency": "USD",
  "provider": "stripe",
  "duration_ms": 143
}
```

That log event is observable, deterministic, and part of the behavior contract. Your log analytics pipeline parses it. Your dashboards count payment.processed events. Your alerting queries the duration_ms field. If the code stops emitting it, or emits it with a renamed field, your monitoring silently breaks.
Log-based testing makes these contracts explicit and machine-verifiable.
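For context, here is one way such an event can be produced with Python's standard logging module. The JsonFormatter below is an illustrative sketch, not a prescribed implementation, and the field list is assumed from the example above; the key point is that custom fields attached via extra={} become attributes on the LogRecord:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Illustrative sketch: render a record plus known custom fields as JSON."""

    CUSTOM_FIELDS = ("event", "user_id", "amount", "currency", "provider", "duration_ms")

    def format(self, record):
        payload = {"level": record.levelname.lower(), "message": record.getMessage()}
        # extra={} fields live as attributes on the record; copy the known ones
        for field in self.CUSTOM_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload)


logger = logging.getLogger("myapp.payments")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Custom fields set via extra={} (asserted later in this guide's tests)
logger.info("payment processed", extra={
    "event": "payment.processed",
    "user_id": "u_123",
    "amount": 4900,
    "currency": "USD",
    "provider": "stripe",
    "duration_ms": 143,
})
```

In real services this role is usually filled by a library such as structlog or python-json-logger; the sketch just makes the record-to-JSON mapping visible.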
Capturing Logs in Python Tests
Python's logging module supports custom handlers, making it straightforward to capture log records during tests.
Using caplog (pytest built-in)
pytest's built-in caplog fixture captures log records at any level:
```python
# tests/test_payment.py
import logging

from myapp.payments import process_payment


def test_payment_emits_processed_log(caplog):
    with caplog.at_level(logging.INFO, logger='myapp.payments'):
        result = process_payment(amount=4900, currency='USD', user_id='u_123')

    assert result['status'] == 'ok'

    # Find the payment.processed log record
    processed_logs = [
        r for r in caplog.records
        if getattr(r, 'event', None) == 'payment.processed'
    ]
    assert len(processed_logs) == 1, \
        f"Expected 1 payment.processed log, got {len(processed_logs)}"

    record = processed_logs[0]
    assert record.levelname == 'INFO'
    assert record.user_id == 'u_123'  # custom field set via extra={}
    assert record.amount == 4900
    assert record.currency == 'USD'
```

For structlog (common in Python services), capture the bound logger's output:
```python
# tests/conftest.py
import pytest
import structlog


@pytest.fixture()
def log_capture():
    """Capture structlog output during tests."""
    captured = []

    def capture_processor(logger, method, event_dict):
        captured.append(dict(event_dict))
        return event_dict

    structlog.configure(
        processors=[
            structlog.processors.add_log_level,  # adds the 'level' key asserted below
            capture_processor,
            structlog.dev.ConsoleRenderer(),
        ]
    )
    yield captured
    structlog.reset_defaults()
```

```python
# tests/test_payment.py
def test_payment_emits_structured_log(log_capture):
    process_payment(amount=1000, currency='EUR', user_id='u_456')

    payment_events = [e for e in log_capture if e.get('event') == 'payment.processed']
    assert len(payment_events) == 1

    event = payment_events[0]
    assert event['level'] == 'info'
    assert event['user_id'] == 'u_456'
    assert event['amount'] == 1000
    assert 'duration_ms' in event
    assert isinstance(event['duration_ms'], int)
```

Custom Handler for Standard Logging
```python
# tests/helpers/log_capture.py
import json
import logging


class JsonLogCapture(logging.Handler):
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        # Works with python-json-logger or structlog JSON output
        try:
            msg = json.loads(record.getMessage())
        except (json.JSONDecodeError, TypeError):
            msg = {'message': record.getMessage(), 'level': record.levelname.lower()}
        self.records.append(msg)

    def get(self, event=None, level=None):
        results = self.records
        if event:
            results = [r for r in results if r.get('event') == event]
        if level:
            results = [r for r in results if r.get('level') == level]
        return results


# Usage in tests
def test_with_json_log_capture():
    handler = JsonLogCapture()
    logger = logging.getLogger('myapp')
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)  # ensure INFO records reach the handler
    try:
        run_some_operation()
    finally:
        logger.removeHandler(handler)
    assert len(handler.get(event='operation.completed')) == 1
```

Capturing Logs in Java / JUnit Tests
Java logging (Logback/SLF4J) supports appenders—equivalent to Python's handlers. Use an in-memory appender in tests.
Logback ListAppender
```java
// src/test/java/helpers/LogCapture.java
import java.util.List;
import java.util.stream.Collectors;

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.Logger;
import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.read.ListAppender;
import org.slf4j.LoggerFactory;

public class LogCapture {
    private final ListAppender<ILoggingEvent> appender;
    private final Logger logger;

    public LogCapture(Class<?> clazz) {
        this.logger = (Logger) LoggerFactory.getLogger(clazz);
        this.appender = new ListAppender<>();
        this.appender.start();
        this.logger.addAppender(appender);
    }

    public List<ILoggingEvent> getEvents() {
        return appender.list;
    }

    public List<ILoggingEvent> getErrors() {
        return appender.list.stream()
            .filter(e -> e.getLevel() == Level.ERROR)
            .collect(Collectors.toList());
    }

    public void stop() {
        logger.detachAppender(appender);
        appender.stop();
    }
}
```
```java
// src/test/java/PaymentServiceTest.java
import java.util.List;
import java.util.Map;

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.spi.ILoggingEvent;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

import static java.util.stream.Collectors.joining;
import static org.junit.jupiter.api.Assertions.*;

class PaymentServiceTest {
    private LogCapture logCapture;
    private PaymentService paymentService;

    @BeforeEach
    void setUp() {
        paymentService = new PaymentService();
        logCapture = new LogCapture(PaymentService.class);
    }

    @AfterEach
    void tearDown() {
        logCapture.stop();
    }

    @Test
    void processPayment_emitsProcessedEvent() {
        paymentService.process(new PaymentRequest("u_123", 4900, "USD"));

        List<ILoggingEvent> events = logCapture.getEvents();
        ILoggingEvent processed = events.stream()
            .filter(e -> e.getMessage().contains("payment.processed"))
            .findFirst()
            .orElse(null);

        assertNotNull(processed, "Expected payment.processed log event");
        assertEquals(Level.INFO, processed.getLevel());

        Map<String, String> mdc = processed.getMDCPropertyMap();
        assertEquals("u_123", mdc.get("userId"));
        assertEquals("USD", mdc.get("currency"));
    }

    @Test
    void processPayment_noErrorsOnSuccess() {
        paymentService.process(new PaymentRequest("u_456", 1000, "EUR"));

        List<ILoggingEvent> errors = logCapture.getErrors();
        assertTrue(errors.isEmpty(),
            "Expected no ERROR logs on successful payment, got: " +
            errors.stream().map(ILoggingEvent::getFormattedMessage).collect(joining(", ")));
    }
}
```

Testing Log Levels and Context Fields
Log level discipline matters. An INFO log for a successful payment is fine. An ERROR log for a handled validation case creates false alert noise. An absent ERROR log when a database write fails hides real problems.
Test level assignments explicitly:
```python
import pytest
from unittest.mock import patch


@pytest.mark.parametrize("scenario, expected_level", [
    ("valid_payment", "info"),
    ("card_declined", "warning"),
    ("db_connection_failed", "error"),
    ("invalid_amount_negative", "warning"),
])
def test_payment_log_levels(scenario, expected_level, log_capture):
    if scenario == "valid_payment":
        process_payment(amount=1000, currency="USD", user_id="u_1")
    elif scenario == "card_declined":
        process_payment(amount=1000, currency="USD", user_id="u_2",
                        simulate="card_declined")
    elif scenario == "db_connection_failed":
        with patch('myapp.db.get_connection', side_effect=ConnectionError):
            with pytest.raises(Exception):
                process_payment(amount=1000, currency="USD", user_id="u_3")
    elif scenario == "invalid_amount_negative":
        with pytest.raises(ValueError):
            process_payment(amount=-1, currency="USD", user_id="u_4")

    assert len(log_capture) > 0
    # Get the final (most specific) log for this operation
    relevant_log = log_capture[-1]
    assert relevant_log['level'] == expected_level, \
        f"Scenario '{scenario}': expected level '{expected_level}', got '{relevant_log['level']}'"
```

Test context fields—the structured fields that make logs queryable:
```python
def test_payment_log_includes_required_context_fields(log_capture):
    process_payment(amount=4900, currency='USD', user_id='u_123',
                    request_id='req_abc')

    payment_log = next(
        (e for e in log_capture if e.get('event') == 'payment.processed'),
        None
    )
    assert payment_log is not None

    required_fields = ['user_id', 'amount', 'currency', 'duration_ms',
                       'request_id', 'provider']
    for field in required_fields:
        assert field in payment_log, \
            f"Missing required log field: {field}"
        assert payment_log[field] is not None, \
            f"Required log field '{field}' is None"
```

Asserting No Unexpected Errors During Integration Tests
The highest-value log test is the simplest: after running a complete integration test scenario, assert that no ERROR or CRITICAL level log events were emitted when none were expected.
```python
# tests/conftest.py
import pytest


@pytest.fixture(autouse=True)
def assert_no_unexpected_errors(log_capture, request):
    """After each test, fail if unexpected ERROR logs were emitted."""
    yield
    # Allow tests to opt out
    if request.node.get_closest_marker('allow_errors'):
        return
    error_logs = [
        e for e in log_capture
        if e.get('level') in ('error', 'critical')
    ]
    if error_logs:
        error_messages = '\n'.join(
            f"  [{e.get('level').upper()}] {e.get('event', e.get('message', '?'))}"
            for e in error_logs
        )
        pytest.fail(
            f"Unexpected ERROR logs during test '{request.node.name}':\n"
            f"{error_messages}"
        )


# Test that opts out explicitly:
@pytest.mark.allow_errors
def test_error_handling_produces_error_log(log_capture):
    with pytest.raises(ValueError):
        process_payment(amount=-1, currency='USD', user_id='u_1')
    error_logs = [e for e in log_capture if e.get('level') == 'error']
    assert len(error_logs) == 1
```

With this fixture as autouse=True, every test in your suite is implicitly checking for unexpected errors in the logs. A refactor that causes a previously silent DB error to start logging at ERROR level will break the relevant tests—even if those tests weren't written specifically to check for log output.
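One setup detail, assuming a standard pytest configuration file: register the allow_errors marker, or recent pytest versions will warn about it with PytestUnknownMarkWarning (and fail the run under --strict-markers):

```ini
; pytest.ini (or the equivalent [tool.pytest.ini_options] table in pyproject.toml)
[pytest]
markers =
    allow_errors: test is expected to emit ERROR-level logs
```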
Log Schema Contracts with JSON Schema Validation
Downstream systems—log processors, Elasticsearch mappings, dashboards—depend on specific field names and types in your log output. JSON Schema validation enforces these contracts.
Define the schema for your log events:
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "PaymentProcessedEvent",
  "type": "object",
  "required": ["level", "event", "user_id", "amount", "currency",
               "provider", "duration_ms", "timestamp"],
  "properties": {
    "level": { "type": "string", "enum": ["info", "warning", "error"] },
    "event": { "type": "string", "const": "payment.processed" },
    "user_id": { "type": "string", "pattern": "^u_[a-z0-9]+$" },
    "amount": { "type": "integer", "minimum": 1 },
    "currency": { "type": "string", "enum": ["USD", "EUR", "GBP"] },
    "provider": { "type": "string", "enum": ["stripe", "adyen"] },
    "duration_ms": { "type": "integer", "minimum": 0 },
    "timestamp": { "type": "string", "format": "date-time" },
    "order_id": { "type": "string" },
    "request_id": { "type": "string" }
  },
  "additionalProperties": true
}
```

Validate log output against the schema in tests:
```python
# tests/test_log_schema.py
import json
import pathlib

import jsonschema
import pytest

SCHEMAS = {
    'payment.processed': json.loads(
        pathlib.Path('schemas/logs/payment-processed.json').read_text()
    ),
    'order.created': json.loads(
        pathlib.Path('schemas/logs/order-created.json').read_text()
    ),
}


def validate_log_event(event: dict):
    event_type = event.get('event')
    if event_type not in SCHEMAS:
        return  # no schema defined, skip validation
    try:
        jsonschema.validate(instance=event, schema=SCHEMAS[event_type])
    except jsonschema.ValidationError as e:
        pytest.fail(
            f"Log event '{event_type}' failed schema validation:\n"
            f"  Field: {'.'.join(str(p) for p in e.absolute_path)}\n"
            f"  Error: {e.message}\n"
            f"  Event: {json.dumps(event, indent=2)}"
        )


def test_payment_processed_log_matches_schema(log_capture):
    process_payment(amount=4900, currency='USD', user_id='u_123')
    for event in log_capture:
        validate_log_event(event)


def test_log_schema_catches_field_rename():
    """Regression test: amount_cents was renamed to amount, breaking downstream."""
    bad_event = {
        "level": "info",
        "event": "payment.processed",
        "user_id": "u_123",
        "amount_cents": 4900,  # wrong field name
        "currency": "USD",
        "provider": "stripe",
        "duration_ms": 143,
        "timestamp": "2026-05-17T10:00:00Z"
    }
    with pytest.raises(pytest.fail.Exception):
        validate_log_event(bad_event)
```

Add schema validation to CI as a contract test step—separate from unit tests—to make it explicit that log schema changes are breaking changes:
```yaml
# .github/workflows/log-contract-tests.yaml
- name: Run log schema contract tests
  run: pytest tests/test_log_schema.py -v --tb=short
```

Combining Log Tests with Integration Test Suites
Log assertions belong alongside functional assertions in integration tests—not in a separate file. A test that exercises the checkout flow should also assert the correct log events were emitted:
```python
def test_full_checkout_flow(client, log_capture):
    # Functional assertions
    response = client.post('/api/checkout', json={
        'cart_id': 'cart_abc',
        'user_id': 'u_789',
        'payment_method': 'pm_test_visa'
    })
    assert response.status_code == 201
    data = response.json()
    assert 'order_id' in data

    # Log assertions — same test, same function call
    order_log = next(
        (e for e in log_capture if e.get('event') == 'order.created'),
        None
    )
    assert order_log is not None, "No order.created log emitted"
    assert order_log['order_id'] == data['order_id']
    assert order_log['user_id'] == 'u_789'
    assert order_log['level'] == 'info'

    payment_log = next(
        (e for e in log_capture if e.get('event') == 'payment.processed'),
        None
    )
    assert payment_log is not None, "No payment.processed log emitted"
    assert payment_log['amount'] > 0

    # No errors
    error_logs = [e for e in log_capture if e.get('level') in ('error', 'critical')]
    assert error_logs == [], f"Unexpected errors: {error_logs}"
```

This approach means your integration test suite simultaneously validates behavior, observability, and the absence of unexpected failures. Log regressions surface in the same test run as functional regressions—where they belong.