Distributed Tracing in Tests: Using OpenTelemetry and Jaeger to Debug Microservices
When a microservices integration test fails with a vague 500 error, you have a problem: the error came from somewhere in a chain of four services, passed through an API gateway, and touched a database and a message queue. The stack trace in your test output tells you where the request entered the system. It tells you nothing about where it broke.
Distributed tracing solves this. When you instrument your services with OpenTelemetry and collect traces in Jaeger, every test run produces a detailed map of what happened across every service, in what order, with what data, and where time was spent. Test failures go from "something returned 500" to "the inventory service's database query timed out after 4.2 seconds because a missing index caused a full table scan."
This guide covers how to instrument services with OpenTelemetry, how to assert trace structure in tests, and how to use Jaeger as a debugging tool when tests fail.
Why Tracing Changes How You Test
Traditional test assertions are binary: the request succeeded or it didn't, the response body matched or it didn't. Tracing adds a third axis — how the operation executed. This matters for:
- Debugging flaky tests — A test that fails 1 in 20 runs is often caused by a race condition or intermittent downstream dependency. The trace shows exactly which call was slow or failed.
- Performance regression detection — Assert that the span for a database query is under 50ms. If a code change doubles query time, the trace-based assertion catches it before it ships.
- Verifying architectural constraints — Assert that service A never calls service C directly — it must go through service B. Trace structure makes these architectural rules testable.
- Debugging in staging — When an E2E test fails in a CI pipeline, the trace ID in the test output lets you pull up the full distributed trace in Jaeger instantly, without needing to reproduce locally.
Instrumenting Services with OpenTelemetry
OpenTelemetry provides a vendor-neutral instrumentation API. Here's how to add it to a Node.js microservice:
npm install @opentelemetry/sdk-node \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-http \
  @opentelemetry/resources \
  @opentelemetry/semantic-conventions
Create a tracer setup file that runs before everything else:
// tracing.js: load this before any other imports
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');

const exporter = new OTLPTraceExporter({
  // An explicit `url` is used as-is, so it must include the /v1/traces path
  url: process.env.OTEL_EXPORTER_OTLP_ENDPOINT || 'http://jaeger:4318/v1/traces',
});

const sdk = new NodeSDK({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: process.env.SERVICE_NAME || 'unknown-service',
    [SemanticResourceAttributes.SERVICE_VERSION]: process.env.SERVICE_VERSION || '0.0.0',
  }),
  traceExporter: exporter,
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-http': { enabled: true },
      '@opentelemetry/instrumentation-express': { enabled: true },
      '@opentelemetry/instrumentation-pg': { enabled: true },
      '@opentelemetry/instrumentation-redis': { enabled: true },
    }),
  ],
});

sdk.start();

// Flush pending spans on shutdown, then exit. Registering a SIGTERM handler
// disables Node's default exit, so the process must be ended explicitly.
process.on('SIGTERM', () => {
  sdk.shutdown()
    .then(() => console.log('Tracing terminated'))
    .catch((error) => console.error('Error terminating tracing', error))
    .finally(() => process.exit(0));
});
Load it at startup: node -r ./tracing.js server.js
For custom spans on business-critical operations:
const { trace, SpanStatusCode } = require('@opentelemetry/api');

const tracer = trace.getTracer('order-service', '1.0.0');

async function processOrder(orderId, customerId) {
  // Start a custom span for this business operation
  return tracer.startActiveSpan('processOrder', async (span) => {
    span.setAttribute('order.id', orderId);
    span.setAttribute('customer.id', customerId);
    try {
      const order = await orderRepository.findById(orderId);
      // Nested span for the payment step
      const paymentResult = await tracer.startActiveSpan(
        'chargeCustomer',
        async (paymentSpan) => {
          paymentSpan.setAttribute('payment.amount', order.totalAmount);
          paymentSpan.setAttribute('payment.currency', 'USD');
          try {
            const result = await paymentService.charge(customerId, order.totalAmount);
            paymentSpan.setAttribute('payment.transaction_id', result.transactionId);
            return result;
          } catch (err) {
            paymentSpan.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
            paymentSpan.recordException(err);
            throw err;
          } finally {
            paymentSpan.end();
          }
        }
      );
      span.setAttribute('order.status', 'completed');
      return { success: true, transactionId: paymentResult.transactionId };
    } catch (err) {
      span.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
      span.recordException(err);
      throw err;
    } finally {
      span.end();
    }
  });
}
Running Jaeger in Your Test Environment
For local development and CI, run Jaeger all-in-one (stores traces in memory — fine for testing):
# docker-compose.test.yml
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "6831:6831/udp"  # Thrift compact (legacy agents)
      - "16686:16686"    # UI
      - "4317:4317"      # OTLP gRPC
      - "4318:4318"      # OTLP HTTP
    environment:
      COLLECTOR_OTLP_ENABLED: "true"
      SPAN_STORAGE_TYPE: memory
      MEMORY_MAX_TRACES: 10000
  order-service:
    build: ./order-service
    environment:
      SERVICE_NAME: order-service
      OTEL_EXPORTER_OTLP_ENDPOINT: http://jaeger:4318/v1/traces
    depends_on:
      - jaeger
Querying Traces in Tests
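Jaeger's query service exposes an HTTP JSON API under /api, the same one its UI uses. It is an internal API rather than a versioned public one, so its response shape can change between Jaeger releases. Use it in your tests to assert on trace structure after triggering operations: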
# trace_assertions.py
import json
from typing import Dict, List, Optional

import requests

JAEGER_API = "http://localhost:16686/api"


class TraceAssertor:
    def get_traces(
        self,
        service: str,
        operation: Optional[str] = None,
        tags: Optional[Dict[str, str]] = None,
        lookback: str = "1m",
        limit: int = 10,
    ) -> List[Dict]:
        """Fetch recent traces from Jaeger."""
        params = {
            "service": service,
            "limit": limit,
            "lookback": lookback,
        }
        if operation:
            params["operation"] = operation
        if tags:
            # Jaeger expects the tag filter as a JSON-encoded object
            params["tags"] = json.dumps(tags)
        resp = requests.get(f"{JAEGER_API}/traces", params=params, timeout=10)
        resp.raise_for_status()
        return resp.json().get("data", [])

    def get_spans_by_operation(self, trace: Dict, operation: str) -> List[Dict]:
        """Get all spans in a trace matching an operation name."""
        return [
            span for span in trace["spans"]
            if span["operationName"] == operation
        ]

    def get_span_tags(self, span: Dict) -> Dict[str, str]:
        """Extract tags from a span as a simple dict."""
        return {tag["key"]: tag["value"] for tag in span.get("tags", [])}

    def assert_span_exists(self, trace: Dict, operation: str) -> Dict:
        """Assert that a span with the given operation exists and return it."""
        spans = self.get_spans_by_operation(trace, operation)
        found = [s["operationName"] for s in trace["spans"]]
        assert spans, f"Expected span '{operation}' not found in trace. Found: {found}"
        return spans[0]

    def assert_no_error_spans(self, trace: Dict) -> None:
        """Assert that no spans in the trace have error status."""
        error_spans = [
            span for span in trace["spans"]
            if any(
                tag["key"] == "otel.status_code" and tag["value"] == "ERROR"
                for tag in span.get("tags", [])
            )
        ]
        if error_spans:
            ops = [s["operationName"] for s in error_spans]
            raise AssertionError(f"Found error spans: {ops}")
Now write tests that use these assertions:
# test_order_processing_traces.py
import time

import requests

from trace_assertions import TraceAssertor

GATEWAY_URL = "http://localhost:8080"
assertor = TraceAssertor()


class TestOrderProcessingTraces:
    def test_successful_order_creates_expected_trace_structure(self):
        """
        When an order is placed successfully, the trace should show:
        1. The HTTP POST span from the API gateway
        2. The processOrder business span
        3. The chargeCustomer span
        4. A database write span
        No spans should have error status.
        """
        # Trigger the operation
        resp = requests.post(
            f"{GATEWAY_URL}/api/orders",
            json={"customerId": "cust_123", "items": [{"sku": "PROD-1", "quantity": 1}]},
            headers={"X-Trace-Test": "test_successful_order"},
        )
        assert resp.status_code == 201
        time.sleep(2)  # Allow traces to propagate to Jaeger
        traces = assertor.get_traces(
            service="order-service",
            operation="POST /orders",
            lookback="30s",
            limit=5,
        )
        assert traces, "No traces found for order creation"
        trace = traces[0]
        # Verify business span exists
        process_span = assertor.assert_span_exists(trace, "processOrder")
        span_tags = assertor.get_span_tags(process_span)
        assert "order.id" in span_tags, "processOrder span missing order.id attribute"
        assert "customer.id" in span_tags
        # Verify the payment span exists
        assertor.assert_span_exists(trace, "chargeCustomer")
        # Verify a database query span exists. The pg instrumentation typically
        # appends the SQL command to the span name (e.g. "pg.query:INSERT ..."),
        # so match on the prefix rather than the exact name.
        assert any(
            s["operationName"].startswith("pg.query") for s in trace["spans"]
        ), "Expected a pg.query span in the trace"
        # No errors
        assertor.assert_no_error_spans(trace)

    def test_payment_failure_is_captured_in_trace(self):
        """
        When payment fails, the chargeCustomer span should have error status
        and an exception event, making root cause immediately visible.
        """
        resp = requests.post(
            f"{GATEWAY_URL}/api/orders",
            json={
                "customerId": "cust_invalid_card",
                "items": [{"sku": "PROD-1", "quantity": 1}],
            },
        )
        assert resp.status_code in [402, 500]
        time.sleep(2)
        traces = assertor.get_traces(
            service="order-service",
            operation="POST /orders",
            lookback="30s",
        )
        assert traces
        trace = traces[0]
        payment_spans = assertor.get_spans_by_operation(trace, "chargeCustomer")
        assert payment_spans, "chargeCustomer span should exist even on failure"
        payment_span = payment_spans[0]
        tags = assertor.get_span_tags(payment_span)
        assert tags.get("otel.status_code") == "ERROR", (
            "chargeCustomer span should have ERROR status on payment failure"
        )
        # Verify exception event was recorded
        events = payment_span.get("logs", [])  # Jaeger calls these "logs"
        exception_events = [
            e for e in events
            if any(
                f["key"] == "event" and "exception" in str(f["value"]).lower()
                for f in e.get("fields", [])
            )
        ]
        assert exception_events, "Expected exception event on payment failure span"

    def test_span_duration_within_sla(self):
        """
        The processOrder span should complete within 2 seconds under normal conditions.
        Catches performance regressions before they reach production.
        """
        requests.post(
            f"{GATEWAY_URL}/api/orders",
            json={"customerId": "cust_123", "items": [{"sku": "PROD-1", "quantity": 1}]},
        )
        time.sleep(2)
        traces = assertor.get_traces(service="order-service", operation="POST /orders", lookback="30s")
        assert traces
        trace = traces[0]
        process_span = assertor.assert_span_exists(trace, "processOrder")
        # Jaeger reports duration in microseconds
        duration_ms = process_span["duration"] / 1000
        assert duration_ms < 2000, (
            f"processOrder took {duration_ms:.0f}ms, expected < 2000ms. "
            f"Check for performance regression."
        )
Trace-Based Testing Patterns
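Parent-child verification is one such pattern. The structure test above only confirms that a chargeCustomer span exists somewhere in the trace; it does not show that the span sits under processOrder. In Jaeger's JSON each span lists its parent in a references array, so nesting can be checked directly. The following is a minimal sketch under that assumption; the helper names (parent_span_id, ancestor_operations, assert_nested_under) are illustrative and not part of any library. It checks ancestry rather than the direct parent, because auto-instrumented spans can sit between two custom spans.

```python
from typing import Dict, List, Optional


def parent_span_id(span: Dict) -> Optional[str]:
    """Return the spanID of the span's CHILD_OF reference, if it has one."""
    for ref in span.get("references", []):
        if ref.get("refType") == "CHILD_OF":
            return ref.get("spanID")
    return None


def ancestor_operations(trace: Dict, span: Dict) -> List[str]:
    """Operation names of the span's ancestors, nearest parent first."""
    spans_by_id = {s["spanID"]: s for s in trace["spans"]}
    names: List[str] = []
    seen = set()
    parent_id = parent_span_id(span)
    # Stop at the root, at a parent missing from the trace, or on a cycle.
    while parent_id in spans_by_id and parent_id not in seen:
        seen.add(parent_id)
        parent = spans_by_id[parent_id]
        names.append(parent["operationName"])
        parent_id = parent_span_id(parent)
    return names


def assert_nested_under(trace: Dict, child_op: str, ancestor_op: str) -> None:
    """Assert every span named child_op has an ancestor named ancestor_op."""
    children = [s for s in trace["spans"] if s["operationName"] == child_op]
    assert children, f"No '{child_op}' span found in trace"
    for child in children:
        ancestors = ancestor_operations(trace, child)
        assert ancestor_op in ancestors, (
            f"'{child_op}' is not nested under '{ancestor_op}'. "
            f"Ancestors: {ancestors}"
        )
```

In the order test, a call such as assert_nested_under(trace, "chargeCustomer", "processOrder") would replace the bare existence check.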
Beyond debugging, traces enable testing patterns that aren't possible with pure HTTP assertions:
Architectural constraint testing: verify that one service is the only gateway to the database and that the services in front of it never query it directly:
def test_frontend_service_never_calls_database_directly():
    """
    Frontend requests must go through order-service, never hit the database directly.
    This enforces the architectural boundary.
    """
    requests.get(f"{GATEWAY_URL}/api/orders?customerId=cust_123")
    time.sleep(2)
    traces = assertor.get_traces(service="frontend", lookback="30s")
    assert traces, "No frontend traces found"
    for trace in traces:
        processes = trace["processes"]
        for span in trace["spans"]:
            # Only inspect spans emitted by the frontend itself. The same trace
            # also contains order-service spans that legitimately query the database.
            if processes[span["processID"]]["serviceName"] != "frontend":
                continue
            operation = span["operationName"]
            assert "pg.query" not in operation, (
                f"Frontend service made direct database call: {operation}"
            )
            assert "SELECT" not in operation.upper()
Fan-out verification: when one request should trigger multiple downstream calls, check that every expected service appears in the trace:
def test_order_creation_notifies_all_downstream_services():
    """
    Creating an order should trigger calls to inventory, payment, and notification services.
    Missing a downstream call indicates a bug in the orchestration logic.
    """
    order_payload = {"customerId": "cust_123", "items": [{"sku": "PROD-1", "quantity": 1}]}
    requests.post(f"{GATEWAY_URL}/api/orders", json=order_payload)
    time.sleep(3)
    traces = assertor.get_traces(service="order-service", lookback="30s")
    assert traces
    trace = traces[0]
    # A span's processID is a key such as "p1"; the service name lives in
    # the trace-level "processes" map.
    services_called = {
        trace["processes"][span["processID"]]["serviceName"]
        for span in trace["spans"]
    }
    required_services = {"inventory-service", "payment-service", "notification-service"}
    missing = required_services - services_called
    assert not missing, f"Order creation did not call: {missing}"
Integrating Trace Assertions into CI
For CI pipelines, print the Jaeger trace URL in test failure output so developers can immediately open it:
# conftest.py
import pytest

JAEGER_UI = "http://your-jaeger-host:16686"


@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()
    if report.when == "call" and report.failed:
        # Try to get the trace ID if the test stored it
        trace_id = getattr(item, "_trace_id", None)
        if trace_id:
            report.sections.append((
                "Jaeger Trace",
                f"View trace: {JAEGER_UI}/trace/{trace_id}",
            ))
Store the trace ID in tests that trigger HTTP calls:
def test_order_creation(request):
    order_payload = {"customerId": "cust_123", "items": [{"sku": "PROD-1", "quantity": 1}]}
    resp = requests.post(f"{GATEWAY_URL}/api/orders", json=order_payload)
    # Save the trace ID for failure reporting. This assumes the gateway is
    # configured to return it in an X-Trace-Id response header.
    trace_id = resp.headers.get("X-Trace-Id")
    if trace_id:
        request.node._trace_id = trace_id
    assert resp.status_code == 201
Getting the Most Out of Trace-Based Testing
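The CI example depends on the gateway returning an X-Trace-Id response header. Where it does not, one alternative is for the test to choose the trace ID itself by sending a W3C traceparent header, which OpenTelemetry SDKs read by default. The sketch below assumes every hop (gateway included) propagates W3C trace context; new_traceparent and jaeger_trace_url are illustrative names, and JAEGER_UI is the same placeholder used in conftest.py.

```python
import secrets
from typing import Dict, Tuple

JAEGER_UI = "http://your-jaeger-host:16686"  # placeholder host, as in conftest.py


def new_traceparent() -> Tuple[str, Dict[str, str]]:
    """Return a fresh trace ID and the matching W3C traceparent header."""
    trace_id = secrets.token_hex(16)   # 16 random bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)   # 8 random bytes -> 16 hex characters
    # Format: version "00", trace ID, parent span ID, flags "01" (sampled)
    return trace_id, {"traceparent": f"00-{trace_id}-{parent_id}-01"}


def jaeger_trace_url(trace_id: str) -> str:
    """Build the Jaeger UI link for a trace ID."""
    return f"{JAEGER_UI}/trace/{trace_id}"


# Usage inside a test (requests and GATEWAY_URL as in the tests above):
#   trace_id, headers = new_traceparent()
#   request.node._trace_id = trace_id   # known before the call is even made
#   resp = requests.post(f"{GATEWAY_URL}/api/orders", json=payload, headers=headers)
```

Because the test never exports a span of its own, Jaeger may flag the first service span as having a missing parent; the trace is still searchable by ID.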
The key discipline is correlating test runs to traces. Pass a unique test identifier as a custom header in every request (X-Test-Run-ID), and add it as a span attribute. Then you can filter Jaeger for exactly the spans generated by a specific test run — essential in shared test environments where multiple tests run concurrently.
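On the test side, that correlation might look like the sketch below. It assumes each service copies the X-Test-Run-ID header onto its spans as an attribute named test.run_id (a hypothetical name; use whatever your services set) and that Jaeger is reachable at the local address from the compose file. It also polls for the trace instead of sleeping for a fixed interval, which tends to be both faster and less flaky. The function names are illustrative.

```python
import json
import time
import urllib.parse
import urllib.request
import uuid
from typing import Callable, Dict, List

JAEGER_API = "http://localhost:16686/api"  # assumed local Jaeger query endpoint
RUN_ID_HEADER = "X-Test-Run-ID"
RUN_ID_ATTRIBUTE = "test.run_id"  # hypothetical span attribute set by the services


def new_run_id() -> str:
    """Generate a unique identifier for one test's requests."""
    return uuid.uuid4().hex


def fetch_traces_for_run(service: str, run_id: str, limit: int = 20) -> List[Dict]:
    """Ask Jaeger for traces of `service` with a span tagged with this run ID."""
    query = urllib.parse.urlencode({
        "service": service,
        "tags": json.dumps({RUN_ID_ATTRIBUTE: run_id}),  # JSON-encoded tag filter
        "limit": limit,
    })
    with urllib.request.urlopen(f"{JAEGER_API}/traces?{query}", timeout=5) as resp:
        return json.load(resp).get("data", [])


def wait_for_traces(
    fetch: Callable[[], List[Dict]],
    timeout_s: float = 10.0,
    interval_s: float = 0.25,
) -> List[Dict]:
    """Call `fetch` until it returns traces or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        traces = fetch()
        if traces:
            return traces
        if time.monotonic() >= deadline:
            raise AssertionError(f"No traces arrived within {timeout_s}s")
        time.sleep(interval_s)


# Usage inside a test:
#   run_id = new_run_id()
#   requests.post(url, json=payload, headers={RUN_ID_HEADER: run_id})
#   traces = wait_for_traces(lambda: fetch_traces_for_run("order-service", run_id))
```

Passing the fetch step in as a callable keeps the polling loop independent of Jaeger, so the same helper works with the TraceAssertor methods shown earlier.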
Start by instrumenting one critical path end-to-end: frontend → API gateway → core service → database. Run your existing integration tests and look at the traces they produce. You'll immediately see things you didn't know were happening — unexpected service calls, unexpectedly slow queries, calls that succeed but take 10x longer than they should.
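A small helper can make that first look more systematic. The sketch below takes one trace in Jaeger's JSON shape (spans with processID, operationName and a duration in microseconds, plus a trace-level processes map) and returns the slowest spans with their owning service. The name slowest_spans is illustrative.

```python
from typing import Dict, List, Tuple


def slowest_spans(trace: Dict, top_n: int = 5) -> List[Tuple[str, str, float]]:
    """Return (service, operation, duration_ms) for the slowest spans in a trace."""
    processes = trace.get("processes", {})
    rows = []
    for span in trace["spans"]:
        # processID is a key like "p1"; resolve it to the service name
        service = processes.get(span["processID"], {}).get("serviceName", "unknown")
        rows.append((service, span["operationName"], span["duration"] / 1000))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:top_n]
```

Printing this list for a handful of traces from an existing test run is often enough to spot a call that should not be there or a query that dominates the request.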
Traces don't replace test assertions — they complement them. When a test fails, the assertion tells you what went wrong; the trace tells you why. Together, they turn microservices debugging from an art into a process.