# Observability-Driven Testing: Shifting Left with Traces and Metrics
The idea: if your system emits traces and metrics in production, those same signals can power your tests in development. Observability-driven testing (ODT) treats telemetry data as first-class test assertions.
## The Problem with Classic Integration Tests
A typical integration test:
```javascript
test('order placement succeeds', async () => {
  const res = await fetch('/orders', { method: 'POST', body: JSON.stringify(order) });
  expect(res.status).toBe(201);
  const body = await res.json();
  expect(body.orderId).toBeDefined();
});
```

This test verifies the interface — the HTTP response. It says nothing about:
- Did inventory actually reserve the stock?
- Did payment call the right gateway?
- Were database writes transactional?
- Is this 100ms slower than last week?
Observability-driven testing adds those assertions.
## What ODT Asserts On
| Signal | Classic Test | ODT Adds |
|---|---|---|
| HTTP response | ✓ Status code, body | — |
| Spans | ✗ | ✓ All services participated, no errors |
| Metrics | ✗ | ✓ Counters incremented, latency within SLO |
| Logs | ✗ | ✓ No ERROR lines, expected audit events emitted |
## The ODT Testing Loop
1. Instrument your app (OpenTelemetry SDK)
2. Run a test collector in CI (Jaeger, Prometheus, Loki)
3. Execute the test scenario (HTTP, browser, CLI)
4. Wait for async export
5. Assert on telemetry — not just the API response
6. Fail the test if signals are wrong

## Setting Up ODT in CI
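Step 1 above assumes the app actually exports telemetry to the CI collectors. A minimal bootstrap sketch using the official OpenTelemetry Node SDK packages; the file name, service name, and auto-instrumentation choice are illustrative, not prescribed by this article:

```javascript
// tracing.js — load before the app starts, e.g. `node -r ./tracing.js server.js`.
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const sdk = new NodeSDK({
  serviceName: 'order-service',
  traceExporter: new OTLPTraceExporter({
    // Jaeger's OTLP/HTTP port, as exposed in the CI service config below
    url: 'http://localhost:4318/v1/traces',
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
```

The same SDK can also export metrics and logs; traces alone are enough to drive the assertions in this article.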
```yaml
# .github/workflows/integration.yml
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    env:
      COLLECTOR_OTLP_ENABLED: "true"
    ports:
      - 16686:16686
      - 4318:4318
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus-test.yml:/etc/prometheus/prometheus.yml
    ports:
      - 9090:9090
```

```yaml
# prometheus-test.yml
global:
  scrape_interval: 5s
scrape_configs:
  - job_name: app
    static_configs:
      - targets: ['app:9464']
```

## Trace Assertions
```javascript
const { jaeger } = require('./test-utils/jaeger');

test('order flow — trace must show all services, no errors', async () => {
  const startMs = Date.now();
  await placeOrder({ sku: 'A', qty: 1 });
  await sleep(800); // allow span export
  const trace = await jaeger.findLatestTrace('order-service', 'POST /orders', startMs);
  const services = trace.serviceNames();
  expect(services).toEqual(
    expect.arrayContaining(['order-service', 'inventory-service', 'payment-service'])
  );
  expect(trace.errorSpans()).toHaveLength(0);
  expect(trace.rootSpanDurationMs()).toBeLessThan(500);
});
```

## Metrics Assertions with Prometheus
```javascript
const { prometheus } = require('./test-utils/prometheus');

test('successful order increments order counter', async () => {
  const before = await prometheus.queryInstant('orders_total{status="success"}');
  await placeOrder({ sku: 'A', qty: 1 });
  await sleep(1000); // allow a scrape interval to pass
  const after = await prometheus.queryInstant('orders_total{status="success"}');
  expect(after - before).toBe(1);
});
```
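The fixed `sleep(...)` waits in these tests trade speed for flakiness: too short and the scrape hasn't happened, too long and the suite crawls. A hedged alternative is a small polling helper (hypothetical, not part of any library) that retries a probe until it returns a truthy value or times out:

```javascript
// test-utils/poll.js — hypothetical helper for waiting on async telemetry export.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Repeatedly calls `probe` until it returns a truthy value, or throws on timeout.
async function pollUntil(probe, { timeoutMs = 5000, intervalMs = 250 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (value) return value;
    if (Date.now() >= deadline) {
      throw new Error(`pollUntil: condition not met within ${timeoutMs}ms`);
    }
    await sleep(intervalMs);
  }
}

module.exports = { sleep, pollUntil };
```

With this, `await sleep(1000)` becomes `await pollUntil(async () => (await prometheus.queryInstant('orders_total{status="success"}')) > before)`, which returns as soon as the scrape lands.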
```javascript
test('payment failure increments failure counter', async () => {
  mockPayment.rejectNext('declined');
  const before = await prometheus.queryInstant('payment_failures_total');
  await placeOrder({ sku: 'A', qty: 1 });
  await sleep(1000);
  const after = await prometheus.queryInstant('payment_failures_total');
  expect(after - before).toBe(1);
});
```

A simple Prometheus query helper:
```javascript
// test-utils/prometheus.js
const PROM_API = 'http://localhost:9090/api/v1';

async function queryInstant(promql) {
  const url = `${PROM_API}/query?query=${encodeURIComponent(promql)}`;
  const res = await fetch(url);
  const json = await res.json();
  const result = json.data.result[0];
  return result ? parseFloat(result.value[1]) : 0;
}

module.exports = { prometheus: { queryInstant } };
```

## Log Assertions
```javascript
const { loki } = require('./test-utils/loki');

test('order audit event is logged', async () => {
  const startMs = Date.now();
  await placeOrder({ userId: 'u1', sku: 'A', qty: 1 });
  await sleep(500);
  const logs = await loki.queryRange(
    `{service="order-service"} |= "ORDER_PLACED" | json`,
    startMs,
    Date.now()
  );
  expect(logs).toHaveLength(1);
  expect(logs[0].fields.userId).toBe('u1');
  expect(logs[0].fields.sku).toBe('A');
});
```
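The `loki` helper is never defined here; a sketch against Loki's `query_range` HTTP API. The `fields` shape assumes labels extracted by the `| json` pipeline stage appear in each result's `stream` object; adjust to your Loki version if needed:

```javascript
// test-utils/loki.js — sketch against Loki's HTTP query_range endpoint.
const LOKI_API = 'http://localhost:3100/loki/api/v1';

// Pure function: flattens a query_range response into { timestampNs, line, fields } entries.
function flattenStreams(json) {
  const out = [];
  for (const stream of json.data.result) {
    for (const [ts, line] of stream.values) {
      out.push({ timestampNs: ts, line, fields: stream.stream });
    }
  }
  return out;
}

async function queryRange(logql, startMs, endMs) {
  const params = new URLSearchParams({
    query: logql,
    start: String(startMs * 1e6), // Loki expects nanosecond timestamps
    end: String(endMs * 1e6),
  });
  const res = await fetch(`${LOKI_API}/query_range?${params}`);
  return flattenStreams(await res.json());
}

module.exports = { loki: { queryRange }, flattenStreams };
```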
```javascript
test('no ERROR logs on successful order', async () => {
  const startMs = Date.now();
  await placeOrder({ userId: 'u1', sku: 'A', qty: 1 });
  await sleep(500);
  const errorLogs = await loki.queryRange(
    `{service="order-service"} |= "ERROR"`,
    startMs,
    Date.now()
  );
  expect(errorLogs).toHaveLength(0);
});
```

## ODT for Performance Regression Detection
This is where ODT pays for itself: catching slowdowns before production.
```javascript
const LATENCY_SLO_MS = {
  'order-service': 200,
  'inventory-service': 50,
  'payment-service': 300,
};

test('all services within latency SLO', async () => {
  const startMs = Date.now();
  await placeOrder({ sku: 'A', qty: 1 });
  await sleep(800);
  const trace = await jaeger.findLatestTrace('order-service', 'POST /orders', startMs);
  const spans = trace.spans();
  for (const [service, sloMs] of Object.entries(LATENCY_SLO_MS)) {
    const serviceSpans = spans.filter((s) => s.service === service);
    serviceSpans.forEach((span) => {
      expect(span.durationMs).toBeLessThan(sloMs);
    });
  }
});
```

Run this test in CI on every PR. When a PR introduces a slow database query, this test catches it before merge.
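The `jaeger` helper these tests rely on is also left undefined. A minimal sketch against Jaeger's UI query API on port 16686 (an unofficial, unversioned API exposed by the all-in-one image); `findLatestTrace` and the wrapper method names are assumptions matching the usage above:

```javascript
// test-utils/jaeger.js — sketch against Jaeger's UI query API (not a stable contract).
const JAEGER_API = 'http://localhost:16686/api';

// Pure wrapper over one raw Jaeger trace object ({ spans, processes }).
function wrapTrace(raw) {
  const serviceOf = (span) => raw.processes[span.processID].serviceName;
  return {
    spans: () =>
      raw.spans.map((s) => ({
        service: serviceOf(s),
        operation: s.operationName,
        durationMs: s.duration / 1000, // Jaeger reports durations in microseconds
        tags: Object.fromEntries(s.tags.map((t) => [t.key, t.value])),
      })),
    serviceNames: () => [...new Set(raw.spans.map(serviceOf))],
    errorSpans: () =>
      raw.spans.filter((s) => s.tags.some((t) => t.key === 'error' && t.value === true)),
    rootSpanDurationMs: () => {
      const root = raw.spans.find((s) => s.references.length === 0);
      return root ? root.duration / 1000 : NaN;
    },
  };
}

async function findLatestTrace(service, operation, startMs) {
  const params = new URLSearchParams({
    service,
    operation,
    start: String(startMs * 1000), // the API expects microseconds
    limit: '1',
  });
  const res = await fetch(`${JAEGER_API}/traces?${params}`);
  const { data } = await res.json();
  if (!data.length) throw new Error(`no trace found for ${service} ${operation}`);
  return wrapTrace(data[0]);
}

module.exports = { jaeger: { findLatestTrace }, wrapTrace };
```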
## ODT for Contract Compliance
Verify that semantic conventions are respected:
```javascript
test('HTTP spans follow OTel semantic conventions', async () => {
  const startMs = Date.now();
  await placeOrder({ sku: 'A', qty: 1 });
  await sleep(800);
  const trace = await jaeger.findLatestTrace('order-service', 'POST /orders', startMs);
  const httpSpans = trace.spans().filter((s) => s.tags['http.method']);
  httpSpans.forEach((span) => {
    // Semantic convention: HTTP spans must carry these attributes
    expect(span.tags['http.method']).toBeDefined();
    expect(span.tags['http.status_code']).toBeDefined();
    expect(span.tags['http.url'] || span.tags['http.route']).toBeDefined();
    expect(span.tags['net.peer.name'] || span.tags['server.address']).toBeDefined();
  });
});
```

## Structuring ODT Tests
Separate ODT assertions into their own describe block:
```javascript
describe('order placement — behavioral', () => {
  test('returns 201 with orderId', async () => { /* ... */ });
  test('returns 400 for invalid sku', async () => { /* ... */ });
});

describe('order placement — observability', () => {
  test('emits complete trace across all services', async () => { /* ... */ });
  test('increments order counter', async () => { /* ... */ });
  test('logs ORDER_PLACED audit event', async () => { /* ... */ });
  test('all spans within latency SLO', async () => { /* ... */ });
});
```

This keeps classical and observability assertions readable and independently runnable.
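One way to make the two suites independently runnable is Jest's `projects` option, so either group can run alone with `npx jest --selectProjects observability`. A sketch; the `*.odt.test.js` file-name convention is an assumption, not something this setup requires:

```javascript
// jest.config.js — split behavioral and observability suites into separate projects.
module.exports = {
  projects: [
    {
      displayName: 'behavioral',
      testMatch: ['<rootDir>/tests/**/*.test.js'],
      testPathIgnorePatterns: ['\\.odt\\.test\\.js$'],
    },
    {
      displayName: 'observability',
      testMatch: ['<rootDir>/tests/**/*.odt.test.js'],
    },
  ],
};
```

This also lets CI run the fast behavioral suite on every push and the collector-backed observability suite only when the Jaeger and Prometheus services are up.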
## Running ODT in HelpMeTest
HelpMeTest E2E scenarios trigger real traces. By pointing HelpMeTest's test environment at your Jaeger + Prometheus stack, you can run behavioral scenarios and telemetry assertions in the same CI pipeline. This validates both "the user can place an order" and "the distributed execution behind that order is correct and fast."
## Summary
Observability-driven testing shifts left the signals you'd normally only see in production:
- Trace assertions — all services participated, no error spans, spans within SLO
- Metrics assertions — counters incremented correctly, histograms within bounds
- Log assertions — audit events emitted, no unexpected ERROR lines
The tooling is lightweight: one Jaeger container, one Prometheus container, and a handful of query helper functions. The payoff is enormous: distributed failures caught in CI, before production, before customers.