API Synthetic Monitoring: Health Checks, Latency, and Response Validation
API synthetic monitoring is the most cost-effective monitoring you can do. No browser, no JavaScript, no rendering — just send an HTTP request and validate the response. A well-configured API monitor can run every 30 seconds for minimal cost and catch outages within a minute.
This guide covers the full spectrum: from simple ping-style health checks to sophisticated response validation and latency tracking.
Beyond the Status Code: What to Actually Check
The most common mistake in API monitoring is checking only the HTTP status code. A 200 OK from your API means the web server responded — it doesn't mean your application is working correctly.
Consider: your database goes down. Your API might still return 200 with an empty JSON body, or a {"status": "degraded"} response, or a partial response with missing fields. If you're only checking for status == 200, you'll miss this entirely.
What to check, in order of importance:
1. Status code — The baseline. Is the server responding at all?
2. Response time — Is it responding fast enough? A 200 OK in 45 seconds is effectively down.
3. Response body structure — Does the response contain the fields you expect?
4. Response body content — Do the values make sense? (Not just present, but valid)
5. Response headers — Is caching working? Is the correct content type returned?
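The ordering above can be sketched as one function that reports the first check to fail. This is a minimal illustration under stated assumptions: `check_response`, the 2000 ms budget, the `required_fields` default, and the "degraded" content rule are all hypothetical choices for the example, not part of any particular monitoring tool.

```python
def check_response(status_code, elapsed_ms, headers, body,
                   required_fields=("status",), max_ms=2000):
    """Return None if every check passes, else a string naming the first failure.

    Thresholds and field names are illustrative assumptions.
    """
    # 1. Status code: is the server responding at all?
    if status_code != 200:
        return f"status code {status_code}"
    # 2. Response time: a slow 200 is effectively down
    if elapsed_ms > max_ms:
        return f"too slow: {elapsed_ms:.0f}ms"
    # 3. Body structure: are the expected fields there?
    if not isinstance(body, dict):
        return "body is not a JSON object"
    missing = [f for f in required_fields if f not in body]
    if missing:
        return f"missing fields: {missing}"
    # 4. Body content: present is not the same as valid
    if body.get("status") == "degraded":
        return "body reports degraded status"
    # 5. Headers (a plain dict lookup is case-sensitive; requests' own
    #    response.headers object is case-insensitive)
    content_type = headers.get("Content-Type", "")
    if not content_type.startswith("application/json"):
        return f"unexpected content type: {content_type!r}"
    return None
```

Returning on the first failure keeps the alert message focused on the most fundamental problem: there is little point reporting a missing field when the server is not answering at all.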
Simple Health Check Endpoints
The best API monitoring starts with dedicated health check endpoints. If your API doesn't have one, add it — it's worth the 20 minutes.
A good health check endpoint:
// Express.js example
app.get('/health', async (req, res) => {
  const checks = {};
  let status = 'healthy';
  // Check database
  try {
    await db.query('SELECT 1');
    checks.database = 'ok';
  } catch (err) {
    checks.database = 'error';
    status = 'degraded';
  }
  // Check Redis
  try {
    await redis.ping();
    checks.cache = 'ok';
  } catch (err) {
    checks.cache = 'error';
    status = 'degraded';
  }
  // Check external dependencies
  try {
    const stripeResponse = await fetch('https://api.stripe.com/v1/customers', {
      method: 'GET',
      headers: { 'Authorization': `Bearer ${process.env.STRIPE_KEY}` }
    });
    // fetch() only rejects on network failures, so inspect the HTTP status too
    checks.payments = stripeResponse.ok ? 'ok' : 'error';
  } catch (err) {
    checks.payments = 'error';
    // Note: not degrading status here, because payment API outages are Stripe's problem
  }
  res.status(status === 'healthy' ? 200 : 503).json({
    status,
    checks,
    timestamp: new Date().toISOString(),
    version: process.env.APP_VERSION
  });
});
Key points:
- Return 503 when degraded, not 200. This lets monitors check status code.
- Include a timestamp — useful for debugging (is this a cached response?)
- Include version — instantly know which deploy is running when debugging
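On the monitoring side, the body and timestamp can be used to say what broke and whether the response is live. The sketch below assumes the response shape produced by the endpoint above (`checks`, `timestamp`); the function name `interpret_health` and the 60-second staleness limit are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta, timezone

def interpret_health(status_code, body, now=None, max_age=timedelta(seconds=60)):
    """Summarise a /health response as a list of problems (empty list = healthy)."""
    now = now or datetime.now(timezone.utc)
    problems = []
    if status_code != 200:
        problems.append(f"status code {status_code}")
    # Name the failing dependencies so the alert says what broke.
    for name, state in body.get("checks", {}).items():
        if state != "ok":
            problems.append(f"{name} check is {state}")
    # A stale timestamp suggests a cached response rather than a live check.
    raw = body.get("timestamp")
    if raw is None:
        problems.append("timestamp missing")
    else:
        # JavaScript's toISOString() ends in "Z"; normalise it for fromisoformat.
        reported = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        if now - reported > max_age:
            problems.append(f"timestamp is stale: {raw}")
    return problems
```

Note that this also reports a failing `payments` check, even though the endpoint above deliberately keeps returning 200 in that case. Whether that should page anyone is a separate decision.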
Monitor this with HelpMeTest:
helpmetest health api-health-check 30s
The 30-second grace period means you get alerted within 30 seconds of the health check failing to report in. This is appropriate for a real-time API where users feel downtime immediately.
Writing API Monitors with Response Validation
For more sophisticated checks, you need to validate the response body, not just the status code. Here's a structured approach:
import os
import sys
import time
import requests
def monitor_api_endpoint():
    """Monitor the users API endpoint."""
    start = time.perf_counter()  # monotonic clock, unaffected by system clock changes
    try:
        response = requests.get(
            'https://api.example.com/v1/users/me',
            headers={
                'Authorization': f'Bearer {os.environ["MONITOR_API_TOKEN"]}',
                'Accept': 'application/json'
            },
            timeout=10
        )
    except requests.Timeout:
        print("FAIL: Request timed out after 10 seconds")
        sys.exit(1)
    except requests.ConnectionError as e:
        print(f"FAIL: Connection error: {e}")
        sys.exit(1)
    latency = (time.perf_counter() - start) * 1000  # in milliseconds
    # Check status code
    if response.status_code != 200:
        print(f"FAIL: Expected 200, got {response.status_code}")
        print(f"Body: {response.text[:500]}")
        sys.exit(1)
    # Check latency
    if latency > 2000:
        print(f"FAIL: Response too slow: {latency:.0f}ms (threshold: 2000ms)")
        sys.exit(1)
    # Parse and validate body
    try:
        data = response.json()
    except ValueError:
        print("FAIL: Response is not valid JSON")
        print(f"Body: {response.text[:500]}")
        sys.exit(1)
    # Validate required fields
    required_fields = ['id', 'email', 'created_at']
    missing = [f for f in required_fields if f not in data]
    if missing:
        print(f"FAIL: Missing required fields: {missing}")
        sys.exit(1)
    # Validate field types/formats
    if not isinstance(data['id'], str) or not data['id']:
        print(f"FAIL: 'id' field is invalid: {data['id']!r}")
        sys.exit(1)
    print(f"PASS: {latency:.0f}ms, all fields present and valid")
monitor_api_endpoint()
This script:
- Handles connection errors explicitly (not just HTTP errors)
- Measures actual latency from the client's perspective
- Validates JSON structure, not just status code
- Gives specific error messages for each failure mode
Monitoring REST API CRUD Operations
For critical API paths, don't just monitor reads — monitor writes too. A broken write path can be more damaging than a broken read path.
Here's a monitor for a complete CRUD cycle:
import os
import sys
import requests
BASE_URL = 'https://api.example.com/v1'
HEADERS = {'Authorization': f'Bearer {os.environ["MONITOR_API_TOKEN"]}'}
def monitor_crud_cycle():
    created_id = None
    try:
        # CREATE
        create_response = requests.post(
            f'{BASE_URL}/items',
            json={'name': 'synthetic-monitor-test', 'type': 'monitor'},
            headers=HEADERS,
            timeout=10
        )
        assert create_response.status_code == 201, \
            f"Create failed: {create_response.status_code}: {create_response.text}"
        created_id = create_response.json()['id']
        # READ
        read_response = requests.get(
            f'{BASE_URL}/items/{created_id}',
            headers=HEADERS,
            timeout=10
        )
        assert read_response.status_code == 200, \
            f"Read failed: {read_response.status_code}"
        assert read_response.json()['name'] == 'synthetic-monitor-test', \
            "Read returned wrong item name"
        # UPDATE
        update_response = requests.patch(
            f'{BASE_URL}/items/{created_id}',
            json={'name': 'synthetic-monitor-test-updated'},
            headers=HEADERS,
            timeout=10
        )
        assert update_response.status_code == 200, \
            f"Update failed: {update_response.status_code}"
        print(f"PASS: Created {created_id}, read, updated successfully")
    except AssertionError as e:
        print(f"FAIL: {e}")
        sys.exit(1)
    except requests.RequestException as e:
        # Timeouts and connection errors should fail the monitor with a clear message
        print(f"FAIL: Request error: {e}")
        sys.exit(1)
    finally:
        # Always clean up: don't leave test data in production
        if created_id:
            requests.delete(f'{BASE_URL}/items/{created_id}', headers=HEADERS, timeout=10)
monitor_crud_cycle()
The finally block is critical. Your synthetic monitor creates real data in your production database on every run. Without cleanup, you'll accumulate thousands of synthetic-monitor-test records. Always clean up after yourself.
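Cleanup in `finally` can itself fail (for example, if the DELETE request times out), so a periodic sweep for leftovers is a reasonable safety net. The sketch below shows only the selection logic. The record shape (`id`, `name`, `created_at` as an ISO 8601 string with offset), the 10-minute minimum age, and the idea that your API can list items at all are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Matches the names the CRUD monitor writes (including the "-updated" variant).
TEST_NAME_PREFIX = "synthetic-monitor-test"

def find_orphans(items, now=None, min_age=timedelta(minutes=10)):
    """Return ids of synthetic records old enough that no running monitor still owns them."""
    now = now or datetime.now(timezone.utc)
    orphans = []
    for item in items:
        if not item["name"].startswith(TEST_NAME_PREFIX):
            continue  # never touch real user data
        created = datetime.fromisoformat(item["created_at"])
        if now - created >= min_age:
            orphans.append(item["id"])
    return orphans
```

The minimum age matters: without it, the sweep could delete a record that a concurrent monitor run created a moment ago and is still reading.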
Authentication Monitoring Patterns
How you handle authentication in API monitors matters. You have three options:
Option 1: Dedicated monitor API key (best)
Create an API key specifically for your monitors. Set appropriate rate limits. If it's compromised, revoke it without affecting real users. This is the approach shown in the examples above.
Option 2: Test user credentials
For APIs that use user auth tokens, create a dedicated "monitor" user account. Generate a long-lived token for this account. Use it in monitors.
Option 3: Token refresh in monitor
If your API uses short-lived tokens, your monitor needs to refresh them:
import os
import requests
def get_access_token():
    # Some OAuth servers expect a form-encoded body (data=...) rather than JSON;
    # check what your provider accepts.
    response = requests.post(
        'https://auth.example.com/oauth/token',
        json={
            'grant_type': 'client_credentials',
            'client_id': os.environ['MONITOR_CLIENT_ID'],
            'client_secret': os.environ['MONITOR_CLIENT_SECRET']
        },
        timeout=10
    )
    response.raise_for_status()
    return response.json()['access_token']
Cache the token and refresh it when expired. Don't fetch a new token on every monitor run, since that adds latency and unnecessary load to your auth service.
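One way to cache the token is a small wrapper that remembers when it expires. This is a sketch, not a definitive implementation: `TokenCache` and its `fetch` callable are hypothetical, and it assumes your auth server reports a lifetime (commonly an `expires_in` field in the token response) that `fetch` returns alongside the token.

```python
import time

class TokenCache:
    """Caches an access token and refetches it shortly before it expires."""

    def __init__(self, fetch, clock=time.monotonic, safety_margin=30):
        self._fetch = fetch            # callable returning (token, lifetime_in_seconds)
        self._clock = clock            # injectable so the cache can be tested without waiting
        self._margin = safety_margin   # refresh this many seconds early
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Fetch on first use, or once the cached token is (nearly) expired.
        if self._token is None or self._clock() >= self._expires_at:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = self._clock() + lifetime - self._margin
        return self._token
```

An in-memory cache like this only helps when the monitor runs as a long-lived process. If each run is a fresh script invocation, the token would need to be persisted somewhere the next run can read it.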
Latency Baselines and Percentile Monitoring
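Single-point latency measurements lie. An API that responds in 150ms 95% of the time but 8000ms 5% of the time is a problem, yet any single check will most likely land in the fast 95%, and the median looks perfectly healthy. Even the mean (about 540ms) can sit comfortably below a generous alert threshold.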
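To put numbers on that example, the snippet below builds 100 samples with exactly that split. The 2000 ms cutoff used to count "slow" requests is an assumption for illustration. The mean is pulled up to 542.5 ms, which might still pass a 2000 ms threshold, while the median reports 150 ms as if nothing were wrong.

```python
import statistics

# 100 requests: 95 fast ones at 150 ms and 5 slow ones at 8000 ms
samples = [150] * 95 + [8000] * 5

mean_ms = statistics.mean(samples)       # pulled up by the slow tail
median_ms = statistics.median(samples)   # blind to the slow tail
worst_ms = max(samples)
slow_share = sum(1 for s in samples if s > 2000) / len(samples)

print(f"mean={mean_ms}ms median={median_ms}ms worst={worst_ms}ms slow={slow_share:.0%}")
```

Neither summary number reveals that one request in twenty takes eight seconds, which is why the tail percentiles matter.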
When running monitors frequently, track percentiles:
import math
import statistics
latencies = []  # collected over time
def percentile(ordered, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]
def log_latency(ms):
    latencies.append(ms)
    if len(latencies) >= 100:
        ordered = sorted(latencies)
        p50 = statistics.median(ordered)
        p95 = percentile(ordered, 95)
        p99 = percentile(ordered, 99)
        print(f"Latency p50={p50:.0f}ms p95={p95:.0f}ms p99={p99:.0f}ms")
        latencies.clear()
Most monitoring platforms do this for you, but knowing which percentile to alert on matters. p50 alerts catch systemic slowdowns. p95 alerts catch tail latency issues that affect many users. p99 alerts catch extreme outliers that affect a small percentage of users but represent the worst experiences.
A reasonable latency alerting strategy:
- Alert on p95 > 3× baseline (consistent tail latency problem)
- Alert on p50 > 2× baseline (overall slowdown)
- Alert on any single request > 30 seconds (catastrophic failure)
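The three rules above translate directly into a small evaluator. This sketch assumes "baseline" means a separately recorded baseline for each percentile (`baseline_p50`, `baseline_p95`); the function name and message wording are hypothetical.

```python
def latency_alerts(p50, p95, slowest, baseline_p50, baseline_p95):
    """Apply the three alerting rules; all values are in milliseconds."""
    alerts = []
    if p95 > 3 * baseline_p95:
        alerts.append("p95 above 3x baseline: consistent tail latency problem")
    if p50 > 2 * baseline_p50:
        alerts.append("p50 above 2x baseline: overall slowdown")
    if slowest > 30_000:
        alerts.append("single request over 30s: catastrophic failure")
    return alerts
```

For example, `latency_alerts(p50=160, p95=2500, slowest=8000, baseline_p50=150, baseline_p95=400)` flags only the tail rule, because the median is still close to its baseline.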
API Monitoring with HelpMeTest
In HelpMeTest, API monitors run as Robot Framework tests. The pattern is the same — make a request, validate the response — but integrated into the scheduling and alerting infrastructure:
*** Settings ***
Library    RequestsLibrary
Library    Collections
*** Variables ***
${BASE_URL}    https://api.example.com/v1
${API_KEY}    %{MONITOR_API_TOKEN}
*** Test Cases ***
Monitor Users API
    ${headers}=    Create Dictionary    Authorization=Bearer ${API_KEY}
    Create Session    api    ${BASE_URL}    headers=${headers}
    ${response}=    GET On Session    api    /users/me    expected_status=200
    ${latency}=    Evaluate    round($response.elapsed.total_seconds() * 1000)
    Should Be True    ${latency} < 2000
    ...    msg=API too slow: ${latency}ms (threshold: 2000ms)
    ${body}=    Set Variable    ${response.json()}
    Dictionary Should Contain Key    ${body}    id
    Dictionary Should Contain Key    ${body}    email
    Dictionary Should Contain Key    ${body}    created_at
Monitor API Health Endpoint
    Create Session    api    ${BASE_URL}
    ${response}=    GET On Session    api    /health    expected_status=200
    ${body}=    Set Variable    ${response.json()}
    Should Be Equal    ${body}[status]    healthy
    ...    msg=API health degraded: ${body}[checks]
Run this every minute for critical APIs. HelpMeTest handles scheduling, alerting, and storing the pass/fail history so you can see when a degradation started.
What to Monitor: A Checklist
For each API you care about:
- Health check endpoint (if it exists) — status and body validation
- Authentication endpoint — can tokens be obtained?
- Most-called GET endpoint — latency and response structure
- Most critical write endpoint — can data be created?
- Any third-party API integrations your API depends on
- Rate limit headers — are you approaching limits?
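For the rate limit item, a monitor can compute how much headroom is left from the response headers. The header names `X-RateLimit-Limit` and `X-RateLimit-Remaining` are a common convention rather than a standard, so treat them as assumptions and substitute whatever your API or upstream provider actually sends. The 20% warning level is likewise illustrative.

```python
def rate_limit_headroom(headers, warn_below=0.2):
    """Return (fraction_remaining, is_low), or None if the headers are absent or malformed."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
    except (KeyError, ValueError):
        return None  # API does not expose rate limit headers under these names
    if limit <= 0:
        return None
    fraction = remaining / limit
    return fraction, fraction < warn_below
```

Returning `None` instead of raising lets the same monitor run against endpoints that do not report limits at all.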
Start with the health check and the most critical read endpoint. Add write monitoring once those are working reliably. The first handful of monitors you set up will typically catch the majority of your production issues.