API Gateway Testing Guide: Rate Limiting, Auth, Routing, and Observability

An API gateway is one of the most critical components in a modern backend architecture. It handles authentication, rate limiting, routing, request transformation, and more — all before a request reaches your application code. When the gateway misbehaves, everything downstream breaks. Testing it thoroughly is non-negotiable.

This guide covers the full testing spectrum for API gateways: functional correctness, rate limiting behavior, authentication enforcement, routing logic, and observability. The concepts apply regardless of which gateway you use — Kong, AWS API Gateway, Tyk, Envoy, or NGINX.

What to Test in an API Gateway

API gateways fail in predictable ways. Focus your test effort on:

  • Rate limiting — limits enforced correctly per consumer, per IP, per route
  • Authentication — invalid tokens rejected, valid tokens pass, consumer headers propagated
  • Routing — correct service selected for each path/method/host combination
  • Middleware/plugin ordering — auth runs before rate limiting, transformations applied correctly
  • Error responses — correct HTTP status codes and response bodies for each failure mode
  • Health checks — upstream health detection and circuit breaking
  • Observability — metrics exported, logs structured correctly, tracing headers propagated

Test Levels

Level 1: Configuration Validation

Before any request hits the gateway, validate the configuration itself.

# Generic config validation pattern
gateway-cli validate --config gateway.yaml

# Check for common issues
gateway-cli lint --config gateway.yaml --rules security,rate-limits,auth

Configuration tests should run in CI on every pull request that touches gateway config. They're fast, need no running infrastructure, and catch typos and logic errors early.
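
Where a gateway ships no linter of its own, a few structural checks can be sketched in plain Python. The `gateway-cli` commands above are generic placeholders, and the config schema below (`routes`, `path`, `auth`, `public`, `rate_limit.per_minute`) is likewise hypothetical rather than any real gateway's format; adapt the field names to your own configuration.

```python
def lint_routes(config):
    """Return a list of human-readable problems found in a gateway config dict."""
    problems = []
    seen_paths = set()
    for route in config.get("routes", []):
        path = route.get("path", "<missing path>")
        # Two routes with the same path usually mean a copy-paste mistake
        if path in seen_paths:
            problems.append(f"{path}: duplicate route")
        seen_paths.add(path)
        # Every route must either require auth or be explicitly public
        if not route.get("auth") and not route.get("public", False):
            problems.append(f"{path}: no auth plugin and not marked public")
        # A configured rate limit must allow at least one request
        limit = route.get("rate_limit")
        if limit is not None and limit.get("per_minute", 0) <= 0:
            problems.append(f"{path}: rate_limit.per_minute must be positive")
    return problems


# Hypothetical config, as it might look after loading gateway.yaml
example_config = {
    "routes": [
        {"path": "/api/users", "auth": "jwt", "rate_limit": {"per_minute": 60}},
        {"path": "/api/orders", "rate_limit": {"per_minute": 0}},
        {"path": "/health", "public": True},
        {"path": "/api/users", "auth": "jwt"},
    ]
}
problems = lint_routes(example_config)
```

A CI job could fail the build whenever `problems` is non-empty.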

Level 2: Integration Tests

Spin up the gateway with a test configuration and send real HTTP requests. These catch issues that configuration validation misses: plugin interactions, runtime behavior under load, and edge cases in request handling.

# pytest-based gateway integration test
import pytest
import httpx

GATEWAY_URL = "http://localhost:8000"

@pytest.fixture
def client():
    # Close the connection pool once each test finishes
    with httpx.Client(base_url=GATEWAY_URL) as c:
        yield c

def test_unauthenticated_request_rejected(client):
    response = client.get("/api/users")
    assert response.status_code == 401

def test_valid_api_key_passes(client):
    response = client.get("/api/users", headers={"X-API-Key": "test-key-123"})
    assert response.status_code == 200

def test_rate_limit_header_present(client):
    response = client.get("/api/users", headers={"X-API-Key": "test-key-123"})
    assert "X-RateLimit-Limit" in response.headers
    assert "X-RateLimit-Remaining" in response.headers

Level 3: Contract Tests

Verify the gateway preserves request/response contracts between clients and upstream services. The gateway should not silently modify payloads, strip required headers, or change response shapes.

def test_request_id_header_propagated(client):
    """Gateway must forward X-Request-ID to upstream."""
    request_id = "test-req-abc123"
    response = client.get(
        "/api/echo-headers",
        headers={"X-API-Key": "test-key", "X-Request-ID": request_id}
    )
    # The echo service returns the headers it received
    received_headers = response.json()["headers"]
    assert received_headers.get("X-Request-Id") == request_id

def test_upstream_error_not_swallowed(client):
    """Gateway should pass through 500s from upstream, not hide them."""
    response = client.get(
        "/api/force-error",
        headers={"X-API-Key": "test-key"}
    )
    assert response.status_code == 500
    assert "error" in response.json()

Rate Limiting Tests

Rate limiting has more edge cases than most engineers expect. Test all of them.

Basic Enforcement

import time

def test_rate_limit_enforced(client):
    """After N requests, next request returns 429."""
    api_key = "rate-test-key"
    headers = {"X-API-Key": api_key}
    
    # Make requests up to the limit (assume limit is 5/minute in test config)
    for i in range(5):
        r = client.get("/api/limited", headers=headers)
        assert r.status_code == 200, f"Request {i+1} failed unexpectedly"
    
    # 6th request should be blocked
    r = client.get("/api/limited", headers=headers)
    assert r.status_code == 429
    assert "Retry-After" in r.headers

def test_rate_limit_resets_after_window(client):
    """Rate limit counter resets after the window expires."""
    headers = {"X-API-Key": "reset-test-key"}
    
    # Exhaust limit
    for _ in range(5):
        client.get("/api/limited", headers=headers)
    
    # Verify blocked
    assert client.get("/api/limited", headers=headers).status_code == 429
    
    # Wait for the window to reset: 61 s covers the 60 s window assumed above.
    # With a shorter window in the test config, shorten this sleep to match.
    time.sleep(61)
    
    # Should work again
    assert client.get("/api/limited", headers=headers).status_code == 200

Per-Consumer vs Per-IP

A common misconfiguration: rate limits applied globally instead of per-consumer, so one heavy user blocks all others.

def test_rate_limits_are_per_consumer_not_global(client):
    """Consumer A hitting their limit should not affect Consumer B."""
    key_a = "consumer-a-key"
    key_b = "consumer-b-key"
    
    # Exhaust Consumer A's limit
    for _ in range(5):
        client.get("/api/limited", headers={"X-API-Key": key_a})
    
    # Consumer A is blocked
    assert client.get("/api/limited", headers={"X-API-Key": key_a}).status_code == 429
    
    # Consumer B is unaffected
    assert client.get("/api/limited", headers={"X-API-Key": key_b}).status_code == 200
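
To make the expected behavior explicit, it can help to write down a small reference model of the limiter and compare the gateway against it. The sketch below assumes a fixed-window algorithm keyed per consumer; real gateways may use sliding windows or token buckets, and the class name and fake clock are illustrative only.

```python
class FixedWindowLimiter:
    """Reference model: at most `limit` requests per `window` seconds, per key."""

    def __init__(self, limit, window, clock):
        self.limit = limit
        self.window = window
        self.clock = clock      # callable returning the current time in seconds
        self.counters = {}      # (key, window index) -> request count

    def allow(self, key):
        # Requests in the same window for the same key share one counter
        bucket = (key, int(self.clock() // self.window))
        count = self.counters.get(bucket, 0)
        if count >= self.limit:
            return False
        self.counters[bucket] = count + 1
        return True


now = [0.0]  # mutable fake clock, so no real sleeping is needed
limiter = FixedWindowLimiter(limit=5, window=60, clock=lambda: now[0])

results_a = [limiter.allow("consumer-a") for _ in range(6)]  # sixth is blocked
first_b = limiter.allow("consumer-b")                        # separate counter
now[0] = 60.0                                                # next window starts
after_reset = limiter.allow("consumer-a")
```

Keying by client IP instead of consumer only changes what is passed as `key`; a global limit corresponds to using one constant key for every caller, which is exactly the misconfiguration described above.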

Rate Limit Headers

Clients depend on rate limit headers to implement backoff. Test they're correct.

def test_rate_limit_headers_accurate(client):
    headers = {"X-API-Key": "header-test-key"}
    
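    # Header names vary by gateway; the "-Minute" suffix used here follows
    # Kong-style per-window naming. Adjust to what your gateway emits.
    # First request: remaining should be one less than the limit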
    r1 = client.get("/api/limited", headers=headers)
    limit = int(r1.headers["X-RateLimit-Limit-Minute"])
    remaining_1 = int(r1.headers["X-RateLimit-Remaining-Minute"])
    assert remaining_1 == limit - 1
    
    # Second request: remaining should decrement
    r2 = client.get("/api/limited", headers=headers)
    remaining_2 = int(r2.headers["X-RateLimit-Remaining-Minute"])
    assert remaining_2 == remaining_1 - 1
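
On the client side, backoff usually keys off the `Retry-After` header returned with a 429. That header may carry either a number of seconds or an HTTP date, so a test helper should accept both. The following is a minimal sketch using only the standard library; the function name is an assumption, not part of any gateway SDK.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Convert a Retry-After header value into a non-negative delay in seconds."""
    value = value.strip()
    # Form 1: delay in whole seconds, e.g. "30"
    if value.isdigit():
        return float(value)
    # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    now = now or datetime.now(timezone.utc)
    when = parsedate_to_datetime(value)
    # A date in the past means the client may retry immediately
    return max(0.0, (when - now).total_seconds())


fixed_now = datetime(2015, 10, 21, 7, 27, 0, tzinfo=timezone.utc)
delay_from_seconds = retry_after_seconds("30")
delay_from_date = retry_after_seconds("Wed, 21 Oct 2015 07:28:00 GMT", now=fixed_now)
delay_from_past = retry_after_seconds("Wed, 21 Oct 2015 07:26:00 GMT", now=fixed_now)
```

A test can then assert that the delay the gateway advertises is no longer than the configured window.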

Authentication Testing

Token Validation

import jwt  # PyJWT
from datetime import datetime, timedelta, timezone

def make_jwt(secret, payload=None, expired=False):
    # Expired tokens get an `exp` one hour in the past; valid ones, one hour ahead
    offset = timedelta(hours=-1) if expired else timedelta(hours=1)
    base = {
        "sub": "test-user",
        "iss": "https://auth.example.com",
        "exp": datetime.now(timezone.utc) + offset,
    }
    return jwt.encode({**base, **(payload or {})}, secret, algorithm="HS256")

def test_valid_jwt_passes(client):
    token = make_jwt("test-secret")
    r = client.get("/api/protected", headers={"Authorization": f"Bearer {token}"})
    assert r.status_code == 200

def test_expired_jwt_rejected(client):
    token = make_jwt("test-secret", expired=True)
    r = client.get("/api/protected", headers={"Authorization": f"Bearer {token}"})
    assert r.status_code == 401
    assert "expired" in r.json().get("message", "").lower()

def test_tampered_jwt_rejected(client):
    token = make_jwt("test-secret")
    # Replace the signature so it no longer matches the header and payload
    parts = token.split(".")
    tampered = parts[0] + "." + parts[1] + ".badsignature"
    r = client.get("/api/protected", headers={"Authorization": f"Bearer {tampered}"})
    assert r.status_code == 401

def test_wrong_issuer_rejected(client):
    token = make_jwt("test-secret", payload={"iss": "https://evil.example.com"})
    r = client.get("/api/protected", headers={"Authorization": f"Bearer {token}"})
    assert r.status_code == 401

def test_missing_auth_header_rejected(client):
    r = client.get("/api/protected")
    assert r.status_code == 401
    assert "WWW-Authenticate" in r.headers
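
A complementary tampering case is to change a claim in the payload while leaving the original signature in place, which is closer to what a real attacker would try. The helper below is a sketch using only the standard library; its name and behavior are assumptions for illustration, and it works on any three-segment JWT string.

```python
import base64
import json


def _b64url_decode(segment):
    # JWT segments omit base64 padding, so restore it before decoding
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def tamper_payload(token, **overrides):
    """Rewrite claims in a JWT's payload while keeping the original signature."""
    header, payload, signature = token.split(".")
    claims = json.loads(_b64url_decode(payload))
    claims.update(overrides)
    forged = _b64url_encode(json.dumps(claims, separators=(",", ":")).encode("utf-8"))
    return f"{header}.{forged}.{signature}"


# Build a stand-in token by hand; the signature value is a dummy placeholder
header_seg = _b64url_encode(b'{"alg":"HS256","typ":"JWT"}')
payload_seg = _b64url_encode(b'{"sub":"test-user"}')
original = f"{header_seg}.{payload_seg}.dummy-signature"
forged_token = tamper_payload(original, sub="admin")
forged_claims = json.loads(_b64url_decode(forged_token.split(".")[1]))
```

In a gateway test, `tamper_payload(make_jwt("test-secret"), sub="admin")` should be rejected with a 401, because the signature no longer matches the modified payload.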

Consumer Identity Propagation

After authentication, the gateway typically adds consumer identity headers for upstream services. Verify these are set correctly.

def test_consumer_headers_set_after_auth(client):
    token = make_jwt("test-secret", payload={"sub": "user-123"})
    r = client.get("/api/echo-headers", headers={"Authorization": f"Bearer {token}"})
    
    received = r.json()["headers"]
    # Gateway should add these for upstream
    assert received.get("X-Consumer-Id") == "user-123"
    assert received.get("X-Authenticated-Scope") is not None

Routing Tests

Routing errors are silent and dangerous: the wrong service receives the request and no error is returned.

@pytest.mark.parametrize("path,expected_service", [
    ("/api/v1/users", "user-service"),
    ("/api/v1/orders", "order-service"),
    ("/api/v2/users", "user-service-v2"),
    ("/internal/metrics", "metrics-service"),
])
def test_routing_by_path(client, path, expected_service):
    """Each path routes to the correct upstream service."""
    r = client.get(path, headers={"X-API-Key": "test-key"})
    # Echo service returns which upstream handled it
    assert r.json()["service"] == expected_service

def test_method_routing(client):
    """GET and POST to same path route differently if configured."""
    get_r = client.get("/api/data", headers={"X-API-Key": "test-key"})
    post_r = client.post("/api/data", json={}, headers={"X-API-Key": "test-key"})
    
    assert get_r.json()["method"] == "GET"
    assert post_r.json()["method"] == "POST"
    # Both should succeed — method routing is preserved
    assert get_r.status_code == 200
    assert post_r.status_code in (200, 201)

def test_unknown_path_returns_404(client):
    r = client.get("/api/nonexistent-route", headers={"X-API-Key": "test-key"})
    assert r.status_code == 404

Middleware Ordering Tests

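Middleware/plugin execution order matters. Auth should run before rate limiting so that limits can be keyed per consumer. Logging should run after the response is produced so that it captures status codes. Transformations should run in the right direction: request transformations before the upstream call, response transformations after it.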

def test_auth_before_rate_limiting(client):
    """Unauthenticated requests should get 401, not 429, even when rate limit is exhausted."""
    # Exhaust rate limit for authenticated consumer
    for _ in range(100):  # Way over any reasonable limit
        client.get("/api/limited", headers={"X-API-Key": "limit-exhauster"})
    
    # Unauthenticated request should still get 401, not 429
    r = client.get("/api/limited")
    assert r.status_code == 401  # Auth rejected, rate limiter never ran

def test_request_transformation_applied(client):
    """Gateway adds required headers before forwarding to upstream."""
    r = client.get("/api/echo-headers", headers={"X-API-Key": "test-key"})
    received = r.json()["headers"]
    
    # Gateway should inject these
    assert "X-Gateway-Version" in received
    assert "X-Request-Start" in received  # Timing header for APM
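
Why ordering changes the observable status code can be seen in a toy plugin pipeline. This is a simplified model with hypothetical plugin functions, not any gateway's actual plugin API: each plugin either returns a status code to short-circuit the request or `None` to let it continue.

```python
def authenticate(request):
    """Reject requests without a known API key; otherwise attach the consumer."""
    consumer = {"test-key": "consumer-a"}.get(request.get("api_key"))
    if consumer is None:
        return 401
    request["consumer"] = consumer
    return None


def make_rate_limiter(limit):
    counts = {}

    def rate_limit(request):
        # Without a consumer attached, all callers share one "anonymous" bucket
        key = request.get("consumer", "anonymous")
        counts[key] = counts.get(key, 0) + 1
        return 429 if counts[key] > limit else None

    return rate_limit


def run_pipeline(plugins, request):
    """Run plugins in order; the first one returning a status short-circuits."""
    for plugin in plugins:
        status = plugin(request)
        if status is not None:
            return status
    return 200


auth_first = [authenticate, make_rate_limiter(limit=2)]
limit_first = [make_rate_limiter(limit=2), authenticate]

# Three unauthenticated requests through each ordering
auth_first_statuses = [run_pipeline(auth_first, {}) for _ in range(3)]
limit_first_statuses = [run_pipeline(limit_first, {}) for _ in range(3)]
```

With auth first, every unauthenticated request gets a 401. With the limiter first, the third request gets a 429 instead, which is the symptom the ordering test above is designed to catch.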

Health Check and Circuit Breaker Tests

def test_healthy_upstream_serves_traffic(client):
    r = client.get("/api/users", headers={"X-API-Key": "test-key"})
    assert r.status_code == 200

def test_unhealthy_upstream_triggers_503(client, mock_upstream):
    """When upstream fails health checks, gateway returns 503."""
    mock_upstream.stop()
    
    # Wait for health check detection (depends on your health check interval)
    time.sleep(5)
    
    r = client.get("/api/users", headers={"X-API-Key": "test-key"})
    assert r.status_code == 503
    assert r.headers.get("X-Gateway-Error") == "upstream-unhealthy"

def test_circuit_breaker_opens_on_repeated_failures(client, flaky_upstream):
    """After N failures, circuit breaker opens and fails fast."""
    flaky_upstream.set_failure_rate(1.0)  # Always fail
    
    responses = []
    for _ in range(10):
        r = client.get("/api/fragile", headers={"X-API-Key": "test-key"})
        responses.append(r.status_code)
    
    # Early responses: 502/504 (upstream error)
    # Later responses: 503 with circuit-open header (fast fail)
    circuit_open = any(
        r == 503 
        for r in responses[-3:]  # Check last few
    )
    assert circuit_open, f"Circuit breaker never opened. Responses: {responses}"
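
The sequence of status codes that test expects follows from the breaker's state machine. A minimal model is sketched below, assuming a consecutive-failure threshold and a fixed cooldown; real gateways often use failure ratios or rolling windows, and the class and parameter names here are illustrative.

```python
class CircuitBreaker:
    """Opens after `threshold` consecutive failures; retries after `cooldown` seconds."""

    def __init__(self, threshold, cooldown, clock):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock          # callable returning the current time in seconds
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, upstream):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown:
                return 503          # open: fail fast without calling upstream
            # Half-open: allow one trial request; a single failure reopens
            self.opened_at = None
            self.failures = self.threshold - 1
        status = upstream()
        if status >= 500:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()
        else:
            self.failures = 0
        return status


now = [0.0]  # fake clock
breaker = CircuitBreaker(threshold=3, cooldown=30, clock=lambda: now[0])
statuses = [breaker.call(lambda: 502) for _ in range(5)]
now[0] = 31.0                       # cooldown has elapsed
recovered = breaker.call(lambda: 200)
```

The first three calls reach the failing upstream and return 502; the remaining two are answered with 503 by the breaker itself, matching the "early 502/504, later 503" pattern the test checks for.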

Observability Validation

Metrics

def test_request_metrics_exported(client, prometheus):
    """Gateway exports request count and latency metrics."""
    # Make some requests
    for _ in range(5):
        client.get("/api/users", headers={"X-API-Key": "test-key"})
    
    # Query Prometheus
    metrics = prometheus.query('gateway_requests_total{route="users-route"}')
    # Sample values arrive as strings; convert before comparing numerically
    assert float(metrics[0]["value"][1]) >= 5
    
    latency = prometheus.query('gateway_request_duration_seconds{quantile="0.99"}')
    assert float(latency[0]["value"][1]) < 1.0  # p99 under 1 second

def test_error_metrics_tracked(client, prometheus):
    """4xx and 5xx responses are counted separately."""
    client.get("/api/nonexistent", headers={"X-API-Key": "test-key"})
    
    errors = prometheus.query('gateway_requests_total{status_class="4xx"}')
    assert float(errors[0]["value"][1]) >= 1
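
Counters are cumulative, so an absolute threshold can pass purely because of traffic from earlier tests. Comparing the counter before and after the requests is more robust. The helpers below are a sketch that assumes query results shaped like the ones used above (a list of samples whose `value` is a `[timestamp, "string"]` pair); the function names are hypothetical.

```python
def sample_value(result):
    """Return the first sample of an instant-query result as a float (0.0 if empty)."""
    if not result:
        return 0.0
    return float(result[0]["value"][1])


def counter_delta(before, after):
    """Increase of a counter between two query results."""
    return sample_value(after) - sample_value(before)


# Example results captured before and after sending five requests
before = [{"metric": {"route": "users-route"}, "value": [1700000000.0, "9"]}]
after = [{"metric": {"route": "users-route"}, "value": [1700000015.0, "14"]}]
delta = counter_delta(before, after)
empty = sample_value([])
```

In a test, query once before sending traffic and once after, then assert on `counter_delta` rather than on the raw value. Allow for the scrape interval before the second query.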

Distributed Tracing

def test_trace_headers_propagated(client):
    """Gateway forwards W3C trace context to upstream."""
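    traceparent = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    
    r = client.get(
        "/api/echo-headers",
        headers={
            "X-API-Key": "test-key",
            "traceparent": traceparent
        }
    )
    
    received = r.json()["headers"]
    # A gateway that records its own span may rewrite the parent-id segment,
    # so compare only the trace-id (second field) rather than the whole header
    assert received["Traceparent"].split("-")[1] == traceparent.split("-")[1]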

def test_gateway_adds_trace_if_missing(client):
    """Gateway generates a trace ID when none is provided."""
    r = client.get("/api/echo-headers", headers={"X-API-Key": "test-key"})
    received = r.json()["headers"]
    assert "Traceparent" in received
    assert len(received["Traceparent"]) > 0
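
Checking only that the header is non-empty will not catch a malformed value. A stricter check can be sketched as follows, assuming the version 00 layout of the W3C `traceparent` header (version, trace-id, parent-id, flags, all lowercase hex); the function name is illustrative.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
TRACEPARENT_RE = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parse_traceparent(value):
    """Return (version, trace_id, parent_id, flags), or None if malformed."""
    match = TRACEPARENT_RE.fullmatch(value)
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero trace or parent IDs are invalid
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return version, trace_id, parent_id, flags


parsed = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
rejected_empty = parse_traceparent("")
rejected_zero = parse_traceparent("00-" + "0" * 32 + "-00f067aa0ba902b7-01")
```

The generated-trace test could then assert `parse_traceparent(received["Traceparent"]) is not None`.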

CI Integration

Run gateway integration tests against an ephemeral gateway instance in CI.

# .github/workflows/gateway-tests.yml
name: API Gateway Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    services:
      gateway:
        image: your-gateway:latest
        ports:
          - 8000:8000
          - 8001:8001  # Admin API
        env:
          GATEWAY_CONFIG: /etc/gateway/test.yaml
      
      upstream-echo:
        image: kennethreitz/httpbin
        ports:
          - 8080:80

    steps:
      - uses: actions/checkout@v4

      - name: Wait for gateway
        run: |
          timeout 30 bash -c 'until curl -sf http://localhost:8001/health; do sleep 1; done'

      - name: Apply test configuration
        run: |
          ./scripts/apply-test-config.sh

      - name: Run gateway tests
        run: |
          pip install pytest httpx pyjwt
          pytest tests/gateway/ -v --tb=short

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: gateway-test-results
          path: test-results/

Common Issues and Fixes

Rate limiting not resetting — check that your time window configuration (seconds vs milliseconds) is correct. A common mistake is setting minute=60000 instead of minute=60.

Auth headers not propagating — verify your gateway's strip_authorization_header setting. Some gateways strip the original Authorization header before forwarding; you need to ensure the upstream gets the consumer identity through other headers.

Routes matching in wrong order — gateways evaluate routes in priority order. A catch-all route (/) that appears before specific routes (/api/users) will swallow all traffic. Define catch-alls last.
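
The shadowing effect is easy to reproduce in a few lines. This sketch assumes simple prefix matching with first-match-wins evaluation; actual gateways differ in how they prioritize routes (regex priority, explicit weights, longest match), so treat the function names and matching rule as illustrative.

```python
def first_match(routes, path):
    """Return the upstream of the first route whose prefix matches, in list order."""
    for prefix, upstream in routes:
        if path.startswith(prefix):
            return upstream
    return None


def most_specific_first(routes):
    """Order routes so longer (more specific) prefixes are evaluated first."""
    return sorted(routes, key=lambda route: len(route[0]), reverse=True)


# The catch-all is listed first, so it swallows the specific route
routes = [("/", "catch-all"), ("/api/users", "user-service")]
shadowed = first_match(routes, "/api/users/42")
fixed = first_match(most_specific_first(routes), "/api/users/42")
```

Here `shadowed` resolves to the catch-all, while `fixed` reaches the user service once the specific prefix is evaluated first.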

Timeouts hiding upstream errors — if your gateway timeout is shorter than your upstream's response time, clients see 504s instead of the actual upstream error. Set timeouts appropriately and test both timeout and error scenarios.

Testing API Gateways with HelpMeTest

You can use HelpMeTest to run continuous gateway tests without managing test infrastructure. Write your gateway tests in plain English, and HelpMeTest runs them on a schedule:

Go To https://api.example.com/health
Header Should Be Content-Type application/json
Status Should Be 200

Set Header X-API-Key invalid-key
GET https://api.example.com/protected
Status Should Be 401

Health checks run every 5 minutes and alert you when gateway behavior changes — before users notice.
