Traffic Mirroring with NGINX and Envoy for Dark Launch Testing

Dark launch is the practice of running a new version of a service in production — receiving real requests, doing real work — while still serving responses from the current version. Users see nothing different. Your new service processes every request, you measure its behavior, and only when you are confident does the switch happen. Traffic mirroring at the proxy layer, using NGINX or Envoy, is the infrastructure that makes dark launches possible without any application code changes.

The Dark Launch Pattern

The pattern has three components:

  1. Proxy: a reverse proxy that receives production requests and sends each one to two places — the production service (for real responses) and the shadow service (for validation)
  2. Production service: the current implementation; its responses go to users
  3. Shadow service: the new implementation; its responses are captured for analysis but never sent to users

The shadow receives every request exactly as the production service does. Its responses can be compared to production responses to find divergences. Its logs and metrics can be monitored for error rates, latency, and unexpected behavior. All of this happens without any user impact.
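
The three components can be sketched in a few lines of Python. This is only an illustration of the pattern, not how NGINX or Envoy implement it. MirroringProxy, the handler callables, and the captured list are hypothetical names introduced for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Handler = Callable[[dict], dict]  # takes a request dict, returns a response dict


class MirroringProxy:
    """Toy proxy: users get the production response, the shadow runs on the side."""

    def __init__(self, production: Handler, shadow: Handler):
        self.production = production
        self.shadow = shadow
        self.captured = []  # shadow outcomes, kept for analysis only
        self._pool = ThreadPoolExecutor(max_workers=4)

    def _run_shadow(self, request: dict) -> None:
        # A shadow failure is recorded but must never reach the user.
        try:
            self.captured.append(("ok", self.shadow(dict(request))))
        except Exception as exc:
            self.captured.append(("error", repr(exc)))

    def handle(self, request: dict) -> dict:
        # Send a copy to the shadow in the background, then answer from production.
        self._pool.submit(self._run_shadow, request)
        return self.production(request)

    def drain(self) -> None:
        # Wait for outstanding shadow calls (useful in tests and at shutdown).
        self._pool.shutdown(wait=True)
```

The design point to notice is that handle() returns only what production returned, and an exception in the shadow is captured rather than raised.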

NGINX Mirror Module

NGINX added the ngx_http_mirror_module in version 1.13.4. It is included in the standard build and requires no extra installation.

The configuration is straightforward:

upstream production {
    server production-service:8080;
}

upstream shadow {
    server shadow-service:8080;
}

server {
    listen 80;

    location / {
        # Mirror requests to shadow (asynchronous, response is discarded)
        mirror /shadow;
        mirror_request_body on;

        # Serve responses from production
        proxy_pass http://production;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    # Internal mirror location
    location = /shadow {
        internal;
        proxy_pass http://shadow$request_uri;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Shadow-Mode "true";
        # Bound how long a slow shadow can keep the mirror subrequest open
        proxy_read_timeout 1s;
        proxy_connect_timeout 1s;
    }
}

Key points about the NGINX mirror:

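  • mirror_request_body defaults to on, so POST/PUT bodies are mirrored along with the headers; setting it explicitly documents the intent. While it is on, NGINX reads the full client request body before creating the mirror subrequest.
  • The mirror location is internal, so it cannot be accessed directly from outside
  • The mirror runs as a background subrequest. NGINX does not hold the client response for the shadow's reply, but the original request is not finalized until the mirror subrequest completes, so a slow shadow can delay later requests on the same keep-alive connection and tie up worker connections. Set short timeouts.
  • The shadow response is discarded completely; NGINX does not compare or log it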

To limit how long a slow shadow can hold a request open:

location = /shadow {
    internal;
    proxy_pass http://shadow$request_uri;
    proxy_pass_request_headers on;
    # Do not abort the shadow request when the client disconnects early
    proxy_ignore_client_abort on;
    # Short timeouts cap how long the mirror subrequest can stay open
    proxy_read_timeout 100ms;
    proxy_connect_timeout 100ms;
    proxy_send_timeout 100ms;
}

With 100ms timeouts, NGINX gives up on any shadow call that takes longer than 100ms to connect or to deliver the next part of its response, which makes the mirror effectively fire-and-forget. If the connection was established in time, the shadow still receives the request; NGINX just stops waiting for the reply.

Mirroring Specific Traffic

You do not have to mirror all traffic. Mirror only the routes that matter for your dark launch:

# Only mirror API v2 traffic
location /api/v2/ {
    mirror /shadow-v2;
    mirror_request_body on;
    proxy_pass http://production;
}

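# Mirror only write requests (skip GETs for a write-heavy shadow).
# The mirror directive is not allowed inside "if", so the filter
# lives in the mirror location instead.
location / {
    mirror /shadow-writes;
    mirror_request_body on;
    proxy_pass http://production;
}

location = /shadow-writes {
    internal;
    # $request_method is the method of the original request
    if ($request_method !~ "^(POST|PUT|DELETE)$") {
        return 204;
    }
    proxy_pass http://shadow$request_uri;
}

# Mirror a percentage of traffic (split_clients belongs in the http context)
split_clients $request_id $mirror_target {
    10% shadow;
    *   "";
}

server {
    location / {
        mirror /shadow;
        proxy_pass http://production;
    }

    location = /shadow {
        internal;
        # An empty $mirror_target means this request was not sampled
        if ($mirror_target = "") {
            return 204;
        }
        proxy_pass http://$mirror_target$request_uri;
    }
}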

The split_clients approach is useful for high-volume services where mirroring 100% of traffic would overwhelm the shadow service.
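
To see why hashing the request ID gives a stable percentage, here is a minimal Python sketch of the same idea. NGINX's split_clients hashes the input with MurmurHash2; this sketch swaps in SHA-256 from the standard library, so the buckets will not match NGINX's. should_mirror is a hypothetical helper name.

```python
import hashlib


def should_mirror(request_id: str, percent: float) -> bool:
    """Map request_id to a bucket in [0, 1) and mirror the lowest `percent` of buckets."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < percent / 100.0
```

Because the decision depends only on the ID, the same request always gets the same answer, and raising the percentage only adds requests to the mirrored set without removing any.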

Envoy Request Mirroring

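Envoy Proxy provides request mirroring as a first-class feature in its route configuration. Envoy mirrors are fire-and-forget by default: the proxy returns the production response without waiting for the shadow cluster, and it discards the shadow response. By default Envoy also appends -shadow to the Host/authority header of mirrored requests, which the shadow service can use to recognize them.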

Envoy configuration (YAML):

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8080
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: production_cluster
                  request_mirror_policies:
                  - cluster: shadow_cluster
                    runtime_fraction:
                      default_value:
                        numerator: 100
                        denominator: HUNDRED
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

  clusters:
  - name: production_cluster
    connect_timeout: 5s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: production_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: production-service
                port_value: 8080

  - name: shadow_cluster
    connect_timeout: 5s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: shadow_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: shadow-service
                port_value: 8080

The runtime_fraction: 100/HUNDRED means 100% of requests are mirrored. Change to numerator: 10 for 10% mirroring.
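
For capacity planning it helps to turn the numerator and denominator into an expected shadow load. The sketch below is a small calculator under stated assumptions: the denominator names are the ones Envoy's FractionalPercent type accepts (HUNDRED, TEN_THOUSAND, MILLION), while the function names and the traffic figure in the usage note are made up for illustration.

```python
# Denominator names accepted by Envoy's FractionalPercent type.
DENOMINATORS = {"HUNDRED": 100, "TEN_THOUSAND": 10_000, "MILLION": 1_000_000}


def mirrored_fraction(numerator: int, denominator: str = "HUNDRED") -> float:
    """Fraction of requests mirrored for a given runtime_fraction setting."""
    return numerator / DENOMINATORS[denominator]


def expected_shadow_rps(production_rps: float, numerator: int,
                        denominator: str = "HUNDRED") -> float:
    """Approximate request rate the shadow cluster should be sized for."""
    return production_rps * mirrored_fraction(numerator, denominator)
```

For example, a hypothetical service at 2,000 requests per second with numerator: 10 and denominator: HUNDRED sends roughly 200 requests per second to the shadow. The larger denominators allow fractions below 1%, such as 25 / TEN_THOUSAND for 0.25%.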

Envoy with xDS for Dynamic Mirroring

In Kubernetes environments, you can control Envoy mirroring dynamically through xDS APIs. Using Istio (which uses Envoy as its data plane):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-service
spec:
  hosts:
  - my-service
  http:
  - route:
    - destination:
        host: my-service
        subset: v1
      weight: 100
    mirror:
      host: my-service
      subset: v2
    mirrorPercentage:
      value: 100.0
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: my-service
spec:
  host: my-service
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2

This sends 100% of traffic to v1 (serving real users) while mirroring 100% to v2 (shadow). You can start with a mirrorPercentage value of 10.0 and ramp up as confidence grows.

Capturing and Comparing Shadow Responses

The shadow service receives real requests but its responses go nowhere by default. To capture and compare them, you need to instrument the shadow:

Option 1: Shadow logs to a centralized store

Add a middleware to your shadow service that logs requests and responses to a database or message queue:

import io
import json
import time
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=['kafka:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)

def shadow_log_middleware(app):
    """WSGI middleware that publishes each shadow request and response to Kafka."""
    def wrapper(environ, start_response):
        start = time.time()

        # Read the body once, then hand the app a fresh stream so it can still read it.
        try:
            length = int(environ.get('CONTENT_LENGTH') or 0)
        except ValueError:
            length = 0
        request_body = environ['wsgi.input'].read(length) if length else b''
        environ['wsgi.input'] = io.BytesIO(request_body)

        status_line = [None]

        def capture_start_response(status, headers, exc_info=None):
            status_line[0] = status  # full status line, e.g. "200 OK"
            return start_response(status, headers, exc_info)

        response = app(environ, capture_start_response)
        try:
            response_body = b''.join(response)
        finally:
            # WSGI requires close() to be called on the iterable if it exists.
            if hasattr(response, 'close'):
                response.close()

        producer.send('shadow-responses', {
            'path': environ.get('PATH_INFO'),
            'method': environ.get('REQUEST_METHOD'),
            'request_id': environ.get('HTTP_X_REQUEST_ID'),
            'status': status_line[0],
            'latency_ms': (time.time() - start) * 1000,
            'request_body': request_body.decode('utf-8', errors='replace'),
            'response_body': response_body.decode('utf-8', errors='replace')
        })

        return [response_body]
    return wrapper

Option 2: Capture with Envoy access logging

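The access log on the mirroring proxy describes the production request and response, because the shadow response is discarded there. To log shadow responses, enable the access log on an Envoy that sits in front of the shadow service (for example, its sidecar), where every request is mirrored traffic:

access_log:
- name: envoy.access_loggers.file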
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
    path: /var/log/envoy/shadow-access.log
    log_format:
      json_format:
        start_time: "%START_TIME%"
        request_id: "%REQ(X-REQUEST-ID)%"
        response_code: "%RESPONSE_CODE%"
        upstream_host: "%UPSTREAM_HOST%"

Safety Considerations

Dark launch testing has real safety risks:

Database writes and side effects: If the shadow service writes to the production database or triggers downstream actions, you will end up with duplicate records, duplicate emails, and duplicate payment charges. Always do one of the following:

  • Use a separate shadow database (read from production replica, write to shadow DB)
  • Make shadow writes no-ops (stub out write operations)
  • Use feature flags to disable side effects in shadow mode (if os.getenv('SHADOW_MODE'): return)

External API calls: The shadow service should not call payment processors, email services, or SMS providers with real data. Mock all external services in shadow mode.

Rate limits: If your shadow hits rate-limited external APIs, it will eat into your production rate limit budget.

PII and compliance: Shadow traffic contains real user data. Ensure your shadow logging complies with your data retention and privacy policies.
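
The feature-flag approach to side effects can be made harder to forget by wrapping each side-effecting function in a guard. This is a minimal sketch, assuming the SHADOW_MODE environment variable from the list above; skip_in_shadow, send_receipt_email, and the sent list are hypothetical stand-ins. Unlike a bare os.getenv check, it treats only "1", "true", and "yes" as enabled.

```python
import os
from functools import wraps


def is_shadow_mode() -> bool:
    """True when the process runs as a shadow (assumes a SHADOW_MODE env var)."""
    return os.getenv("SHADOW_MODE", "").lower() in {"1", "true", "yes"}


def skip_in_shadow(default=None):
    """Decorator: in shadow mode, return `default` instead of calling the function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            if is_shadow_mode():
                return default
            return func(*args, **kwargs)
        return wrapper
    return decorator


sent = []  # stand-in for a real email provider


@skip_in_shadow(default=False)
def send_receipt_email(address: str) -> bool:
    sent.append(address)
    return True
```

Checking the flag at call time rather than import time keeps the guard testable and lets the same build run as either production or shadow.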

Measuring Dark Launch Success

The dark launch is done when:

  1. Error rate in shadow matches production (within noise)
  2. Critical response differences < 1% (after noise filtering)
  3. P99 latency in shadow is acceptable
  4. No unexpected warnings or errors in shadow logs
  5. Data written to shadow database is consistent with expected schema
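
These criteria are easier to enforce when they are written down as a check. The sketch below is one possible encoding, not a standard tool: LaunchStats and blocking_reasons are hypothetical names, the 1% difference limit comes from the list above, and the error-rate tolerance and P99 budget are placeholder values to replace with your own.

```python
from dataclasses import dataclass


@dataclass
class LaunchStats:
    prod_error_rate: float       # fraction of production requests that failed
    shadow_error_rate: float     # fraction of shadow requests that failed
    critical_diff_rate: float    # fraction of responses with critical differences
    shadow_p99_ms: float         # shadow P99 latency in milliseconds
    unexpected_log_events: int   # unexpected warnings or errors in shadow logs
    schema_violations: int       # shadow DB rows that do not match the schema


def blocking_reasons(stats: LaunchStats, error_tolerance: float = 0.001,
                     max_diff_rate: float = 0.01,
                     p99_budget_ms: float = 500.0) -> list:
    """Return the criteria that still block the switch; empty means ready."""
    reasons = []
    if stats.shadow_error_rate - stats.prod_error_rate > error_tolerance:
        reasons.append("shadow error rate exceeds production beyond tolerance")
    if stats.critical_diff_rate >= max_diff_rate:
        reasons.append("critical response differences too high")
    if stats.shadow_p99_ms > p99_budget_ms:
        reasons.append("shadow P99 latency over budget")
    if stats.unexpected_log_events > 0:
        reasons.append("unexpected warnings or errors in shadow logs")
    if stats.schema_violations > 0:
        reasons.append("shadow database writes violate expected schema")
    return reasons
```

Returning the list of failed criteria, rather than a single boolean, tells the team what to fix before the next evaluation.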

Run the dark launch for at least 24 hours to cover daily traffic patterns. Then gradually shift real traffic to the new service using a weighted load balancer while keeping the mirror running until you reach 100% new traffic.

Traffic mirroring with NGINX or Envoy is infrastructure-level dark launching: the mirroring itself requires no changes to your application code, works transparently with any backend language, and gives you real production traffic as your test input. The shadow service still needs the side-effect safeguards described above. For any significant service migration or rewrite, it is one of the safest paths to deployment.
