GoReplay: Record and Replay Production HTTP Traffic for Testing
The most realistic test data you will ever have is the traffic that hits your production service right now. Real users make requests with edge-case inputs, unusual character sets, unexpected header combinations, and query patterns that no test author would ever think to write. GoReplay (gor) is an open-source tool that captures live HTTP traffic and replays it — either in real-time as a shadow mirror, or recorded and replayed later for load testing or regression testing. It bridges the gap between synthetic test suites and the complexity of real-world traffic.
What GoReplay Does
GoReplay operates as a passive listener rather than a proxy. In capture mode, it listens on a network interface, much like tcpdump, and records HTTP requests (and optionally responses) to a file or forwards copies of them in real time to another server. Because it sits outside the request path, the production service keeps answering users as usual. In replay mode, it reads recorded traffic and sends it to a target server at a configurable speed.
The key use cases are:
Regression testing: Record one hour of production traffic, deploy a new version to staging, replay that traffic, and compare responses between old and new versions.
Load testing: Record normal traffic, then replay at 10x speed to simulate peak load without needing synthetic scripts.
Dark launch testing: Mirror production traffic to a new implementation while the current service keeps serving users, and compare behavior without any user impact.
Debugging: Capture the exact requests that caused an error and replay them in a development environment.
Installation
GoReplay is a single binary written in Go:
# macOS with Homebrew
brew install gor
# Or download from GitHub releases
wget https://github.com/buger/goreplay/releases/download/v1.3.3/gor_1.3.3_x64.tar.gz
tar xzf gor_1.3.3_x64.tar.gz
chmod +x gor
sudo mv gor /usr/local/bin/
On Linux, GoReplay needs to listen on raw sockets, which requires root or CAP_NET_RAW:
sudo setcap 'cap_net_raw=+ep' /usr/local/bin/gor
# or run with sudo
Basic Usage: Capture and Replay
The simplest workflow: capture traffic from one server, replay to another.
Capture to a file:
# Capture all HTTP traffic on port 8080 and write to file
sudo gor --input-raw :8080 --output-file requests.log
# Capture requests together with their responses
# (--output-file-append keeps writing to one file instead of splitting it into chunks)
sudo gor --input-raw :8080 \
--input-raw-track-response \
--output-file requests.log \
--output-file-append
Replay from a file:
# Replay recorded traffic to staging server
gor --input-file requests.log \
--output-http http://staging.internal:8080
# Replay at 2x speed in a loop (useful for load testing)
# The |200% suffix on the input file sets the speed relative to the recorded timing
gor --input-file "requests.log|200%" \
--output-http http://staging.internal:8080 \
--input-file-loop \
--output-http-workers 10
Real-time mirroring (shadow mode):
# Mirror all live traffic from production to staging
sudo gor --input-raw :8080 \
--output-http http://staging.internal:8080
In real-time mirror mode, GoReplay runs on the production host, copies each request it sees on port 8080, and sends the copy to staging. Production answers the real user as usual, because GoReplay only listens and is not in the request path. The staging response is discarded (or captured for comparison). Do not add the production server as a second --output-http target: every request would then reach production twice.
Filtering Traffic
Production traffic contains everything — health checks, monitoring probes, background jobs, and admin endpoints you do not want replayed. GoReplay has powerful filtering:
# Only replay POST requests
gor --input-file requests.log \
--output-http http://staging.internal:8080 \
--http-allow-method POST
# Only replay requests to specific path prefix
gor --input-file requests.log \
--output-http http://staging.internal:8080 \
--http-allow-url "^/api/v2"
# Exclude health check endpoints
gor --input-file requests.log \
--output-http http://staging.internal:8080 \
--http-disallow-url "^/health|^/metrics"
# Mirror only requests that carry a sampling header (useful for high-volume services)
gor --input-raw :8080 \
--output-http http://staging.internal:8080 \
--input-raw-track-response \
--http-pprof :8181 \
--stats \
--output-http-queue-len 10000 \
--output-http-workers 4 \
--http-allow-header "X-Request-Sample: true"
For rate-based sampling:
# Forward only 10% of requests to staging; production is unaffected
gor --input-raw :8080 \
--output-http "http://staging.internal:8080|10%"
Request Modification
In replay scenarios, you often need to modify requests — change authentication headers, adjust hostnames, or strip PII before storing to disk.
Replace authentication headers:
gor --input-file requests.log \
--output-http http://staging.internal:8080 \
--http-set-header "Authorization: Bearer staging-token-12345"
Rewrite URLs (useful when staging has different path prefixes):
gor --input-file requests.log \
--output-http http://staging.internal:8080 \
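--http-rewrite-url "/api/v1:/api/v2"
Neutralize specific headers (override values that do not apply to staging). Note that --http-disallow-header is a filter: it drops whole requests whose header matches a pattern and does not strip the header. To change a header, overwrite its value with --http-set-header, or remove it in a middleware:
gor --input-file requests.log \
--output-http http://staging.internal:8080 \
--http-set-header "X-Forwarded-For: 127.0.0.1" \
--http-set-header "X-Real-IP: 127.0.0.1"
Middleware for custom transformation: GoReplay supports a middleware protocol where you pipe requests through a custom process for transformation: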
gor --input-file requests.log \
--middleware "./my-transformer" \
--output-http http://staging.internal:8080
The middleware receives each message on stdin as one hex-encoded line and writes the transformed message to stdout in the same form, a simple protocol that works in any language.
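To make the stdin/stdout protocol concrete, here is a minimal Python sketch of a middleware that swaps the Authorization header on requests. It assumes each decoded message starts with a metadata line of the form `type id timestamp`, where type 1 marks a request, followed by the raw HTTP message; check that layout against the GoReplay version you run. The token value and the function names are placeholders, not part of GoReplay.

```python
import binascii
import sys

REQUEST_TYPE = b"1"  # assumption: 1 = request, other types are responses


def set_header(raw_http: bytes, name: bytes, value: bytes) -> bytes:
    """Replace (or add) one header in a raw HTTP/1.x message."""
    head, sep, body = raw_http.partition(b"\r\n\r\n")
    if not sep:
        sep = b"\r\n\r\n"  # message had no blank line; add the terminator
    lines = head.split(b"\r\n")
    start_line, headers = lines[0], lines[1:]
    prefix = name.lower() + b":"
    # Drop any existing copy of the header, then append the new value
    kept = [h for h in headers if not h.lower().startswith(prefix)]
    kept.append(name + b": " + value)
    return b"\r\n".join([start_line] + kept) + sep + body


def transform_line(hex_line: str) -> str:
    """Decode one hex-encoded message, rewrite it if it is a request, re-encode."""
    payload = binascii.unhexlify(hex_line.strip())
    meta, _, raw_http = payload.partition(b"\n")
    if meta.split(b" ")[0] == REQUEST_TYPE:
        # Placeholder token; a real middleware would look this up
        raw_http = set_header(raw_http, b"Authorization", b"Bearer staging-token-12345")
    return binascii.hexlify(meta + b"\n" + raw_http).decode("ascii")


def run(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read messages line by line and emit the transformed versions."""
    for line in stdin:
        if line.strip():
            stdout.write(transform_line(line) + "\n")
            stdout.flush()  # GoReplay reads the output as a stream
```

To use it as ./my-transformer, add a call to `run()` at the end of the script and make the file executable. Messages that are not requests pass through unchanged.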
Response Comparison for Regression Testing
The most powerful GoReplay use case for testing is comparing responses between the current version and a new version. GoReplay itself does not do the comparison, but it is easy to build with its output format.
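As one way to build on that output format, the sketch below parses a captured file into messages and lists request IDs whose original and replayed responses returned different status codes. It rests on several assumptions that should be verified against your GoReplay version: messages are separated by a line containing three monkey emoji, each message begins with a `type id timestamp` line, types 1, 2, and 3 mean request, original response, and replayed response, and a response shares its request's ID. All function and class names here are illustrative.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

SEPARATOR = "\n🐵🙈🙉\n"  # assumed payload separator in GoReplay files
TYPE_NAMES = {"1": "request", "2": "response", "3": "replayed_response"}


@dataclass
class Message:
    kind: str
    id: str
    timestamp: int
    raw_http: str


def parse_messages(text: str) -> List[Message]:
    """Split a capture into messages using the assumed separator and header line."""
    messages = []
    for chunk in text.split(SEPARATOR):
        chunk = chunk.lstrip("\n")
        if not chunk.strip():
            continue
        meta, _, raw_http = chunk.partition("\n")
        fields = meta.split(" ")
        if len(fields) < 3 or fields[0] not in TYPE_NAMES or not fields[2].isdigit():
            continue  # not a header line this sketch understands
        messages.append(Message(TYPE_NAMES[fields[0]], fields[1], int(fields[2]), raw_http))
    return messages


def status_code(raw_http: str) -> Optional[int]:
    """Return the status from a line like 'HTTP/1.1 200 OK', or None for requests."""
    parts = raw_http.split(" ", 2)
    if len(parts) >= 2 and parts[0].startswith("HTTP/") and parts[1].isdigit():
        return int(parts[1])
    return None


def status_mismatches(messages: List[Message]) -> List[Tuple[str, Optional[int], Optional[int]]]:
    """Pair original and replayed responses by ID and report differing statuses."""
    original: Dict[str, Optional[int]] = {}
    replayed: Dict[str, Optional[int]] = {}
    for m in messages:
        if m.kind == "response":
            original[m.id] = status_code(m.raw_http)
        elif m.kind == "replayed_response":
            replayed[m.id] = status_code(m.raw_http)
    return [(i, original[i], replayed[i])
            for i in original if i in replayed and original[i] != replayed[i]]
```

When reading a real capture, open the file with `encoding="utf-8"` and `newline=""` so that the CRLF line endings inside the HTTP messages are preserved.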
The workflow:
- Deploy new version to staging
- Run GoReplay mirroring production traffic to staging
- Capture both production and staging responses
- Compare responses to find regressions
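# Capture requests, production responses, and staging responses in one file
sudo gor --input-raw :8080 \
--input-raw-track-response \
--output-http http://staging.internal:8080 \
--output-http-track-response \
--output-file traffic.log
GoReplay sends every message to every output, so two --output-file flags would not separate production from staging. Instead, each message in traffic.log carries a type (request, original response, or replayed response) and an ID that a request shares with its responses. A comparison script in Python then finds divergences. The script below assumes the capture has already been converted into two JSON Lines files, production-requests.log and staging-requests.log, with one object per line holding id, type, status, body, and url fields; this is not GoReplay's native file format: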
import json


def parse_gor_log(filename):
    """Parse a JSON Lines export into response entries keyed by request ID."""
    entries = {}
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            entry = json.loads(line)
            # Keep only responses; requests have no status or body to compare
            if entry.get('type') == 'response':
                entries[entry.get('id')] = entry
    return entries


prod_responses = parse_gor_log('production-requests.log')
staging_responses = parse_gor_log('staging-requests.log')

divergences = []
for req_id, prod in prod_responses.items():
    staging = staging_responses.get(req_id)
    if staging is None:
        continue  # this request has no staging response
    if prod['status'] != staging['status']:
        divergences.append({
            'id': req_id,
            'issue': 'status_code_mismatch',
            'prod': prod['status'],
            'staging': staging['status']
        })
    elif prod['body'] != staging['body']:
        divergences.append({
            'id': req_id,
            'issue': 'body_mismatch',
            'url': prod.get('url')
        })

print(f"Divergences found: {len(divergences)}")
# Show the first ten divergences in detail
for d in divergences[:10]:
    print(json.dumps(d, indent=2))
Load Testing with Traffic Replay
GoReplay's rate limits and replay speed multipliers make it a capable load testing tool. Unlike synthetic load generators that use artificial request patterns, GoReplay uses real traffic shapes:
# Replay at 10x speed with 50 workers
# |1000% on the input file sets the speed; |1000 on the output caps it at 1000 requests per second
gor --input-file "requests.log|1000%" \
--output-http "http://staging.internal:8080|1000" \
--output-http-workers 50
# Loop indefinitely until stopped (for sustained load tests)
gor --input-file requests.log \
--input-file-loop \
--output-http http://staging.internal:8080 \
--output-http-workers 20 \
--output-http-timeout 5s
Combined with monitoring (Prometheus, Grafana), this provides load test results based on realistic request distributions. It can surface performance regressions that synthetic tests miss because their request mix does not match reality.
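Before starting a replay, it helps to estimate what a given speed multiplier will do to the target. The sketch below is plain arithmetic, not GoReplay behavior: it scales a recorded request rate by a speed percentage, applies an optional cap, and uses Little's law (concurrency is roughly rate times latency) as a rough guide to the worker count. The function names and the example numbers are illustrative assumptions.

```python
import math
from typing import Optional


def effective_rate(recorded_rps: float, speed_percent: float,
                   cap_rps: Optional[float] = None) -> float:
    """Request rate after applying a speed multiplier and an optional cap."""
    rate = recorded_rps * speed_percent / 100.0
    return min(rate, cap_rps) if cap_rps is not None else rate


def replay_duration(recorded_seconds: float, speed_percent: float) -> float:
    """How long the recording takes to replay once at the given speed."""
    return recorded_seconds * 100.0 / speed_percent


def workers_needed(rate_rps: float, avg_latency_ms: float) -> int:
    """Little's law estimate: requests in flight = rate x latency."""
    return math.ceil(rate_rps * avg_latency_ms / 1000.0)


# Example: one hour recorded at 120 requests per second, replayed at 1000%
rate = effective_rate(120, 1000, cap_rps=1000)
print(rate)                                  # capped request rate
print(replay_duration(3600, 1000))           # seconds to replay the hour
print(workers_needed(rate, avg_latency_ms=50))
```

If the estimated worker count is well above the value passed to --output-http-workers, the replay is likely to queue or drop requests rather than reach the intended rate.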
Kubernetes Deployment
In a Kubernetes environment, deploy GoReplay as a sidecar or as a DaemonSet:
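apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  # apps/v1 requires a selector that matches the pod labels; the label is a placeholder
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: my-service
          image: my-service:latest
          ports:
            - containerPort: 8080
        # Sidecar: shares the pod's network namespace, so it sees traffic on port 8080
        - name: gor-mirror
          image: buger/goreplay:latest
          securityContext:
            capabilities:
              add: ["NET_ADMIN", "NET_RAW"]
          args:
            - "--input-raw"
            - ":8080"
            - "--output-http"
            - "http://staging-service.staging.svc.cluster.local:8080"
            # One --http-allow-method flag per allowed method
            - "--http-allow-method"
            - "GET"
            - "--http-allow-method"
            - "POST"
            - "--http-allow-method"
            - "PUT"
            - "--http-allow-method"
            - "DELETE"
            - "--stats"
            - "--output-http-workers"
            - "4"
Handling Authentication and Sessions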
The hardest part of replaying production traffic is authentication. Tokens expire. Session cookies are tied to specific users. You cannot replay last week's traffic with last week's tokens.
Common strategies:
Replace tokens at replay time: Use --http-set-header to replace Authorization headers with a valid staging token. This works for API key authentication and service accounts.
Token mapping: For OAuth tokens, build a mapping from production user IDs to staging test user tokens. The middleware approach lets you look up the mapping and rewrite the token for each request.
Replay-safe endpoints only: For endpoints that do not require authentication (public APIs, read-only endpoints with optional auth), replaying is straightforward.
Anonymous session creation: For session-based auth, create a pool of anonymous staging sessions and randomly assign incoming requests to them.
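The token mapping and session pool strategies can be sketched in a few lines of Python. Everything named here is hypothetical: the user IDs, tokens, and session names are placeholders, and how you obtain the production user ID for a request depends on your own auth scheme. The session picker uses a hash instead of a random choice so that requests from the same production session always land on the same staging session; that is a design choice of this sketch, not something GoReplay provides.

```python
import hashlib
from typing import Dict, List, Optional

# Hypothetical mapping from production user IDs to staging test tokens
TOKEN_MAP = {"user-42": "staging-token-42", "user-77": "staging-token-77"}

# Hypothetical pool of pre-created anonymous staging sessions
SESSION_POOL = ["staging-session-a", "staging-session-b", "staging-session-c"]


def rewrite_authorization(headers: Dict[str, str], prod_user_id: str,
                          token_map: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return headers with a staging token, or None if the user has no mapping."""
    token = token_map.get(prod_user_id)
    if token is None:
        return None  # caller should drop the request instead of replaying it
    rewritten = dict(headers)  # copy so the original headers stay untouched
    rewritten["Authorization"] = "Bearer " + token
    return rewritten


def pick_session(session_key: str, pool: List[str]) -> str:
    """Map a production session key to one staging session, stably."""
    digest = hashlib.sha256(session_key.encode("utf-8")).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]
```

Functions like these would typically run inside a middleware process, which sees each request before it is sent to staging.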
Practical Deployment Safety
A few rules for deploying GoReplay safely:
Never write replayed requests to the production database from staging. The most common footgun: staging has a connection to the production database, and replaying POST/DELETE requests causes real data modifications. Always ensure staging uses an isolated database.
Rate-limit aggressively at first. Start with 1% of traffic mirrored to staging, verify it works, then increase.
Monitor staging resource usage. Sudden traffic mirrors can overwhelm staging if it is under-provisioned.
Filter sensitive endpoints. Payment endpoints, password reset flows, and MFA challenges should be excluded from replay.
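The last rule is easy to get wrong with a hand-written regular expression, so it can be worth testing the pattern before a replay. The sketch below builds an anchored deny pattern from a list of path prefixes and checks sample paths against it. The prefixes are hypothetical examples, and the resulting string is only a candidate value for --http-disallow-url; confirm that it behaves the same way in GoReplay's own regex engine.

```python
import re
from typing import List

# Hypothetical sensitive path prefixes; replace with your own
SENSITIVE_PREFIXES = ["/payments", "/password-reset", "/mfa"]


def build_disallow_pattern(prefixes: List[str]) -> str:
    """Join prefixes into one pattern, each anchored to the start of the path."""
    return "|".join("^" + re.escape(p) for p in prefixes)


def is_replay_safe(path: str, pattern: str) -> bool:
    """True when the path does not match the deny pattern."""
    return re.search(pattern, path) is None


pattern = build_disallow_pattern(SENSITIVE_PREFIXES)
for path in ["/payments/charge", "/password-reset?token=1", "/api/v2/users"]:
    print(path, "safe" if is_replay_safe(path, pattern) else "blocked")
```

Because each prefix is anchored with `^`, a sensitive route mounted under another prefix (for example /api/payments) would not be blocked; list such routes explicitly.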
GoReplay is one of the highest-ROI testing tools available for backend teams. It transforms real production traffic into a continuous test suite that exercises your code with realistic inputs and realistic load patterns, catching regressions and performance issues that no synthetic test suite would find.