WebSocket Load Testing: Tools, Techniques, and Benchmarks
HTTP load testing is well understood. WebSocket load testing is not. Most teams either skip it entirely or discover their WebSocket server collapses at 500 concurrent connections only after a major feature launch.
This guide covers the tools, metrics, and test scenarios that matter for WebSocket performance.
Why WebSocket Load Testing Is Different
HTTP requests are stateless: each request is independent and short-lived, even when the underlying connection is reused. Load testing HTTP mostly means measuring requests per second and response latency.
WebSockets are stateful: connections persist for seconds, minutes, or hours. Load testing them means measuring:
- Concurrent connections sustained — how many connections can the server maintain simultaneously
- Message throughput — messages per second at various concurrency levels
- Message latency — time from client send to server receipt, and server send to client receipt
- Connection setup rate — how many new connections per second the server can accept
- Memory per connection — server memory usage grows linearly with connections; find the slope
- Graceful degradation — what happens above the limit (rejection vs. crash vs. silent queue growth)
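The "find the slope" step for memory per connection can be sketched as a least-squares fit over (connection count, RSS) samples. This is a minimal illustration only: the function name `memoryPerConnection` and the sample readings are hypothetical, not measurements from a real server.

```javascript
// Least-squares fit of RSS (bytes) against connection count.
// The slope approximates memory per connection; the intercept approximates
// the server's baseline footprint with zero connections.
function memoryPerConnection(samples) {
  const n = samples.length;
  if (n < 2) throw new Error('need at least two samples');
  const meanX = samples.reduce((s, p) => s + p.connections, 0) / n;
  const meanY = samples.reduce((s, p) => s + p.rssBytes, 0) / n;
  let num = 0;
  let den = 0;
  for (const p of samples) {
    num += (p.connections - meanX) * (p.rssBytes - meanY);
    den += (p.connections - meanX) ** 2;
  }
  if (den === 0) throw new Error('connection counts must differ');
  const slope = num / den;
  return { bytesPerConnection: slope, baselineBytes: meanY - slope * meanX };
}

// Hypothetical readings taken while holding at each concurrency level.
const fit = memoryPerConnection([
  { connections: 100, rssBytes: 84e6 },
  { connections: 500, rssBytes: 100e6 },
  { connections: 1000, rssBytes: 120e6 },
]);
console.log(fit);
```

If the fitted slope keeps rising as you add higher concurrency levels, growth is no longer linear and the server is probably buffering or leaking per connection.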
Tool Comparison
| Tool | Language | WebSocket Support | Scripting | Best For |
|---|---|---|---|---|
| k6 | JS | Native | Yes | Developer-friendly, CI integration |
| Artillery | JS | Built-in `ws` engine | Yes | Quick setup, scenarios |
| Gatling | Scala | Native | Yes | Java teams, detailed reports |
| wrk (websocket fork) | C | Limited | No | Raw throughput baseline |
| Custom Node.js | JS | ws library | Full control | Complex protocols |
| Locust | Python | geventwebsocket | Yes | Python teams |
k6 WebSocket Load Test
k6 is a common choice for WebSocket load testing in modern stacks: tests are plain JavaScript and run easily in CI.
Basic Connection Ramp Test
// websocket-load.js
import ws from 'k6/ws';
import { check } from 'k6';
import { Counter, Trend, Rate } from 'k6/metrics';
const messagesReceived = new Counter('ws_messages_received');
const messageLatency = new Trend('ws_message_latency_ms');
const connectionErrors = new Rate('ws_connection_errors');
export const options = {
  stages: [
    { duration: '30s', target: 100 },  // Ramp to 100 connections
    { duration: '1m', target: 100 },   // Hold at 100
    { duration: '30s', target: 500 },  // Ramp to 500
    { duration: '2m', target: 500 },   // Hold at 500
    { duration: '30s', target: 1000 }, // Ramp to 1000
    { duration: '2m', target: 1000 },  // Hold at 1000
    { duration: '30s', target: 0 },    // Ramp down
  ],
  thresholds: {
    'ws_connection_errors': ['rate<0.01'],   // < 1% connection errors
    'ws_message_latency_ms': ['p(95)<200'],  // 95th percentile < 200ms
    'ws_messages_received': ['count>1000'],  // Must receive messages
  },
};
export default function () {
  const url = 'ws://your-server.com/ws';
  const params = { headers: { Authorization: 'Bearer test-token' } };
  let errored = false;
  const response = ws.connect(url, params, function (socket) {
    socket.on('open', () => {
      // Subscribe to a channel on connect
      socket.send(JSON.stringify({ type: 'subscribe', channel: 'updates' }));
    });
    socket.on('message', (data) => {
      messagesReceived.add(1);
      let msg;
      try {
        msg = JSON.parse(data);
      } catch (e) {
        return; // Ignore non-JSON frames
      }
      // Assumes the server echoes the client's ping timestamp back; a
      // server-generated timestamp would add clock skew between hosts.
      if (msg.timestamp) {
        messageLatency.add(Date.now() - msg.timestamp);
      }
    });
    socket.on('error', (e) => {
      errored = true;
      console.error('WebSocket error:', e.error());
    });
    socket.on('close', () => {
      // Connection closed; k6 records the session duration automatically
    });
    // Send periodic pings to keep connection alive and measure latency
    socket.setInterval(() => {
      socket.send(JSON.stringify({ type: 'ping', timestamp: Date.now() }));
    }, 5000);
    // Hold the connection for 60 seconds, then close it
    socket.setTimeout(() => {
      socket.close();
    }, 60000);
  });
  const connected = check(response, {
    'Connected successfully': (r) => r && r.status === 101,
  });
  // One sample per attempt, so the Rate is failed attempts / total attempts
  connectionErrors.add(errored || !connected);
}
Run it:
k6 run --out json=results.json websocket-load.js
Measuring Message Throughput
// message-throughput.js
import ws from 'k6/ws';
import { Counter } from 'k6/metrics';
const sent = new Counter('messages_sent');
const received = new Counter('messages_received');
// A Counter rather than a Rate: errors are only recorded when they occur
const errors = new Counter('message_errors');
export const options = {
  vus: 100,
  duration: '2m',
};
export default function () {
  ws.connect('ws://localhost:8080/ws', {}, function (socket) {
    socket.on('open', () => {
      // High-frequency message sending
      socket.setInterval(() => {
        socket.send(JSON.stringify({
          type: 'data',
          payload: 'x'.repeat(256), // 256 byte payload
          timestamp: Date.now(),
        }));
        sent.add(1);
      }, 100); // 10 messages/second per connection
    });
    socket.on('message', () => {
      received.add(1);
    });
    socket.on('error', () => {
      errors.add(1);
    });
    socket.setTimeout(() => socket.close(), 60000);
  });
}
At 100 VUs sending 10 msg/s each, the target throughput is 1,000 messages/second. Compare messages_received against messages_sent to find the server's saturation point; this assumes the server echoes or acknowledges every message.
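One way to turn the sent-versus-received comparison into a concrete check is to repeat the throughput test at several VU counts and look for the first level where the delivery ratio drops. The sketch below assumes the server echoes each message; `findSaturationPoint`, the 99% cutoff, and the input shape are all hypothetical and not part of k6.

```javascript
// Given sent/received totals per concurrency level, return the first level
// (lowest VU count) whose delivery ratio falls below minRatio, or null if
// every level stays at or above it.
function findSaturationPoint(runs, minRatio = 0.99) {
  const ordered = [...runs].sort((a, b) => a.vus - b.vus);
  for (const run of ordered) {
    const ratio = run.sent === 0 ? 0 : run.received / run.sent;
    if (ratio < minRatio) return { vus: run.vus, ratio };
  }
  return null;
}
```

Feed it the `messages_sent` and `messages_received` totals from each run's summary. A `null` result means the server kept up at every level you tested, so the real limit lies above your highest VU count.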
Artillery WebSocket Scenarios
Artillery is faster to configure for scenario-based tests:
# artillery-ws.yaml
config:
  target: "ws://localhost:8080"
  phases:
    - duration: 60
      arrivalRate: 10    # Start at 10 new connections/second
      rampTo: 100        # Ramp to 100 new connections/second
      name: "Ramp up"
    - duration: 300
      arrivalRate: 100   # Hold at 100 new connections/second
      name: "Sustained load"
  ws:
    rejectUnauthorized: false
  plugins:
    metrics-by-endpoint: {}
scenarios:
  - name: "Chat room simulation"
    engine: "ws"
    flow:
      - send: '{"type":"join","room":"load-test-room"}'
      - think: 1
      - loop:
          - send: '{"type":"message","text":"Hello from load test","timestamp":{{ $timestamp }}}'
          - think: 2
        count: 10
      - send: '{"type":"leave"}'
artillery run artillery-ws.yaml --output report.json
artillery report report.json
Custom Node.js Load Tester
When you need protocol-specific logic that tools can't handle:
// custom-ws-load.js
const WebSocket = require('ws');
const TARGET = 'ws://localhost:8080/ws';
const CONCURRENCY = 500;
const DURATION_MS = 60000;
const MESSAGE_INTERVAL_MS = 1000;
const stats = {
  connected: 0,
  failed: 0,
  messagesSent: 0,
  messagesReceived: 0,
  latencies: [],
};
async function createConnection(id) {
  return new Promise((resolve) => {
    const ws = new WebSocket(TARGET, {
      headers: { 'X-Client-Id': `load-client-${id}` },
    });
    let interval;
    // Give up if the handshake has not completed within 5 s
    const openTimer = setTimeout(() => {
      if (ws.readyState === WebSocket.CONNECTING) ws.terminate(); // Emits 'error'
      resolve(null);
    }, 5000);
    ws.on('open', () => {
      clearTimeout(openTimer);
      stats.connected++;
      // Send periodic messages
      interval = setInterval(() => {
        if (ws.readyState === WebSocket.OPEN) {
          const msg = JSON.stringify({ id, ts: Date.now(), seq: stats.messagesSent });
          ws.send(msg);
          stats.messagesSent++;
        }
      }, MESSAGE_INTERVAL_MS);
      resolve(ws);
    });
    ws.on('message', (data) => {
      stats.messagesReceived++;
      try {
        // Assumes the server echoes each message back with its ts field
        const msg = JSON.parse(data);
        if (msg.ts) {
          stats.latencies.push(Date.now() - msg.ts);
        }
      } catch {}
    });
    ws.on('error', () => {
      clearTimeout(openTimer);
      stats.failed++;
      resolve(null);
    });
    ws.on('close', () => {
      clearInterval(interval);
    });
  });
}
function percentile(arr, p) {
  if (arr.length === 0) return 0;
  const sorted = [...arr].sort((a, b) => a - b);
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[idx];
}
async function runLoadTest() {
  console.log(`Starting load test: ${CONCURRENCY} connections, ${DURATION_MS / 1000}s`);
  // Ramp up connections in batches
  const connections = [];
  const BATCH_SIZE = 50;
  for (let i = 0; i < CONCURRENCY; i += BATCH_SIZE) {
    const batch = Math.min(BATCH_SIZE, CONCURRENCY - i);
    const batchConnections = await Promise.all(
      Array.from({ length: batch }, (_, j) => createConnection(i + j))
    );
    connections.push(...batchConnections.filter(Boolean));
    console.log(`Connected: ${stats.connected}/${CONCURRENCY}`);
    await new Promise(r => setTimeout(r, 100)); // Small delay between batches
  }
  console.log(`Peak connections: ${stats.connected}`);
  // Hold for duration
  await new Promise(r => setTimeout(r, DURATION_MS));
  // Collect final stats
  const p50 = percentile(stats.latencies, 50);
  const p95 = percentile(stats.latencies, 95);
  const p99 = percentile(stats.latencies, 99);
  console.log('\n=== RESULTS ===');
  console.log(`Connections: ${stats.connected} successful, ${stats.failed} failed`);
  console.log(`Messages sent: ${stats.messagesSent}`);
  console.log(`Messages recv: ${stats.messagesReceived}`);
  console.log(`Throughput: ${(stats.messagesReceived / (DURATION_MS / 1000)).toFixed(1)} msg/s`);
  console.log(`Latency p50: ${p50}ms`);
  console.log(`Latency p95: ${p95}ms`);
  console.log(`Latency p99: ${p99}ms`);
  // Close all connections
  connections.forEach(ws => ws?.close());
}
runLoadTest().catch(console.error);
Key Metrics to Monitor on the Server
During load tests, monitor these on the server side:
# Linux: open file descriptors (each WebSocket = 1 fd)
ls /proc/$(pidof node)/fd | wc -l
# Process memory (RSS); divide its growth by the connection count for per-connection cost
watch -n1 'ps -o pid,rss,vsz -p $(pidof node)'
# Network connections
ss -s  # Summary of socket states
# Node.js: connections via app metrics endpoint (matches the ws_* metric names)
curl http://localhost:3000/metrics | grep ws_
Prometheus Metrics to Track
// In your WebSocket server
const client = require('prom-client');
const activeConnections = new client.Gauge({
  name: 'ws_active_connections',
  help: 'Currently active WebSocket connections',
});
const messageRate = new client.Counter({
  name: 'ws_messages_total',
  help: 'Total WebSocket messages processed',
  labelNames: ['direction'],
});
const connectionDuration = new client.Histogram({
  name: 'ws_connection_duration_seconds',
  help: 'WebSocket connection duration',
  buckets: [1, 5, 30, 60, 300, 600, 3600],
});
Interpreting Results
What "Good" Looks Like
| Metric | Target |
|---|---|
| Connection success rate | > 99.9% |
| Message delivery rate | > 99.9% |
| p95 latency | < 100ms (same region) |
| Memory per connection | < 50KB for typical apps |
| CPU at peak load | < 70% to leave headroom |
Common Failure Patterns
File descriptor exhaustion:
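Error: EMFILE: too many open files
Fix: run ulimit -n 65535 in the shell that starts the server, or raise the nofile limit in /etc/security/limits.conf or the service's systemd unit. If the system-wide cap is the bottleneck, also raise fs.file-max in /etc/sysctl.conf.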
Memory leak (connections not cleaned up): Memory grows linearly with connection count but doesn't decrease after connections close. Look for event listener leaks or unclosed database connections per WebSocket session.
Thundering herd on reconnect: All clients reconnect simultaneously after a server restart. Add exponential backoff with jitter on the client side:
function reconnect(attempt = 0) {
  const delay = Math.min(1000 * Math.pow(2, attempt), 30000) + Math.random() * 1000;
  setTimeout(connect, delay);
}
Message queue buildup: Server processes messages slower than they arrive. ws.bufferedAmount grows without bound. Set a max buffer size and close slow connections.
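The "max buffer size" rule can be sketched as a guard around every outbound send. The helper name `sendOrDrop` and the 1 MiB cap are hypothetical; the sketch only assumes the socket exposes `bufferedAmount`, `send()` and `terminate()`, which WebSocket objects from the `ws` package do.

```javascript
// Hypothetical cap on queued outbound bytes per connection; tune per app.
const MAX_BUFFERED_BYTES = 1024 * 1024;

// Send only while the socket's outbound buffer is under the cap.
// A socket over the cap is treated as a slow consumer and dropped, which
// frees its queued data instead of letting the queue grow further.
function sendOrDrop(socket, payload, maxBuffered = MAX_BUFFERED_BYTES) {
  if (socket.bufferedAmount > maxBuffered) {
    socket.terminate();
    return false;
  }
  socket.send(payload);
  return true;
}
```

`terminate()` is used here rather than `close()` because a close frame would be queued behind the data that is already backed up. Whether dropping the client is acceptable depends on the application; shedding individual messages is a gentler alternative.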
Benchmarking Template
Use this as a baseline benchmark before making architectural changes:
// benchmark-baseline.js: run before and after changes
// Options only; reuse the default function and custom metrics from websocket-load.js
export const options = {
  scenarios: {
    connections_100: {
      executor: 'constant-vus',
      vus: 100,
      duration: '60s',
      tags: { scenario: '100_conns' },
    },
    connections_500: {
      executor: 'constant-vus',
      vus: 500,
      duration: '60s',
      startTime: '90s',
      tags: { scenario: '500_conns' },
    },
    connections_1000: {
      executor: 'constant-vus',
      vus: 1000,
      duration: '60s',
      startTime: '180s',
      tags: { scenario: '1000_conns' },
    },
  },
  thresholds: {
    'ws_message_latency_ms{scenario:100_conns}': ['p(95)<50'],
    'ws_message_latency_ms{scenario:500_conns}': ['p(95)<100'],
    'ws_message_latency_ms{scenario:1000_conns}': ['p(95)<200'],
  },
};
Record results in your repo so you can spot regressions:
# results/ws-baseline-2026-05-24.txt
Scenario     Connections  p50    p95    p99    Errors
100_conns    100          12ms   28ms   45ms   0
500_conns    500          18ms   67ms   92ms   0
1000_conns   1000         31ms   142ms  201ms  3
WebSocket load testing reveals limits you can't find any other way. Run these tests before every major release that touches your real-time infrastructure.
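Recorded baselines become more useful when a script compares each new run against them. The sketch below flags scenarios whose p95 latency grew by more than a tolerance. `findRegressions`, the 10% tolerance, and the "current" figures are hypothetical; only the baseline p95 values come from the results table above.

```javascript
// Compare a new run against a stored baseline and list scenarios whose p95
// latency grew by more than `tolerance` (0.10 means 10%).
function findRegressions(baseline, current, tolerance = 0.10) {
  const regressions = [];
  for (const [scenario, base] of Object.entries(baseline)) {
    const now = current[scenario];
    if (!now) continue; // Scenario missing from the new run
    if (now.p95 > base.p95 * (1 + tolerance)) {
      regressions.push({ scenario, baselineP95: base.p95, currentP95: now.p95 });
    }
  }
  return regressions;
}

// Baseline p95 values (ms) from the recorded results; the current run is invented.
const baseline = { '100_conns': { p95: 28 }, '500_conns': { p95: 67 }, '1000_conns': { p95: 142 } };
const current = { '100_conns': { p95: 29 }, '500_conns': { p95: 81 }, '1000_conns': { p95: 150 } };
console.log(findRegressions(baseline, current));
```

A relative tolerance absorbs run-to-run noise. Very small baselines may also need an absolute floor in milliseconds, since a shift of a few milliseconds can exceed 10% without meaning much.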