WebSocket Load Testing: Tools, Techniques, and Benchmarks

HTTP load testing is well understood. WebSocket load testing is not. Most teams either skip it entirely or discover their WebSocket server collapses at 500 concurrent connections only after a major feature launch.

This guide covers the tools, metrics, and test scenarios that matter for WebSocket performance.

Why WebSocket Load Testing Is Different

HTTP requests are stateless: each request is sent, answered, and finished, even when the underlying connection is reused. Load testing HTTP mostly means measuring requests per second and response latency.

WebSockets are stateful — connections persist for seconds, minutes, or hours. Load testing means measuring:

  • Concurrent connections sustained — how many connections can the server maintain simultaneously
  • Message throughput — messages per second at various concurrency levels
  • Message latency — time from client send to server receipt, and server send to client receipt
  • Connection setup rate — how many new connections per second the server can accept
  • Memory per connection — server memory usage grows linearly with connections; find the slope
  • Graceful degradation — what happens above the limit (rejection vs. crash vs. silent queue growth)
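
To make "find the slope" concrete, here is a minimal sketch that fits a least-squares line through memory readings taken while holding at different connection counts. The sample shape (`connections`, `rssBytes`), the function name, and the readings themselves are illustrative assumptions, not measurements from a real server.

```javascript
// Least-squares fit of process memory (bytes) against connection count.
// samples: [{ connections, rssBytes }, ...] (hypothetical shape)
function memoryPerConnection(samples) {
  if (samples.length < 2) throw new Error('need at least two samples');
  const n = samples.length;
  const meanX = samples.reduce((sum, p) => sum + p.connections, 0) / n;
  const meanY = samples.reduce((sum, p) => sum + p.rssBytes, 0) / n;

  let numerator = 0;
  let denominator = 0;
  for (const p of samples) {
    numerator += (p.connections - meanX) * (p.rssBytes - meanY);
    denominator += (p.connections - meanX) ** 2;
  }
  if (denominator === 0) {
    throw new Error('samples must cover different connection counts');
  }

  const slope = numerator / denominator;
  // The intercept approximates the server's memory with zero connections
  return { bytesPerConnection: slope, baselineBytes: meanY - slope * meanX };
}

// Made-up readings, one per hold stage of a ramp test
const fit = memoryPerConnection([
  { connections: 100, rssBytes: 84_000_000 },
  { connections: 500, rssBytes: 100_000_000 },
  { connections: 1000, rssBytes: 120_000_000 },
]);
console.log(`${(fit.bytesPerConnection / 1024).toFixed(1)} KB per connection`);
```

Take the readings during the hold stages rather than the ramps, so garbage collection and connection setup have less effect on each sample.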

Tool Comparison

Tool                  Language  WebSocket Support  Scripting     Best For
k6                    JS        Native             Yes           Developer-friendly, CI integration
Artillery             JS        Plugin             Yes           Quick setup, scenarios
Gatling               Scala     Native             Yes           Java teams, detailed reports
wrk (websocket fork)  C         Limited            No            Raw throughput baseline
Custom Node.js        JS        ws library         Full control  Complex protocols
locust                Python    geventwebsocket    Yes           Python teams

k6 WebSocket Load Test

k6 is a common choice for WebSocket load testing in modern stacks.

Basic Connection Ramp Test

// websocket-load.js
import ws from 'k6/ws';
import { check } from 'k6';
import { Counter, Trend, Rate } from 'k6/metrics';

const messagesReceived = new Counter('ws_messages_received');
const messageLatency = new Trend('ws_message_latency_ms');
const connectionErrors = new Rate('ws_connection_errors');

export const options = {
  stages: [
    { duration: '30s', target: 100 },   // Ramp to 100 connections
    { duration: '1m', target: 100 },    // Hold at 100
    { duration: '30s', target: 500 },   // Ramp to 500
    { duration: '2m', target: 500 },    // Hold at 500
    { duration: '30s', target: 1000 },  // Ramp to 1000
    { duration: '2m', target: 1000 },   // Hold at 1000
    { duration: '30s', target: 0 },     // Ramp down
  ],
  thresholds: {
    'ws_connection_errors': ['rate<0.01'],           // < 1% connection errors
    'ws_message_latency_ms': ['p95<200'],            // 95th percentile < 200ms
    'ws_messages_received': ['count>1000'],          // Must receive messages
  },
};

export default function () {
  const url = 'ws://your-server.com/ws';
  const params = { headers: { Authorization: 'Bearer test-token' } };

  const response = ws.connect(url, params, function (socket) {
    socket.on('open', () => {
      // Subscribe to a channel on connect
      socket.send(JSON.stringify({ type: 'subscribe', channel: 'updates' }));
    });

    socket.on('message', (data) => {
      messagesReceived.add(1);

      const msg = JSON.parse(data);
      if (msg.timestamp) {
        const latency = Date.now() - msg.timestamp;
        messageLatency.add(latency);
      }
    });

    socket.on('error', (error) => {
      connectionErrors.add(1);
      console.error('WebSocket error:', error);
    });

    socket.on('close', () => {
      // Connection closed — recorded by k6 automatically
    });

    // Send periodic pings to keep connection alive and measure latency
    socket.setInterval(() => {
      socket.send(JSON.stringify({ type: 'ping', timestamp: Date.now() }));
    }, 5000);

    // Hold connection for the virtual user's lifetime
    socket.setTimeout(() => {
      socket.close();
    }, 60000);
  });

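  // ws.connect blocks until the socket closes; status 101 means the upgrade succeeded
  const connected = check(response, {
    'Connected successfully': (r) => r && r.status === 101,
  });

  // A Rate needs successes recorded too, otherwise a single error reads as 100%
  connectionErrors.add(!connected);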
}

Run it:

k6 run --out json=results.json websocket-load.js

Measuring Message Throughput

// message-throughput.js
import ws from 'k6/ws';
import { Counter } from 'k6/metrics';

const sent = new Counter('messages_sent');
const received = new Counter('messages_received');
// Counter, not Rate: only failures are recorded here, so a Rate would always read 100%
const errors = new Counter('message_errors');

export const options = {
  vus: 100,
  duration: '2m',
};

export default function () {
  ws.connect('ws://localhost:8080/ws', {}, function (socket) {
    socket.on('open', () => {
      // High-frequency message sending
      socket.setInterval(() => {
        socket.send(JSON.stringify({
          type: 'data',
          payload: 'x'.repeat(256), // 256 byte payload
          timestamp: Date.now(),
        }));
        sent.add(1);
      }, 100); // 10 messages/second per connection
    });

    socket.on('message', () => {
      received.add(1);
    });

    socket.on('error', () => {
      errors.add(1);
    });

    socket.setTimeout(() => socket.close(), 60000);
  });
}

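With 100 VUs each sending 10 msg/s, the target throughput is 1,000 messages/second. Compare messages_received against messages_sent to find the server's saturation point. The comparison is only meaningful if the server replies once per message (for example, an echo).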
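
One way to locate the saturation point is to repeat the throughput test at increasing VU counts and look for the first level where the received/sent ratio drops. The sketch below assumes you have collected the two counters per run into plain objects. The field names, the 99% cutoff, and the sample numbers are illustrative, not k6 output.

```javascript
// runs: one entry per test run, e.g. { vus: 100, sent: 120000, received: 120000 }
// Returns the lowest VU level whose delivery ratio falls below the cutoff,
// or null if every run stayed above it.
function findSaturationPoint(runs, minDeliveryRatio = 0.99) {
  const ordered = [...runs].sort((a, b) => a.vus - b.vus);
  for (const run of ordered) {
    if (run.sent === 0) continue; // nothing sent, nothing to judge
    const ratio = run.received / run.sent;
    if (ratio < minDeliveryRatio) return { vus: run.vus, ratio };
  }
  return null;
}

// Made-up results from three runs of the throughput test
const saturation = findSaturationPoint([
  { vus: 100, sent: 120000, received: 120000 },
  { vus: 200, sent: 240000, received: 239000 },
  { vus: 400, sent: 480000, received: 410000 },
]);
console.log(saturation ? `Saturates near ${saturation.vus} VUs` : 'No saturation observed');
```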

Artillery WebSocket Scenarios

Artillery is faster to configure for scenario-based tests:

# artillery-ws.yaml
config:
  target: "ws://localhost:8080"
  phases:
    - duration: 60
      arrivalRate: 10        # 10 new connections/second
      rampTo: 100
      name: "Ramp up"
    - duration: 300
      arrivalRate: 100       # Hold at 100 connections/second
      name: "Sustained load"
  ws:
    rejectUnauthorized: false
  plugins:
    metrics-by-endpoint: {}

scenarios:
  - name: "Chat room simulation"
    engine: "ws"
    flow:
      # The ws engine sends the value of `send` as the message itself;
      # channel/data is Socket.IO emit syntax and does not apply here
      - send: '{"type":"join","room":"load-test-room"}'
      - think: 1
      - loop:
          - send: '{"type":"message","text":"Hello from load test","timestamp":{{ $timestamp }}}'
          - think: 2
        count: 10
      - send: '{"type":"leave"}'

Run it:

artillery run artillery-ws.yaml --output report.json
artillery report report.json

Custom Node.js Load Tester

When you need protocol-specific logic that tools can't handle:

// custom-ws-load.js
const WebSocket = require('ws');

const TARGET = 'ws://localhost:8080/ws';
const CONCURRENCY = 500;
const DURATION_MS = 60000;
const MESSAGE_INTERVAL_MS = 1000;

const stats = {
  connected: 0,
  failed: 0,
  messagesSent: 0,
  messagesReceived: 0,
  latencies: [],
};

async function createConnection(id) {
  return new Promise((resolve) => {
    const ws = new WebSocket(TARGET, {
      headers: { 'X-Client-Id': `load-client-${id}` },
    });

    let interval;

    ws.on('open', () => {
      stats.connected++;

      // Send periodic messages
      interval = setInterval(() => {
        if (ws.readyState === WebSocket.OPEN) {
          const msg = JSON.stringify({ id, ts: Date.now(), seq: stats.messagesSent });
          ws.send(msg);
          stats.messagesSent++;
        }
      }, MESSAGE_INTERVAL_MS);

      resolve(ws);
    });

    ws.on('message', (data) => {
      stats.messagesReceived++;
      try {
        const msg = JSON.parse(data);
        if (msg.ts) {
          stats.latencies.push(Date.now() - msg.ts);
        }
      } catch {}
    });

    ws.on('error', () => {
      stats.failed++;
      resolve(null);
    });

    ws.on('close', () => {
      clearInterval(interval);
    });

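    // Give up if the connection has not opened within 5 seconds.
    // terminate() on a connecting socket fires the 'error' handler above,
    // which records the failure and resolves the promise with null.
    setTimeout(() => {
      if (ws.readyState === WebSocket.CONNECTING) ws.terminate();
    }, 5000);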
  });
}

function percentile(arr, p) {
  if (arr.length === 0) return 0;
  const sorted = [...arr].sort((a, b) => a - b);
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[idx];
}

async function runLoadTest() {
  console.log(`Starting load test: ${CONCURRENCY} connections, ${DURATION_MS / 1000}s`);

  // Ramp up connections in batches
  const connections = [];
  const BATCH_SIZE = 50;

  for (let i = 0; i < CONCURRENCY; i += BATCH_SIZE) {
    const batch = Math.min(BATCH_SIZE, CONCURRENCY - i);
    const batchConnections = await Promise.all(
      Array.from({ length: batch }, (_, j) => createConnection(i + j))
    );
    connections.push(...batchConnections.filter(Boolean));
    console.log(`Connected: ${stats.connected}/${CONCURRENCY}`);
    await new Promise(r => setTimeout(r, 100)); // Small delay between batches
  }

  console.log(`Peak connections: ${stats.connected}`);

  // Hold for duration
  await new Promise(r => setTimeout(r, DURATION_MS));

  // Collect final stats
  const p50 = percentile(stats.latencies, 50);
  const p95 = percentile(stats.latencies, 95);
  const p99 = percentile(stats.latencies, 99);

  console.log('\n=== RESULTS ===');
  console.log(`Connections:      ${stats.connected} successful, ${stats.failed} failed`);
  console.log(`Messages sent:    ${stats.messagesSent}`);
  console.log(`Messages recv:    ${stats.messagesReceived}`);
  console.log(`Throughput:       ${(stats.messagesReceived / (DURATION_MS / 1000)).toFixed(1)} msg/s`);
  console.log(`Latency p50:      ${p50}ms`);
  console.log(`Latency p95:      ${p95}ms`);
  console.log(`Latency p99:      ${p99}ms`);

  // Close all connections
  connections.forEach(ws => ws?.close());
}

runLoadTest().catch(console.error);

Key Metrics to Monitor on the Server

During load tests, monitor these on the server side:

# Linux: open file descriptors (each WebSocket = 1 fd)
ls /proc/$(pidof node)/fd | wc -l

# Process memory (RSS in KB); track its growth against connection count
watch -n1 'ps -o pid,rss,vsz -p $(pidof node)'

# Network connections
ss -s  # Summary of socket states

# Node.js: connections via app metrics endpoint
curl http://localhost:3000/metrics | grep websocket

Prometheus Metrics to Track

// In your WebSocket server
const client = require('prom-client');

const activeConnections = new client.Gauge({
  name: 'ws_active_connections',
  help: 'Currently active WebSocket connections',
});

const messageRate = new client.Counter({
  name: 'ws_messages_total',
  help: 'Total WebSocket messages processed',
  labelNames: ['direction'],
});

const connectionDuration = new client.Histogram({
  name: 'ws_connection_duration_seconds',
  help: 'WebSocket connection duration',
  buckets: [1, 5, 30, 60, 300, 600, 3600],
});
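
The metrics above only help once they are updated from connection events. A minimal sketch of that wiring follows, assuming a `ws`-style server that emits 'connection' and sockets that emit 'message' and 'close'. The `instrumentServer` name is hypothetical; the metric objects are passed in so the function does not depend on how they were created.

```javascript
// Wire connection lifecycle events to the three metrics.
// wss: a server emitting 'connection' with a socket (e.g. a ws WebSocketServer)
// metrics: { activeConnections, messageRate, connectionDuration }
function instrumentServer(wss, metrics) {
  const { activeConnections, messageRate, connectionDuration } = metrics;

  wss.on('connection', (socket) => {
    const startedAt = Date.now();
    activeConnections.inc();

    // Inbound messages only; count outbound ones wherever your code calls send()
    socket.on('message', () => messageRate.inc({ direction: 'in' }));

    socket.on('close', () => {
      activeConnections.dec();
      connectionDuration.observe((Date.now() - startedAt) / 1000);
    });
  });
}

// Usage (assumes the `ws` package and the metrics defined above):
//   const { WebSocketServer } = require('ws');
//   const wss = new WebSocketServer({ port: 8080 });
//   instrumentServer(wss, { activeConnections, messageRate, connectionDuration });
```

Decrementing the gauge in the 'close' handler matters: if it only ever goes up, the dashboard hides exactly the cleanup problems a load test is meant to expose.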

Interpreting Results

What "Good" Looks Like

Metric                   Target
Connection success rate  > 99.9%
Message delivery rate    > 99.9%
p95 latency              < 100ms (same region)
Memory per connection    < 50KB for typical apps
CPU at peak load         < 70% to leave headroom
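
These targets can be checked automatically at the end of a run. The sketch below encodes the table as predicates and lists which ones a result misses. The result field names and the sample values are assumptions for illustration; map them to whatever your tool actually reports.

```javascript
// Targets from the table above, expressed as pass/fail predicates
const TARGETS = {
  connectionSuccessRate: (v) => v > 0.999, // fraction, not percent
  messageDeliveryRate: (v) => v > 0.999,
  p95LatencyMs: (v) => v < 100,
  memoryPerConnectionKb: (v) => v < 50,
  peakCpuPercent: (v) => v < 70,
};

// Returns the names of targets the result misses.
// A missing field counts as a miss, so gaps in measurement are visible.
function failedTargets(result) {
  return Object.keys(TARGETS).filter((name) => !TARGETS[name](result[name]));
}

// Made-up result from a single run
const misses = failedTargets({
  connectionSuccessRate: 0.9995,
  messageDeliveryRate: 0.9999,
  p95LatencyMs: 142,
  memoryPerConnectionKb: 38,
  peakCpuPercent: 81,
});
console.log(misses.length ? `Missed: ${misses.join(', ')}` : 'All targets met');
```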

Common Failure Patterns

File descriptor exhaustion:

Error: EMFILE: too many open files

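Fix: raise the per-process limit with ulimit -n 65535 in the shell that starts the server (or LimitNOFILE in a systemd unit). If the system-wide ceiling is also too low, raise fs.file-max in /etc/sysctl.conf.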

Memory leak (connections not cleaned up): Memory grows linearly with connection count but doesn't decrease after connections close. Look for event listener leaks or unclosed database connections per WebSocket session.
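
A typical event listener leak looks like this: each connection subscribes to a shared, long-lived emitter and never unsubscribes, so the closed socket stays reachable. The sketch below shows the cleanup, using Node's built-in EventEmitter. The `bus` emitter and the `attachSession` function are hypothetical stand-ins for your own pub/sub layer.

```javascript
const { EventEmitter } = require('events');

// Shared emitter that outlives individual connections (hypothetical)
const bus = new EventEmitter();

// Subscribe a socket to bus updates and undo the subscription on close.
function attachSession(socket) {
  const onUpdate = (payload) => socket.send(JSON.stringify(payload));
  bus.on('update', onUpdate);

  // Without this handler, every closed connection leaves a listener behind,
  // and the listener keeps the socket object alive
  socket.once('close', () => bus.off('update', onUpdate));
}
```

After a load test, `bus.listenerCount('update')` should return to its idle value once all clients disconnect. If it tracks the total number of connections ever opened, the cleanup path is missing somewhere.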

Thundering herd on reconnect: All clients reconnect simultaneously after a server restart. Add exponential backoff with jitter on the client side:

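function reconnect(attempt = 0) {
  // Exponential backoff capped at 30s, plus up to 1s of random jitter
  const delay = Math.min(1000 * Math.pow(2, attempt), 30000) + Math.random() * 1000;
  // connect() is your app's own connection function; it receives the next
  // attempt number so a failed connect can call reconnect(attempt) again
  setTimeout(() => connect(attempt + 1), delay);
}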

Message queue buildup: The server queues outbound messages faster than a slow client can read them, so ws.bufferedAmount on that connection grows without bound. Set a max buffer size and close slow connections.
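
A minimal sketch of that guard, assuming a socket from the `ws` library (which exposes `bufferedAmount`, `send()`, and `terminate()`). The `sendWithLimit` name and the 1 MiB limit are arbitrary choices for illustration; pick a limit based on your message sizes and memory budget.

```javascript
const MAX_BUFFERED_BYTES = 1024 * 1024; // 1 MiB, an example limit only

// Send data unless the socket's outbound buffer is already over the limit.
// Returns true if the message was queued, false if the connection was dropped.
function sendWithLimit(socket, data, maxBuffered = MAX_BUFFERED_BYTES) {
  if (socket.bufferedAmount > maxBuffered) {
    // terminate() drops the connection immediately instead of waiting for
    // the backlog to flush, which a slow client may never allow
    socket.terminate();
    return false;
  }
  socket.send(data);
  return true;
}
```

Dropping the connection is the bluntest option. Depending on the protocol, skipping stale updates for that client may be a gentler alternative.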

Benchmarking Template

Use this as a baseline benchmark before making architectural changes:

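// benchmark-baseline.js: run before and after changes
// Only the options are shown; reuse the imports, custom metrics, and default
// function from websocket-load.js so ws_message_latency_ms is defined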
export const options = {
  scenarios: {
    connections_100: {
      executor: 'constant-vus',
      vus: 100,
      duration: '60s',
      tags: { scenario: '100_conns' },
    },
    connections_500: {
      executor: 'constant-vus',
      vus: 500,
      duration: '60s',
      startTime: '90s',
      tags: { scenario: '500_conns' },
    },
    connections_1000: {
      executor: 'constant-vus',
      vus: 1000,
      duration: '60s',
      startTime: '180s',
      tags: { scenario: '1000_conns' },
    },
  },
  thresholds: {
    'ws_message_latency_ms{scenario:100_conns}':  ['p95<50'],
    'ws_message_latency_ms{scenario:500_conns}':  ['p95<100'],
    'ws_message_latency_ms{scenario:1000_conns}': ['p95<200'],
  },
};

Record results in your repo so you can spot regressions:

# results/ws-baseline-2026-05-24.txt
Scenario        Connections  p50   p95   p99   Errors
100_conns       100          12ms  28ms  45ms  0
500_conns       500          18ms  67ms  92ms  0
1000_conns      1000         31ms  142ms 201ms 3
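
Spotting regressions can be automated by comparing a new run against the recorded baseline. The sketch below flags any scenario whose p95 latency worsened by more than a tolerance. The baseline p95 values come from the table above; the object layout, the 10% tolerance, and the "current" numbers are illustrative assumptions.

```javascript
// Compare p95 latency per scenario; flag anything more than `tolerance` worse.
// baseline, current: { scenarioName: { p95: milliseconds }, ... }
function findRegressions(baseline, current, tolerance = 0.1) {
  const regressions = [];
  for (const [scenario, base] of Object.entries(baseline)) {
    const now = current[scenario];
    if (!now) continue; // scenario not run this time
    if (now.p95 > base.p95 * (1 + tolerance)) {
      regressions.push({ scenario, baselineP95: base.p95, currentP95: now.p95 });
    }
  }
  return regressions;
}

// p95 values from the baseline file above
const baselineRun = {
  '100_conns': { p95: 28 },
  '500_conns': { p95: 67 },
  '1000_conns': { p95: 142 },
};
// Made-up values standing in for a run after a change
const currentRun = {
  '100_conns': { p95: 29 },
  '500_conns': { p95: 80 },
  '1000_conns': { p95: 150 },
};

const regressions = findRegressions(baselineRun, currentRun);
for (const r of regressions) {
  console.log(`${r.scenario}: p95 ${r.baselineP95}ms -> ${r.currentP95}ms`);
}
```

A tolerance is needed because latency varies from run to run even with no code change; measure that variation on your own setup before choosing a value.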

WebSocket load testing reveals limits you can't find any other way. Run these tests before every major release that touches your real-time infrastructure.
