Stress Testing with k6: Ramp-Up Patterns and Finding Breaking Points

k6 is a widely used open-source tool for load and stress testing. Test scripts are written in JavaScript, results are emitted as structured metrics, and the CLI fits cleanly into CI/CD pipelines. This guide focuses on stress testing patterns: how to ramp up load, detect breaking points, and extract actionable data from the results.

Why k6 for Stress Testing?

k6's executor model makes it well-suited for stress testing:

  • Ramping VUs executor: gradually increases virtual users over time
  • Ramping arrival rate executor: controls request rate instead of concurrency
  • Scenarios: run multiple load patterns simultaneously
  • Thresholds: define failure conditions in the script itself

Because load profiles are plain JavaScript objects, most ramp-up strategies can be expressed in a few lines of configuration.

Basic Ramp-Up Script

The simplest stress test ramps users up until something breaks:

import http from 'k6/http';
import { sleep, check } from 'k6';

export const options = {
  stages: [
    { duration: '5m', target: 100 },   // warm up
    { duration: '10m', target: 500 },  // ramp to moderate load
    { duration: '10m', target: 1000 }, // ramp to high load
    { duration: '10m', target: 2000 }, // push to breaking point
    { duration: '5m', target: 0 },     // cool down
  ],
  thresholds: {
    http_req_failed: ['rate<0.05'],    // fail if error rate > 5%
    http_req_duration: ['p(95)<2000'], // fail if p95 > 2s
  },
};

export default function () {
  const res = http.get('https://your-api.example.com/endpoint');
  check(res, { 'status 200': (r) => r.status === 200 });
  sleep(1);
}

The stages array drives the ramp-up. Each stage specifies a duration and a target VU count, and k6 ramps the number of VUs linearly from the previous target to the new one over that duration.
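To make the ramping between stages concrete, here is a minimal sketch in plain JavaScript of how the VU target evolves over time. It is not part of the k6 API: the helper names `toSeconds` and `vusAt` are hypothetical, the duration parser only handles the simple `30s`/`5m`/`1h` forms used in this guide, and the starting VU count is passed in explicitly.

```javascript
// Convert a simple k6-style duration ('30s', '5m', '1h') to seconds.
// Only single-unit strings are handled in this sketch.
function toSeconds(duration) {
  const match = /^(\d+)(s|m|h)$/.exec(duration);
  if (!match) throw new Error(`unsupported duration: ${duration}`);
  const factor = { s: 1, m: 60, h: 3600 }[match[2]];
  return Number(match[1]) * factor;
}

// Expected VU target at `elapsedSeconds`, ramping linearly within each stage.
function vusAt(stages, elapsedSeconds, startVUs = 0) {
  let from = startVUs;
  let stageStart = 0;
  for (const stage of stages) {
    const length = toSeconds(stage.duration);
    if (elapsedSeconds <= stageStart + length) {
      const progress = (elapsedSeconds - stageStart) / length;
      return Math.round(from + (stage.target - from) * progress);
    }
    from = stage.target;
    stageStart += length;
  }
  return from; // past the last stage
}

const stages = [
  { duration: '5m', target: 100 },
  { duration: '10m', target: 500 },
];
console.log(vusAt(stages, 150)); // 50: halfway through the warm-up
console.log(vusAt(stages, 600)); // 300: halfway through the second stage
```

A helper like this is mainly useful when correlating server-side dashboards with the load that was being applied at a given moment.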

Ramp-Up Patterns

Different systems break under different conditions. Choose your pattern based on what you're testing.

Linear Ramp

The default: gradually increase users at a constant rate. Best for finding the general breaking point.

stages: [
  { duration: '30m', target: 1000 },
  { duration: '5m', target: 0 },
],

Staircase Ramp

Hold each level for a fixed duration before stepping up. Useful for observing system behavior at each load level and identifying where performance starts degrading.

stages: [
  { duration: '5m', target: 100 },
  { duration: '5m', target: 100 }, // hold
  { duration: '5m', target: 300 },
  { duration: '5m', target: 300 }, // hold
  { duration: '5m', target: 600 },
  { duration: '5m', target: 600 }, // hold
  { duration: '5m', target: 1000 },
  { duration: '5m', target: 1000 }, // hold
  { duration: '5m', target: 0 },
],

The "hold" phases let you observe steady-state behavior at each level — memory trends, GC activity, and queue buildup become visible.
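Writing ramp-and-hold pairs by hand gets repetitive, so the stage list can be generated instead. The following is a small sketch; `staircaseStages` is a hypothetical helper name, not a k6 function.

```javascript
// Build a staircase: for each level, one ramp stage followed by one hold stage,
// then a final ramp down to zero.
function staircaseStages(levels, rampDuration, holdDuration, coolDown = rampDuration) {
  const stages = [];
  for (const target of levels) {
    stages.push({ duration: rampDuration, target }); // step up
    stages.push({ duration: holdDuration, target }); // hold
  }
  stages.push({ duration: coolDown, target: 0 });
  return stages;
}

const stages = staircaseStages([100, 300, 600, 1000], '5m', '5m');
console.log(stages.length); // 9, the same stages as the hand-written list
```

The returned array can be assigned to `stages` in `options`, which makes it easy to try finer or coarser steps without editing nine lines by hand.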

Spike Pattern

Sudden traffic surge to test elasticity. Relevant for auto-scaling validation.

stages: [
  { duration: '2m', target: 100 },   // normal
  { duration: '30s', target: 2000 }, // spike
  { duration: '3m', target: 2000 },  // sustain spike
  { duration: '1m', target: 100 },   // recovery
  { duration: '2m', target: 100 },   // observe recovery
],

Watch whether the system recovers to baseline performance after the spike subsides.

Arrival Rate vs Virtual Users

k6 offers two fundamentally different ways to drive load:

Virtual Users (VUs): a fixed number of concurrent users, each repeating the script as fast as responses and any sleep calls allow. Throughput depends on response time: slower responses mean fewer requests per second.

Arrival Rate: a fixed number of new iterations started per second, regardless of response time. If the system slows down, k6 needs more VUs to sustain the rate, and in-flight requests pile up on the server.
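A quick back-of-the-envelope calculation shows why the VU-based model eases off exactly when the system starts to struggle. This sketch assumes each VU sends one request per iteration and then sleeps; `closedModelRps` is a hypothetical helper, and the result is an approximation rather than something k6 reports.

```javascript
// Approximate request rate of a closed (VU-based) model in which each VU
// sends one request per iteration and then sleeps.
function closedModelRps(vus, responseSeconds, sleepSeconds) {
  return vus / (responseSeconds + sleepSeconds);
}

// 100 VUs with sleep(1), as the backend slows from 200 ms to 1 s per request.
console.log(closedModelRps(100, 0.2, 1).toFixed(1)); // 83.3
console.log(closedModelRps(100, 1.0, 1).toFixed(1)); // 50.0
```

With the same 100 VUs, the offered load drops by roughly 40% as the backend slows down, whereas an arrival-rate executor would keep trying to start iterations at the configured rate.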

For stress testing, arrival rate is often more realistic:

export const options = {
  scenarios: {
    stress: {
      executor: 'ramping-arrival-rate',
      startRate: 10,
      timeUnit: '1s',
      preAllocatedVUs: 50,
      maxVUs: 500,
      stages: [
        { duration: '10m', target: 100 },
        { duration: '10m', target: 300 },
        { duration: '10m', target: 500 },
      ],
    },
  },
};

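preAllocatedVUs sets how many VUs k6 initializes before the test starts. maxVUs is the upper limit k6 may scale to when slower responses require more concurrent VUs to sustain the rate; once that limit is reached, k6 cannot start further iterations and counts them in the dropped_iterations metric.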
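A rough way to size these two settings is Little's law: the number of iterations in flight is approximately the arrival rate multiplied by the time each iteration takes. The sketch below applies it to the peak rate of 500 iterations per second from the example; `requiredVUs` is a hypothetical helper and the iteration times are assumed values.

```javascript
// Little's law: concurrent iterations ≈ arrival rate × time per iteration.
function requiredVUs(ratePerSecond, iterationSeconds) {
  return Math.ceil(ratePerSecond * iterationSeconds);
}

console.log(requiredVUs(500, 0.25)); // 125
console.log(requiredVUs(500, 1));    // 500
console.log(requiredVUs(500, 2));    // 1000
```

Under these assumptions, a maxVUs of 500 is only enough while iterations take one second or less. If responses slow to two seconds at peak rate, k6 would run out of VUs before the system under test is pushed any further, so the cap should leave headroom above the expected steady state.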

Detecting the Breaking Point

k6 doesn't automatically identify your breaking point — you need to observe metrics and interpret the curve.

Signs the system is breaking:

  1. Error rate climbs: http_req_failed rate increases
  2. p95 latency diverges from p50: tail latency indicates queuing
  3. Throughput plateaus while VUs keep increasing: requests are waiting
  4. Sudden error spike: connection pool exhausted, circuit breaker tripped

Add a custom metric to track throughput vs load:

import http from 'k6/http';
import { Counter, Rate } from 'k6/metrics';

const errorRate = new Rate('error_rate');
const throughput = new Counter('requests_total');

export default function () {
  const res = http.get('https://your-api.example.com/endpoint');
  errorRate.add(res.status !== 200);
  throughput.add(1);
}
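Once per-level numbers are available, for example from the hold phases of a staircase run, the signs listed above can be turned into a rough automated check. The sketch below is plain JavaScript for post-processing, not a k6 feature: `findBreakingPoint`, its default thresholds, and the sample measurements are all hypothetical.

```javascript
// Return the first load level that shows a breaking-point signal, or null.
// `levels` must be sorted by increasing VU count.
function findBreakingPoint(levels, { maxErrorRate = 0.05, minScaling = 0.5 } = {}) {
  for (let i = 0; i < levels.length; i++) {
    const current = levels[i];
    if (current.errorRate > maxErrorRate) {
      return { ...current, reason: 'error rate' };
    }
    if (i > 0) {
      const previous = levels[i - 1];
      // Ratio of relative throughput growth to relative VU growth:
      // about 1 means linear scaling, near 0 means a plateau.
      const vuGrowth = current.vus / previous.vus - 1;
      const rpsGrowth = current.rps / previous.rps - 1;
      if (rpsGrowth / vuGrowth < minScaling) {
        return { ...current, reason: 'throughput plateau' };
      }
    }
  }
  return null;
}

// Hypothetical per-level measurements, for illustration only.
const levels = [
  { vus: 100, rps: 95, errorRate: 0.0 },
  { vus: 300, rps: 280, errorRate: 0.001 },
  { vus: 600, rps: 540, errorRate: 0.004 },
  { vus: 1000, rps: 560, errorRate: 0.02 },
];
console.log(findBreakingPoint(levels)); // the 1000-VU level, reason 'throughput plateau'
```

In this made-up data set, throughput grows almost linearly up to 600 VUs and then barely moves at 1000 VUs, so the plateau is flagged before the error rate crosses its limit. The cut-off values are a judgment call and should be tuned per system.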

Multi-Endpoint Stress Testing

Real applications have multiple critical paths. Test them simultaneously:

export const options = {
  scenarios: {
    api_reads: {
      executor: 'ramping-vus',
      stages: [
        { duration: '10m', target: 500 },
        { duration: '10m', target: 1000 },
      ],
      exec: 'readScenario',
    },
    api_writes: {
      executor: 'ramping-vus',
      stages: [
        { duration: '10m', target: 100 },
        { duration: '10m', target: 300 },
      ],
      exec: 'writeScenario',
    },
  },
};

export function readScenario() {
  http.get('https://api.example.com/items');
  sleep(1);
}

export function writeScenario() {
  http.post('https://api.example.com/items', JSON.stringify({ name: 'test' }), {
    headers: { 'Content-Type': 'application/json' },
  });
  sleep(2);
}

Write operations typically have lower throughput limits than reads — this models realistic traffic ratios.

Thresholds and Automated Failure

Define pass/fail conditions in the script:

thresholds: {
  http_req_failed: ['rate<0.01'],         // < 1% errors
  http_req_duration: ['p(95)<500', 'p(99)<1000'], // latency bounds
  'http_req_duration{endpoint:checkout}': ['p(99)<2000'], // per-endpoint
},

k6 exits with a non-zero code when thresholds are violated. This integrates naturally with CI — the pipeline fails when performance regresses.
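Two details are easy to miss here. A threshold on a tagged sub-metric only evaluates requests that actually carry that tag, and by default a failing threshold does not stop the run; it only affects the exit code at the end. The fragment below sketches both, using k6's long-form threshold objects with `abortOnFail` and `delayAbortEval`. The `checkoutParams` name and the URL in the comment are placeholders.

```javascript
// Long-form threshold: stop the run early once the error rate is clearly
// too high, after giving the metric one minute to stabilize.
const thresholds = {
  http_req_failed: [
    { threshold: 'rate<0.05', abortOnFail: true, delayAbortEval: '1m' },
  ],
  // Only matches requests that carry the tag endpoint=checkout.
  'http_req_duration{endpoint:checkout}': ['p(99)<2000'],
};

// Request params that attach the tag to a request.
const checkoutParams = { tags: { endpoint: 'checkout' } };

// In a k6 script these would be wired up roughly as:
//   export const options = { thresholds };
//   http.post('https://api.example.com/checkout', body, checkoutParams);
```

Aborting early is useful in a stress test that deliberately pushes past the breaking point, since it avoids hammering an already failing system for the remaining stages.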

Running k6 in CI

Basic GitHub Actions integration:

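- name: Run stress test
  run: |
    # No --vus/--duration here: CLI flags would override the stages in the script
    k6 run --out json=results.json stress-test.js

- name: Upload results
  if: always()  # keep the results even when thresholds fail the run
  uses: actions/upload-artifact@v4
  with:
    name: k6-results
    path: results.json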

For longer stress tests, run them on a schedule rather than on every commit — they're too slow for PR feedback loops.

Analyzing Results

k6 outputs a summary after each run. Key metrics to examine:

http_req_duration.............: avg=234ms min=12ms med=180ms max=8420ms p(90)=445ms p(95)=890ms p(99)=4230ms
http_req_failed...............: 2.14% ✗ 1284 ✓ 58621
http_reqs.....................: 59905
vus_max.......................: 1200

High p99 with acceptable p95 indicates a long tail — some requests are experiencing extreme slowness. This often points to lock contention, GC pauses, or connection pool saturation.
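One simple way to quantify the tail is to compare adjacent percentiles. The sketch below does this for the sample summary above; `tailRatios` is a hypothetical helper, and what counts as a "large" ratio is a judgment call rather than a fixed rule.

```javascript
// Ratios between percentiles: large jumps between adjacent percentiles
// suggest a long tail rather than uniform slowness.
function tailRatios({ med, p95, p99 }) {
  return {
    p95OverMedian: p95 / med,
    p99OverP95: p99 / p95,
  };
}

// Values in milliseconds, taken from the sample summary above.
const ratios = tailRatios({ med: 180, p95: 890, p99: 4230 });
console.log(ratios.p95OverMedian.toFixed(1)); // 4.9
console.log(ratios.p99OverP95.toFixed(1));    // 4.8
```

Here the slowest 1% of requests take almost five times as long as the 95th percentile, which is the kind of gap worth correlating with server-side metrics for the same time window.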

Conclusion

k6's executor model and scenario API make it one of the most flexible tools for stress testing. The key is choosing the right ramp-up pattern for your hypothesis: linear ramps for finding breaking points, staircase patterns for observing degradation at each level, and spike patterns for testing elasticity.

Once you've found your breaking point, pair stress testing results with functional monitoring. Tools like HelpMeTest keep watch on your application's behavior after you've validated its performance limits — ensuring that fixes hold and regressions don't slip through.
