Locust Load Testing: Distributed Performance Testing in Python
You can write load tests in Python with Locust — no XML, no GUI needed. Just a plain Python file describing what your virtual users do, then a command to run it. That's the entire model.
If you've wrestled with JMeter's thread groups and XML configuration files, Locust feels like a breath of fresh air. If you're a Python developer, it feels like home.
What Locust Is (and Why It Works)
Locust is an open-source load testing framework where you define user behavior as Python code. Each virtual user runs your task functions in a loop, making real HTTP requests (or whatever protocol you wire up). The framework handles concurrency, statistics collection, and — when you need it — distributing load across multiple machines.
The web UI gives you live charts and response time percentiles. The headless mode gives you CI-friendly exit codes. Custom load shapes let you model realistic traffic curves instead of flat ramp-ups.
The core idea: a locustfile.py is just a Python module. You can import libraries, read fixture files, do conditional logic, and share state — all the things that make XML-based tools painful.
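For a taste of that flexibility, here's a minimal sketch (the host and endpoint are placeholders):

import os
from locust import HttpUser, task

# Ordinary Python: read configuration from the environment like any module
TARGET = os.environ.get("TARGET_HOST", "http://localhost:8000")

class HealthCheckUser(HttpUser):
    host = TARGET  # the host class attribute can stand in for the --host flag

    @task
    def ping(self):
        self.client.get("/health")  # placeholder endpoint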
Installation
Locust requires Python 3.8 or later. Install it with pip:
pip install locust

That's it. Verify it works:
locust --version
# locust 2.x.x

For projects, add it to your dev dependencies:
# requirements-dev.txt
locust>=2.20.0

Writing Your First Locustfile
Create locustfile.py in your project root. Here's a minimal example that tests a REST API:
from locust import HttpUser, task, between

class APIUser(HttpUser):
    wait_time = between(1, 3)

    @task
    def get_products(self):
        self.client.get("/api/products")

Three things are happening here:
- HttpUser gives each virtual user an HTTP client pointed at the target host
- wait_time = between(1, 3) makes each user wait 1–3 seconds between tasks, simulating real user pacing
- @task marks get_products as a task — Locust will call it repeatedly
Run it with the web UI:
locust --host=https://api.example.com

Open http://localhost:8089, enter the number of users and spawn rate, and start the test. You'll see requests per second, response times, and failure rates updating in real time.
Multiple Tasks with Weights
Real users don't do one thing. Add multiple tasks with weights to control how often each runs:
from locust import HttpUser, task, between

class ShopUser(HttpUser):
    wait_time = between(1, 5)

    @task(10)
    def browse_products(self):
        self.client.get("/api/products")

    @task(3)
    def view_product(self):
        product_id = 42  # or pick one from a shared list
        self.client.get(f"/api/products/{product_id}")

    @task(1)
    def add_to_cart(self):
        self.client.post("/api/cart", json={
            "product_id": 42,
            "quantity": 1
        })

The numbers in @task(n) are weights. With weights of 10, 3, and 1, on average 10 of every 14 task executions will be browse_products, 3 will be view_product, and 1 will be add_to_cart. This matches realistic traffic distribution without needing to hand-craft percentages.
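One caveat once product IDs vary: Locust keys its statistics on the request path, so /api/products/1 and /api/products/42 would show up as separate rows in the report. The name= argument groups them under one entry; a sketch with illustrative IDs:

import random
from locust import HttpUser, task, between

class ShopUser(HttpUser):
    wait_time = between(1, 5)

    @task
    def view_product(self):
        product_id = random.choice([1, 7, 42])  # illustrative IDs
        # name= collapses every /api/products/<id> request into a
        # single stats entry instead of one row per distinct URL
        self.client.get(f"/api/products/{product_id}",
                        name="/api/products/[id]")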
on_start and on_stop Lifecycle
Some setup needs to happen once per virtual user — logging in, for example. Use on_start for per-user initialization and on_stop for cleanup:
from locust import HttpUser, task, between

class AuthenticatedUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        """Called once when a user starts. Use it to log in."""
        response = self.client.post("/api/auth/login", json={
            "email": "testuser@example.com",
            "password": "testpassword123"
        })
        token = response.json()["token"]
        # Set the auth header for all subsequent requests
        self.client.headers.update({"Authorization": f"Bearer {token}"})

    def on_stop(self):
        """Called once when a user stops."""
        self.client.post("/api/auth/logout")

    @task
    def get_dashboard(self):
        self.client.get("/api/dashboard")

    @task(2)
    def list_orders(self):
        self.client.get("/api/orders")

The token gets set on the session-level headers after login, so every subsequent request carries it automatically. No need to thread auth tokens through each task function.
Parametrizing Tests with CSV Data
Hardcoded test data produces unrealistic load — caches warm up, results skew. Feed real or varied data from a CSV file instead:
import csv
import random
from locust import HttpUser, task, between

# Load data once at module level
with open("test_data/users.csv", "r") as f:
    reader = csv.DictReader(f)
    USER_CREDENTIALS = list(reader)

class ParametrizedUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        # Each virtual user picks a random set of credentials
        creds = random.choice(USER_CREDENTIALS)
        response = self.client.post("/api/auth/login", json={
            "email": creds["email"],
            "password": creds["password"]
        })
        self.token = response.json()["token"]
        self.client.headers.update({"Authorization": f"Bearer {self.token}"})

    @task
    def get_profile(self):
        self.client.get("/api/profile")

Your users.csv file:
email,password
alice@example.com,pass123
bob@example.com,pass456
carol@example.com,pass789

This ensures different users hit different data paths, stress-testing cache misses and database joins realistically.
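Note that random.choice can hand the same account to several concurrent users, which matters if your API locks sessions or rate-limits per account. A sketch that deals rows out in order with itertools.cycle instead:

import csv
from itertools import cycle
from locust import HttpUser, task, between

with open("test_data/users.csv", "r") as f:
    # cycle() yields rows in order and wraps around when exhausted,
    # spreading concurrent users across distinct accounts
    CREDENTIALS = cycle(list(csv.DictReader(f)))

class UniqueDataUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        creds = next(CREDENTIALS)  # each new user takes the next row
        response = self.client.post("/api/auth/login", json={
            "email": creds["email"],
            "password": creds["password"]
        })
        self.client.headers.update(
            {"Authorization": f"Bearer {response.json()['token']}"}
        )

    @task
    def get_profile(self):
        self.client.get("/api/profile")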
Headless Mode and Thresholds for CI
The web UI is great for exploration. CI needs headless runs with pass/fail conditions.
Run headless with specific user counts and duration:
locust \
--host=https://api.example.com \
--headless \
--users=100 \
--spawn-rate=10 \
--run-time=2m \
--csv=results/load-test

This runs 100 users for 2 minutes, spawning 10 users per second, and writes CSV output to results/load-test_stats.csv (plus _failures.csv and _stats_history.csv files with the same prefix).
For pass/fail gating, note that a headless run which records any failed requests already exits non-zero (the exit code is set by --exit-code-on-error, which defaults to 1), and --stop-timeout gives in-flight tasks time to finish at shutdown. For custom failure conditions, use Locust's event hooks:
from locust import events

@events.quitting.add_listener
def assert_stats(environment, **kwargs):
    stats = environment.stats.total
    p95 = stats.get_response_time_percentile(0.95)
    if stats.fail_ratio > 0.01:
        print(f"FAIL: error rate {stats.fail_ratio:.1%} exceeds 1%")
        environment.process_exit_code = 1
    if p95 > 500:
        print(f"FAIL: p95 response time {p95}ms exceeds 500ms")
        environment.process_exit_code = 1

Add this to your locustfile.py and Locust will exit with code 1 if thresholds are breached — exactly what CI needs to fail the build.
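Event hooks go beyond quitting. The request event fires for every request, which is useful for flagging outliers while the test runs; the one-second threshold below is illustrative:

from locust import events

@events.request.add_listener
def log_slow_requests(request_type, name, response_time, exception, **kwargs):
    # Called once per request; response_time is in milliseconds
    if exception is None and response_time > 1000:
        print(f"SLOW: {request_type} {name} took {response_time:.0f}ms")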
Distributed Mode: Master and Workers
A single machine can generate tens of thousands of requests per second, but eventually you'll hit CPU limits. Locust's distributed mode lets you scale across machines.
Start the master (no load generated here, just coordination):
locust \
--master \
--host=https://api.example.com \
--headless \
--users=1000 \
--spawn-rate=50 \
--run-time=5m \
--expect-workers=4

Start workers on separate machines (or containers):
locust \
--worker \
--master-host=192.168.1.100

The master distributes users across workers. Each worker runs actual tasks. Stats aggregate back to the master. The --expect-workers flag makes the master wait until all workers connect before starting.
For Kubernetes or Docker Compose, this maps cleanly: one master pod with the web UI exposed, N worker pods that scale horizontally.
# docker-compose.yml excerpt
services:
  locust-master:
    image: locustio/locust
    command: -f /locustfile.py --master --host=https://api.example.com
    volumes:
      - ./locustfile.py:/locustfile.py
    ports:
      - "8089:8089"
  locust-worker:
    image: locustio/locust
    command: -f /locustfile.py --worker --master-host=locust-master
    volumes:
      - ./locustfile.py:/locustfile.py
    deploy:
      replicas: 4

Custom Load Shapes
Flat ramp-ups don't reflect real traffic. A LoadTestShape lets you define any curve — spike tests, soak tests, gradual ramps, realistic daily patterns:
from locust import LoadTestShape

class StepLoadShape(LoadTestShape):
    """
    Simulates traffic stepping up every 60 seconds.
    Step 1 (0-60s):    50 users
    Step 2 (60-120s):  100 users
    Step 3 (120-180s): 200 users
    Step 4 (180s+):    stop
    """

    def tick(self):
        run_time = self.get_run_time()
        if run_time < 60:
            return (50, 10)
        elif run_time < 120:
            return (100, 10)
        elif run_time < 180:
            return (200, 20)
        else:
            return None  # None tells Locust to stop

tick() returns a (user_count, spawn_rate) tuple or None to stop. Locust calls it every second and adjusts the user pool accordingly. You can model any realistic pattern — morning traffic spike, lunch dip, evening surge — by making tick() time-aware.
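For smoother curves than discrete steps, tick() can derive the user count from any function of elapsed time. A sketch of a sinusoidal load swinging between roughly 50 and 150 users (all numbers illustrative):

import math
from locust import LoadTestShape

class WaveLoadShape(LoadTestShape):
    """Oscillates between ~50 and ~150 users with a 10-minute period."""

    def tick(self):
        run_time = self.get_run_time()
        if run_time > 1800:  # stop after 30 minutes
            return None
        users = 100 + 50 * math.sin(2 * math.pi * run_time / 600)
        return (round(users), 10)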
CI Integration Example
A complete GitHub Actions workflow:
name: Load Test

on:
  schedule:
    - cron: '0 2 * * *'  # nightly at 2am
  workflow_dispatch:

jobs:
  load-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: pip install locust

      - name: Run load test
        run: |
          locust \
            --host=${{ secrets.API_BASE_URL }} \
            --headless \
            --users=50 \
            --spawn-rate=5 \
            --run-time=3m \
            --csv=results/load
        env:
          API_KEY: ${{ secrets.LOAD_TEST_API_KEY }}

      - name: Upload results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: load-test-results
          path: results/

The if: always() on the upload step ensures results are saved even when thresholds fail and the build exits non-zero — essential for diagnosing what went wrong.
Locust vs JMeter
The comparison that keeps coming up: Locust is code-first, JMeter is GUI-first.
JMeter stores test plans as XML files. You typically build them through a GUI, which means version control diffs are nearly unreadable, and scripting complex behavior requires learning the JMeter expression language. It's mature, battle-tested, and has a huge plugin ecosystem — but it's not a natural fit for Python teams.
Locust stores tests as Python code. You write functions. You use imports. You commit .py files to git and read diffs like any other code review. The learning curve is nearly zero for a Python developer.
JMeter generates load more efficiently per CPU core at extreme scales (millions of requests per minute). Locust narrows the gap with FastHttpUser, a faster HTTP client you opt into by subclassing it instead of HttpUser, and for most teams the distributed mode covers whatever scale you need.
Use Locust if: your team knows Python, you want tests in version control alongside app code, and you value simplicity over feature count.
Use JMeter if: you need specific protocols (FTP, JDBC, JMS), you're in a Java shop, or you have existing JMeter tests you're maintaining.
What Load Testing Tells You (and What It Doesn't)
A successful Locust run tells you: under N concurrent users, your p95 response time is Xms and your error rate is Y%. That's your performance baseline.
It does not tell you whether production traffic tomorrow will stay within that baseline. A deployment, a database migration, a traffic spike from a marketing campaign — any of these can push latency past acceptable limits without warning.
That's where production monitoring picks up. HelpMeTest runs health checks and synthetic tests against your production endpoints 24/7 — catching response time regressions as soon as they happen, not after users start complaining. Write the check once in plain English, and HelpMeTest runs it on a schedule. At $100/month, it's the layer between your Locust baseline and production reality.
Load testing tells you where your limits are. Continuous monitoring tells you when you're approaching them.
Summary
Locust load testing in Python boils down to a few clear patterns:
- HttpUser + @task: the minimal unit of a load test
- wait_time: controls pacing between tasks
- Task weights: model realistic traffic distributions
- on_start: handle per-user setup like login
- CSV parametrization: avoid hardcoded test data
- --headless + exit code hooks: integrate with CI
- Distributed mode: scale beyond one machine
- LoadTestShape: model any traffic curve
Start with a single locustfile.py, run it against your staging environment, and record your baseline. Add it to CI as a nightly job. When p95 starts climbing, you'll know before your users do.