Locust Load Testing: Distributed Performance Testing in Python
You can write load tests in Python with Locust — no XML, no GUI needed. Just a plain Python file describing what your virtual users do, then a command to run it. That's the entire model.
If you've wrestled with JMeter's thread groups and XML configuration files, Locust feels like a breath of fresh air. If you're a Python developer, it feels like home.
What Locust Is (and Why It Works)
Locust is an open-source load testing framework where you define user behavior as Python code. Each virtual user runs your task functions in a loop, making real HTTP requests (or whatever protocol you wire up). The framework handles concurrency, statistics collection, and — when you need it — distributing load across multiple machines.
The web UI gives you live charts and response time percentiles. The headless mode gives you CI-friendly exit codes. Custom load shapes let you model realistic traffic curves instead of flat ramp-ups.
The core idea: a locustfile.py is just a Python module. You can import libraries, read fixture files, do conditional logic, and share state — all the things that make XML-based tools painful.
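For a taste of that flexibility, here's a minimal sketch (the host and endpoint are placeholders):

import os
from locust import HttpUser, task

# Ordinary Python: read configuration from the environment like any module
TARGET = os.environ.get("TARGET_HOST", "http://localhost:8000")

class HealthCheckUser(HttpUser):
    host = TARGET  # the host class attribute can stand in for the --host flag

    @task
    def ping(self):
        self.client.get("/health")  # placeholder endpoint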
Installation
Locust requires Python 3.8 or later. Install it with pip:
pip install locust

That's it. Verify it works:
locust --version
# locust 2.x.x

For projects, add it to your dev dependencies:
# requirements-dev.txt
locust>=2.20.0

Writing Your First Locustfile
Create locustfile.py in your project root. Here's a minimal example that tests a REST API:
from locust import HttpUser, task, between

class APIUser(HttpUser):
    wait_time = between(1, 3)

    @task
    def get_products(self):
        self.client.get("/api/products")

Three things are happening here:
- HttpUser gives each virtual user an HTTP client pointed at the target host
- wait_time = between(1, 3) makes each user wait 1–3 seconds between tasks, simulating real user pacing
- @task marks get_products as a task — Locust will call it repeatedly
Run it with the web UI:
locust --host=https://api.example.com

Open http://localhost:8089, enter the number of users and spawn rate, and start the test. You'll see requests per second, response times, and failure rates updating in real time.
Multiple Tasks with Weights
Real users don't do one thing. Add multiple tasks with weights to control how often each runs:
from locust import HttpUser, task, between

class ShopUser(HttpUser):
    wait_time = between(1, 5)

    @task(10)
    def browse_products(self):
        self.client.get("/api/products")

    @task(3)
    def view_product(self):
        product_id = 42  # or pick one from a shared list
        self.client.get(f"/api/products/{product_id}")

    @task(1)
    def add_to_cart(self):
        self.client.post("/api/cart", json={
            "product_id": 42,
            "quantity": 1
        })

The numbers in @task(n) are weights. With weights of 10, 3, and 1, on average 10 of every 14 task executions will be browse_products, 3 will be view_product, and 1 will be add_to_cart. This matches realistic traffic distribution without needing to hand-craft percentages.
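One caveat once product IDs vary: Locust keys its statistics on the request path, so /api/products/1 and /api/products/42 would show up as separate rows in the report. The name= argument groups them under one entry; a sketch with illustrative IDs:

import random
from locust import HttpUser, task, between

class ShopUser(HttpUser):
    wait_time = between(1, 5)

    @task
    def view_product(self):
        product_id = random.choice([1, 7, 42])  # illustrative IDs
        # name= collapses every /api/products/<id> request into a
        # single stats entry instead of one row per distinct URL
        self.client.get(f"/api/products/{product_id}",
                        name="/api/products/[id]")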
on_start and on_stop Lifecycle
Some setup needs to happen once per virtual user — logging in, for example. Use on_start for per-user initialization and on_stop for cleanup:
from locust import HttpUser, task, between

class AuthenticatedUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        """Called once when a user starts. Use it to log in."""
        response = self.client.post("/api/auth/login", json={
            "email": "testuser@example.com",
            "password": "testpassword123"
        })
        token = response.json()["token"]
        # Set the auth header for all subsequent requests
        self.client.headers.update({"Authorization": f"Bearer {token}"})

    def on_stop(self):
        """Called once when a user stops."""
        self.client.post("/api/auth/logout")

    @task
    def get_dashboard(self):
        self.client.get("/api/dashboard")

    @task(2)
    def list_orders(self):
        self.client.get("/api/orders")

The token gets set on the session-level headers after login, so every subsequent request carries it automatically. No need to thread auth tokens through each task function.
Parametrizing Tests with CSV Data
Hardcoded test data produces unrealistic load — caches warm up, results skew. Feed real or varied data from a CSV file instead:
import csv
import random
from locust import HttpUser, task, between

# Load data once at module level
with open("test_data/users.csv", "r") as f:
    reader = csv.DictReader(f)
    USER_CREDENTIALS = list(reader)

class ParametrizedUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        # Each virtual user picks a random set of credentials
        creds = random.choice(USER_CREDENTIALS)
        response = self.client.post("/api/auth/login", json={
            "email": creds["email"],
            "password": creds["password"]
        })
        self.token = response.json()["token"]
        self.client.headers.update({"Authorization": f"Bearer {self.token}"})

    @task
    def get_profile(self):
        self.client.get("/api/profile")

Your users.csv file:
email,password
alice@example.com,pass123
bob@example.com,pass456
carol@example.com,pass789

This ensures different users hit different data paths, stress-testing cache misses and database joins realistically.
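Note that random.choice can hand the same account to several concurrent users, which matters if your API locks sessions or rate-limits per account. A sketch that deals rows out in order with itertools.cycle instead:

import csv
from itertools import cycle
from locust import HttpUser, task, between

with open("test_data/users.csv", "r") as f:
    # cycle() yields rows in order and wraps around when exhausted,
    # spreading concurrent users across distinct accounts
    CREDENTIALS = cycle(list(csv.DictReader(f)))

class UniqueDataUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        creds = next(CREDENTIALS)  # each new user takes the next row
        response = self.client.post("/api/auth/login", json={
            "email": creds["email"],
            "password": creds["password"]
        })
        self.client.headers.update(
            {"Authorization": f"Bearer {response.json()['token']}"}
        )

    @task
    def get_profile(self):
        self.client.get("/api/profile")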
Headless Mode and Thresholds for CI
The web UI is great for exploration. CI needs headless runs with pass/fail conditions.
Run headless with specific user counts and duration:
locust \
--host=https://api.example.com \
--headless \
--users=100 \
--spawn-rate=10 \
--run-time=2m \
--csv=results/load-test

This runs 100 users for 2 minutes, spawning 10 users per second, and writes CSV output to results/load-test_stats.csv (plus _failures.csv and _stats_history.csv files with the same prefix).
For pass/fail gating, note that a headless run which records any failed requests already exits non-zero (the exit code is set by --exit-code-on-error, which defaults to 1), and --stop-timeout gives in-flight tasks time to finish at shutdown. For custom failure conditions, use Locust's event hooks:
from locust import events

@events.quitting.add_listener
def assert_stats(environment, **kwargs):
    stats = environment.stats.total
    p95 = stats.get_response_time_percentile(0.95)
    if stats.fail_ratio > 0.01:
        print(f"FAIL: error rate {stats.fail_ratio:.1%} exceeds 1%")
        environment.process_exit_code = 1
    if p95 > 500:
        print(f"FAIL: p95 response time {p95}ms exceeds 500ms")
        environment.process_exit_code = 1

Add this to your locustfile.py and Locust will exit with code 1 if thresholds are breached — exactly what CI needs to fail the build.
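Event hooks go beyond quitting. The request event fires for every request, which is useful for flagging outliers while the test runs; the one-second threshold below is illustrative:

from locust import events

@events.request.add_listener
def log_slow_requests(request_type, name, response_time, exception, **kwargs):
    # Called once per request; response_time is in milliseconds
    if exception is None and response_time > 1000:
        print(f"SLOW: {request_type} {name} took {response_time:.0f}ms")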
Distributed Mode: Master and Workers
A single machine can generate tens of thousands of requests per second, but eventually you'll hit CPU limits. Locust's distributed mode lets you scale across machines.
Start the master (no load generated here, just coordination):
locust \
--master \
--host=https://api.example.com \
--headless \
--users=1000 \
--spawn-rate=50 \
--run-time=5m \
--expect-workers=4

Start workers on separate machines (or containers):
locust \
--worker \
--master-host=192.168.1.100

The master distributes users across workers. Each worker runs actual tasks. Stats aggregate back to the master. The --expect-workers flag makes the master wait until all workers connect before starting.
For Kubernetes or Docker Compose, this maps cleanly: one master pod with the web UI exposed, N worker pods that scale horizontally.
# docker-compose.yml excerpt
services:
  locust-master:
    image: locustio/locust
    command: -f /locustfile.py --master --host=https://api.example.com
    volumes:
      - ./locustfile.py:/locustfile.py
    ports:
      - "8089:8089"
  locust-worker:
    image: locustio/locust
    command: -f /locustfile.py --worker --master-host=locust-master
    volumes:
      - ./locustfile.py:/locustfile.py
    deploy:
      replicas: 4

Custom Load Shapes
Flat ramp-ups don't reflect real traffic. A LoadTestShape lets you define any curve — spike tests, soak tests, gradual ramps, realistic daily patterns:
from locust import LoadTestShape

class StepLoadShape(LoadTestShape):
    """
    Simulates traffic stepping up every 60 seconds.
    Step 1 (0-60s):    50 users
    Step 2 (60-120s):  100 users
    Step 3 (120-180s): 200 users
    Step 4 (180s+):    stop
    """

    def tick(self):
        run_time = self.get_run_time()
        if run_time < 60:
            return (50, 10)
        elif run_time < 120:
            return (100, 10)
        elif run_time < 180:
            return (200, 20)
        else:
            return None  # None tells Locust to stop

tick() returns a (user_count, spawn_rate) tuple or None to stop. Locust calls it every second and adjusts the user pool accordingly. You can model any realistic pattern — morning traffic spike, lunch dip, evening surge — by making tick() time-aware.
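For smoother curves than discrete steps, tick() can derive the user count from any function of elapsed time. A sketch of a sinusoidal load swinging between roughly 50 and 150 users (all numbers illustrative):

import math
from locust import LoadTestShape

class WaveLoadShape(LoadTestShape):
    """Oscillates between ~50 and ~150 users with a 10-minute period."""

    def tick(self):
        run_time = self.get_run_time()
        if run_time > 1800:  # stop after 30 minutes
            return None
        users = 100 + 50 * math.sin(2 * math.pi * run_time / 600)
        return (round(users), 10)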
CI Integration Example
A complete GitHub Actions workflow:
name: Load Test

on:
  schedule:
    - cron: '0 2 * * *'  # nightly at 2am
  workflow_dispatch:

jobs:
  load-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: pip install locust

      - name: Run load test
        run: |
          locust \
            --host=${{ secrets.API_BASE_URL }} \
            --headless \
            --users=50 \
            --spawn-rate=5 \
            --run-time=3m \
            --csv=results/load
        env:
          API_KEY: ${{ secrets.LOAD_TEST_API_KEY }}

      - name: Upload results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: load-test-results
          path: results/

The if: always() on the upload step ensures results are saved even when thresholds fail and the build exits non-zero — essential for diagnosing what went wrong.
Locust vs JMeter
The comparison that keeps coming up: Locust is code-first, JMeter is GUI-first.
JMeter stores test plans as XML files. You typically build them through a GUI, which means version control diffs are nearly unreadable, and scripting complex behavior requires learning the JMeter expression language. It's mature, battle-tested, and has a huge plugin ecosystem — but it's not a natural fit for Python teams.
Locust stores tests as Python code. You write functions. You use imports. You commit .py files to git and read diffs like any other code review. The learning curve is nearly zero for a Python developer.
JMeter generates load more efficiently per CPU core at extreme scales (millions of requests per minute). Locust narrows the gap with FastHttpUser, a faster HTTP client you opt into by subclassing it instead of HttpUser, and for most teams the distributed mode covers whatever scale you need.
Use Locust if: your team knows Python, you want tests in version control alongside app code, and you value simplicity over feature count.
Use JMeter if: you need specific protocols (FTP, JDBC, JMS), you're in a Java shop, or you have existing JMeter tests you're maintaining.
What Load Testing Tells You (and What It Doesn't)
A successful Locust run tells you: under N concurrent users, your p95 response time is Xms and your error rate is Y%. That's your performance baseline.
It does not tell you whether production traffic tomorrow will stay within that baseline. A deployment, a database migration, a traffic spike from a marketing campaign — any of these can push latency past acceptable limits without warning.
That's where production monitoring picks up. HelpMeTest runs health checks and synthetic tests against your production endpoints 24/7 — catching response time regressions as soon as they happen, not after users start complaining. Write the check once in plain English, and HelpMeTest runs it on a schedule. At $100/month, it's the layer between your Locust baseline and production reality.
Load testing tells you where your limits are. Continuous monitoring tells you when you're approaching them.
Summary
Locust load testing in Python boils down to a few clear patterns:
- HttpUser + @task: the minimal unit of a load test
- wait_time: controls pacing between tasks
- Task weights: model realistic traffic distributions
- on_start: handle per-user setup like login
- CSV parametrization: avoid hardcoded test data
- --headless + exit code hooks: integrate with CI
- Distributed mode: scale beyond one machine
- LoadTestShape: model any traffic curve
Start with a single locustfile.py, run it against your staging environment, and record your baseline. Add it to CI as a nightly job. When p95 starts climbing, you'll know before your users do.