Gatling Load Testing: Performance Tests as Code with Scala and Java
Load testing tools have a reputation for being painful. JMeter XML files grow into unmaintainable tangles. Shell scripts with curl loops miss the statistical rigor you need. Gatling takes a different approach: express your load test as code, run it on the JVM, and get a polished HTML report when it's done.
The learning curve is real — Gatling's primary DSL is Scala, and if you haven't touched Scala before, the syntax will look unfamiliar. But the DSL itself is readable once you know the building blocks, and the payoff is load tests that live in version control alongside your application code, reviewed in pull requests, and run in CI.
What Gatling Is (and Isn't)
Gatling is an open-source load testing tool built on the JVM, using an async, non-blocking engine under the hood. That architecture matters: a single Gatling process can simulate thousands of concurrent virtual users with far less memory than thread-per-user tools, because virtual users are modeled as lightweight state machines rather than OS threads.
The core open-source version covers HTTP, WebSocket, and Server-Sent Events. Gatling Enterprise (formerly FrontLine) adds distributed load injection, advanced metrics, and integrations, but the free version is enough for most teams to get started.
Compared to JMeter, the most widely used alternative, Gatling's key differences are:
- Code, not XML. Gatling simulations are Scala (or Java/Kotlin) source files. JMeter's test plans are XML, which is hard to diff, hard to review, and impossible to refactor with a compiler's help.
- Async engine. JMeter uses one thread per virtual user. Gatling's engine is non-blocking, so you can drive more load from a single machine.
- HTML reports. Gatling generates interactive, percentile-rich reports out of the box. JMeter's default reports are sparse; meaningful graphs require plugins.
- Developer ownership. Because simulations are code, developers can write and maintain them without a dedicated performance engineer who knows how to navigate a GUI.
JMeter has its own strengths — a large plugin ecosystem, support for many protocols, and a GUI that non-developers can use. But if your team already owns its test code and runs CI pipelines, Gatling fits more naturally.
Installation: Maven and Gradle
The easiest way to start is with the Gatling Maven plugin. Add it to your pom.xml:
<build>
<plugins>
<plugin>
<groupId>io.gatling</groupId>
<artifactId>gatling-maven-plugin</artifactId>
<version>4.9.0</version>
<configuration>
<simulationClass>simulations.BasicSimulation</simulationClass>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>io.gatling.highcharts</groupId>
<artifactId>gatling-charts-highcharts</artifactId>
<version>3.11.0</version>
<scope>test</scope>
</dependency>
</dependencies>

For Gradle, apply the plugin in build.gradle:
plugins {
id 'io.gatling.gradle' version '3.11.0.2'
}
dependencies {
gatling 'io.gatling.highcharts:gatling-charts-highcharts:3.11.0'
}

Place simulation files under src/test/scala (Maven) or src/gatling/scala (Gradle). Run with:
# Maven
mvn gatling:test

# Gradle
./gradlew gatlingRun

Core Concepts
Before writing your first simulation, know the four building blocks:
Simulation — a Scala class that extends Simulation. It wires together an HTTP configuration, one or more scenarios, and an injection profile.
Scenario — a sequence of steps (HTTP requests, pauses, conditionals) that models what one virtual user does.
Injection profile — how virtual users are introduced over time. You might ramp from 0 to 100 users over 60 seconds, or hold 50 users constant for 5 minutes.
Assertion — pass/fail criteria evaluated after the test. The test fails (and returns a non-zero exit code) if assertions aren't met, which is what you need for CI integration.
Feeders and checks round out the toolkit: feeders inject dynamic data (usernames, product IDs) into requests, and checks validate response status codes, JSON fields, and response times inline.
Writing a Simulation
Here is a complete, runnable simulation targeting a hypothetical REST API:
package simulations
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._
class BasicSimulation extends Simulation {
// 1. HTTP protocol configuration — shared across all requests
val httpProtocol = http
.baseUrl("https://api.example.com")
.acceptHeader("application/json")
.contentTypeHeader("application/json")
.userAgentHeader("Gatling/LoadTest")
// 2. Feeder — CSV provides a pool of test credentials
val userFeeder = csv("users.csv").circular
// 3. Scenario — what one virtual user does
val browseScenario = scenario("Browse products")
.exec(
http("GET /health")
.get("/health")
.check(status.is(200))
)
.pause(1.second)
.feed(userFeeder)
.exec(
http("POST /auth/login")
.post("/auth/login")
.body(StringBody("""{"email":"#{email}","password":"#{password}"}"""))
.check(status.is(200))
.check(jsonPath("$.token").saveAs("authToken"))
)
.pause(500.milliseconds, 1500.milliseconds)
.exec(
http("GET /products")
.get("/products?page=1&limit=20")
.header("Authorization", "Bearer #{authToken}")
.check(status.is(200))
.check(jsonPath("$.items[0].id").exists)
)
.pause(1.second, 3.seconds)
.exec(
http("GET /products/:id")
.get("/products/42")
.header("Authorization", "Bearer #{authToken}")
.check(status.is(200))
.check(responseTimeInMillis.lte(500))
)
// 4. Injection profile + assertions
setUp(
browseScenario.inject(
atOnceUsers(5), // spike: 5 users immediately
nothingFor(10.seconds),
rampUsers(50).during(60.seconds), // ramp to 50 users over 1 min
constantUsersPerSec(10).during(120.seconds) // sustain 10 new users/s for 2 min
)
)
.protocols(httpProtocol)
.assertions(
global.responseTime.percentile(95).lt(1000), // p95 < 1s
global.responseTime.mean.lt(500), // mean < 500ms
global.failedRequests.percent.lt(1) // < 1% error rate
)
}

The users.csv feeder file is straightforward:
email,password
alice@example.com,secret1
bob@example.com,secret2
carol@example.com,secret3

The .circular strategy means Gatling loops through the rows when virtual users outnumber rows, which is fine for load tests where you want representative data, not unique data per user.
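Conceptually, .circular behaves like cycling an iterator over the rows. A plain-Scala sketch of the strategy (not Gatling internals), using the three rows above:

```scala
// Rows as Gatling would parse them from users.csv
val rows = Seq(
  Map("email" -> "alice@example.com", "password" -> "secret1"),
  Map("email" -> "bob@example.com",   "password" -> "secret2"),
  Map("email" -> "carol@example.com", "password" -> "secret3")
)

// .circular: when virtual users outnumber rows, wrap around to the start
val circular: Iterator[Map[String, String]] =
  Iterator.continually(rows).flatten

// The 4th virtual user gets alice again
val firstFour = circular.take(4).toList
```

The other feeder strategies fit the same mental model: .queue consumes rows once and errors when they run out, .random samples with replacement, and .shuffle consumes a shuffled copy once.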
Injection Profiles
Choosing the right injection profile determines what question your test answers.
// Spike test: all users at once — tests cold-start behavior
atOnceUsers(100)
// Ramp test: gradual increase — finds the point where latency degrades
rampUsers(200).during(2.minutes)
// Sustained load: constant arrival rate — models steady production traffic
constantUsersPerSec(20).during(5.minutes)
// Stress test: grow the arrival rate from 1 to 100 users/sec until something breaks
rampUsersPerSec(1).to(100).during(10.minutes)
// Stepped ramp: staircase pattern — isolates behavior at each concurrency level
incrementUsersPerSec(10)
.times(5)
.eachLevelLasting(30.seconds)
.startingFrom(10)

A complete load test often chains these. Start with atOnceUsers to verify your environment handles a burst without crashing, then rampUsers to find where p95 latency starts climbing, then constantUsersPerSec to confirm it's stable at your expected production rate.
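Back-of-envelope arithmetic helps sanity-check a chained profile before you run it. For the setUp in the earlier simulation (plain Scala, not the Gatling API):

```scala
// Total virtual users started by each step of the chained profile
val atOnce    = 5          // atOnceUsers(5)
val ramp      = 50         // rampUsers(50).during(60.seconds)
val sustained = 10 * 120   // constantUsersPerSec(10).during(120.seconds)

val totalUsers = atOnce + ramp + sustained   // 1255 virtual users in total

// Wall-clock injection time: 10s pause + 60s ramp + 120s sustain
// (the initial spike injects effectively instantly)
val totalSeconds = 10 + 60 + 120             // just over 3 minutes
```

Knowing the total user count up front also tells you whether a feeder has enough rows, or whether a .circular strategy will be recycling heavily.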
Checks and Feeders
Checks are assertions on individual responses, evaluated during the test. A failed check marks that request as failed and counts toward the error rate, but the run keeps going; setUp assertions, by contrast, are evaluated once at the end and fail the whole run.
// Status code
.check(status.is(201))
.check(status.in(200, 201, 202))
// JSON extraction — capture a value for use in later requests
.check(jsonPath("$.order.id").saveAs("orderId"))
// Response time (individual request, not aggregate)
.check(responseTimeInMillis.lte(300))
// Body string contains
.check(bodyString.contains("\"status\":\"active\""))

For feeders beyond CSV, Gatling supports JSON:
val productFeeder = jsonFile("products.json").random

And you can create feeders programmatically when test data needs to be generated, not read from a file:
val idFeeder = Iterator.continually(
Map("productId" -> (scala.util.Random.nextInt(1000) + 1).toString)
)

The Java API
If Scala is a barrier for your team, Gatling 3.7+ ships a Java DSL. The concepts are identical; the syntax uses a builder-style API instead of Scala's implicit conversions:
import io.gatling.javaapi.core.*;
import io.gatling.javaapi.http.*;
import static io.gatling.javaapi.core.CoreDsl.*;
import static io.gatling.javaapi.http.HttpDsl.*;
public class BasicSimulationJava extends Simulation {
HttpProtocolBuilder httpProtocol = http
.baseUrl("https://api.example.com")
.acceptHeader("application/json");
ScenarioBuilder scn = scenario("Browse")
.exec(http("GET /health").get("/health").check(status().is(200)))
.pause(1)
.exec(http("GET /products").get("/products").check(status().is(200)));
{
setUp(scn.injectOpen(rampUsers(50).during(60)))
.protocols(httpProtocol)
.assertions(global().responseTime().percentile(95).lt(1000));
}
}

The Scala DSL is more concise, and most Gatling documentation and community examples are written in Scala, so it's worth learning even if your application is Java. The Java API is a good fallback for teams that can't justify a Scala dependency.
Reading the HTML Report
After a test run, Gatling writes a report to target/gatling/<simulation-name>-<timestamp>/index.html. Open it in a browser.
The summary page shows global numbers: total requests, success rate, and response time statistics (min, mean, p50, p75, p95, p99, max). Below that, each named request gets its own row so you can see which endpoints are slow without digging through raw logs.
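Those percentiles are computed over every response time recorded during the run. A rough nearest-rank sketch of what p95 means (Gatling's exact estimator may differ):

```scala
// Nearest-rank percentile: sort the samples, take the value at ceil(p% * n)
def percentile(samples: Seq[Int], p: Double): Int = {
  val sorted = samples.sorted
  val rank   = math.ceil((p / 100.0) * sorted.length).toInt
  sorted(rank - 1)
}

// 100 hypothetical response times, 1..100 ms
val latencies = (1 to 100).toSeq
val p95 = percentile(latencies, 95)   // 95: 95% of requests were at or under this
```

This is why p95 and p99 matter more than the mean in a load test: a handful of very slow outliers barely moves the mean but shows up immediately in the high percentiles.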
The detail view for each request shows response time distribution as a histogram and a time-series line chart overlaid with the active user count. This is where you look for the hockey stick — the point in a ramp test where latency shoots up as concurrency crosses your application's capacity.
The report also buckets requests into response-time bands (for example, under 800 ms, 800–1200 ms, over 1200 ms, and failed), which gives a quick visual read on user experience. The band boundaries are configurable; the defaults are 800 ms and 1200 ms.
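Those boundaries live in gatling.conf. A sketch of the relevant HOCON section (values in milliseconds; check the reference config shipped with your Gatling version for the exact keys):

```hocon
gatling {
  charting {
    indicators {
      lowerBound = 800    # faster than this counts as "good"
      higherBound = 1200  # slower than this counts as "poor"
    }
  }
}
```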
Running in CI
Because Gatling returns a non-zero exit code when assertions fail, CI integration is straightforward. For GitHub Actions:
name: Load Test
on:
schedule:
- cron: '0 2 * * 1' # weekly, Monday 2am
workflow_dispatch:
jobs:
gatling:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-java@v4
with:
java-version: '21'
distribution: 'temurin'
- name: Run Gatling simulation
run: mvn gatling:test -Dgatling.simulationClass=simulations.BasicSimulation
env:
BASE_URL: ${{ secrets.STAGING_URL }}
- name: Upload report
uses: actions/upload-artifact@v4
if: always()
with:
name: gatling-report
path: target/gatling/

Point the simulation at your staging environment. Run it weekly, or on every release branch merge. The uploaded artifact gives you the full HTML report in the Actions UI.
One caution: Gatling simulations put real load on your target environment. Don't point them at production without rate limiting the injection profile and coordinating with whoever owns the infrastructure. Staging or a dedicated load environment is the standard practice.
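The workflow above exports a BASE_URL secret, but the simulation shown earlier hard-codes its base URL. One way to wire the two together is to read the environment with a safe default; a sketch (the helper name and staging URL are illustrative):

```scala
// Resolve the target host from the environment, falling back to staging.
// Kept as a pure function over a Map so it can be tested without
// actually setting environment variables.
def resolveBaseUrl(env: Map[String, String]): String =
  env.getOrElse("BASE_URL", "https://staging.example.com")

// In the simulation:
//   val httpProtocol = http.baseUrl(resolveBaseUrl(sys.env))
val target = resolveBaseUrl(sys.env)
```

The same pattern works for user counts and durations, so one simulation class can serve both a quick smoke run and the full weekly load test.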
What Gatling Finds (and What It Doesn't)
Gatling is excellent at finding capacity limits, slow database queries that only appear under concurrency, connection pool exhaustion, and memory leaks that manifest over time. It answers: how many users can this API handle before response times degrade or errors spike?
What it doesn't do: catch that a specific endpoint returns the wrong JSON field, that your authentication flow breaks for users in a specific timezone, or that a new deploy introduced a regression on one particular page. Those are functional correctness problems, not performance problems.
The gap matters in production. Your staging load test passes at 500 concurrent users with p95 under 800ms. You deploy. A week later, a customer reports that the order confirmation email stopped sending. Gatling wouldn't have caught that — it wasn't measuring email delivery.
That's where continuous production monitoring fills in. HelpMeTest runs health checks and natural language tests against your live endpoints 24/7, so functional regressions surface immediately after deploy rather than in a customer support ticket. At $100/month, it's the complement to your load testing suite: Gatling tells you the system can handle the load, HelpMeTest tells you it's actually doing the right thing under that load.
Where to Go Next
The Gatling documentation is thorough. The "Advanced Tutorial" covers more complex session handling and conditional logic. The Gatling community forum and GitHub discussions are active if you run into edge cases.
For practical next steps: start with a single critical endpoint (login, your main API endpoint, checkout), write a ramp test, and see where your p95 latency breaks 1 second. That number tells you more about your system's real capacity than any synthetic benchmark.
Load tests as code, in CI, with assertions — that's the baseline. Everything beyond it is refinement.