Spike Testing with Gatling: Simulating Sudden Traffic Surges
Spike testing simulates sudden, dramatic traffic increases — a product launch going viral, a flash sale, or a marketing email hitting millions of inboxes simultaneously. Unlike stress testing (which ramps gradually), spike testing applies an immediate, extreme load increase to test instantaneous system response.
Gatling is well suited to spike testing. Its Scala DSL and injection profiles give precise control over traffic patterns, and its HTML reports make it easy to spot when and where a system buckles.
Why Spike Testing Matters
Gradual load increases give systems time to adapt — connection pools grow, caches warm, JVM JIT kicks in. Spike testing removes that luxury.
Real traffic spikes don't warn you. A tweet from a celebrity, a Product Hunt launch, or a surprise media mention can send traffic from normal to 10x in seconds. Spike testing answers:
- Does the system survive sudden overload?
- How long does recovery take?
- Does auto-scaling kick in fast enough?
- Do circuit breakers fire before cascading failures occur?
Gatling Basics
Gatling simulates HTTP traffic using virtual users (VUs) defined in Scala (Java and Kotlin DSLs are also available). The simulation defines:
- Protocol: base URL, headers, authentication
- Scenario: the sequence of requests a user makes
- Injection profile: how users are introduced over time
Install Gatling:
# Using the bundle
wget https://repo1.maven.org/maven2/io/gatling/highcharts/gatling-charts-highcharts-bundle/3.10.3/gatling-charts-highcharts-bundle-3.10.3-bundle.zip
unzip gatling-charts-highcharts-bundle-3.10.3-bundle.zip
Or via Maven/Gradle for project integration.
Basic Spike Test Simulation
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._
class SpikeTestSimulation extends Simulation {
  val httpProtocol = http
    .baseUrl("https://api.example.com")
    .acceptHeader("application/json")
    .contentTypeHeader("application/json")
  val scn = scenario("API Spike Test")
    .exec(
      http("Get Items")
        .get("/items")
        .check(status.is(200))
    )
    .pause(1)
  setUp(
    scn.inject(
      // Normal baseline: 10 new users per second
      constantUsersPerSec(10).during(2.minutes),
      // Spike: 1,000 users start at the same instant
      atOnceUsers(1000),
      // Sustain elevated load at 10x the baseline arrival rate
      constantUsersPerSec(100).during(3.minutes),
      // Return to normal
      constantUsersPerSec(10).during(2.minutes)
    ).protocols(httpProtocol)
  ).assertions(
    global.responseTime.percentile(95).lt(2000),
    global.failedRequests.percent.lt(5)
  )
}
The atOnceUsers(1000) step starts 1,000 virtual users simultaneously; this is the spike. The constantUsersPerSec(100) step that follows holds the arrival rate at 10x the baseline to simulate sustained elevated load.
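To get a feel for how much traffic this profile generates, the arithmetic can be sketched in plain Scala. The `Phase` types and the `totalUsers` helper below are hypothetical illustrations, not part of Gatling's API; they only mirror the numbers used in the simulation above.

```scala
import scala.concurrent.duration._

object SpikeProfileMath {
  // Hypothetical model of the two injection step kinds used in the simulation.
  sealed trait Phase
  final case class ConstantRate(usersPerSec: Double, length: FiniteDuration) extends Phase
  final case class Burst(users: Int) extends Phase

  // Number of virtual users a single phase starts in total.
  def usersIn(phase: Phase): Double = phase match {
    case ConstantRate(rate, length) => rate * length.toSeconds
    case Burst(users)               => users.toDouble
  }

  // Same four phases as SpikeTestSimulation: baseline, spike, sustained load, cooldown.
  val profile: Seq[Phase] = Seq(
    ConstantRate(10, 2.minutes),
    Burst(1000),
    ConstantRate(100, 3.minutes),
    ConstantRate(10, 2.minutes)
  )

  def totalUsers(phases: Seq[Phase]): Double = phases.map(usersIn).sum

  def main(args: Array[String]): Unit =
    println(f"Total virtual users started: ${totalUsers(profile)}%.0f")
}
```

For this profile the total is 1,200 + 1,000 + 18,000 + 1,200 = 21,400 virtual users. Because each user in the scenario issues a single request, that is also roughly the number of requests the target has to absorb over the seven-minute run.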
Injection Profiles for Spike Testing
Gatling provides several injection types useful for spike scenarios:
Immediate Spike
atOnceUsers(500)
All 500 users hit simultaneously. Maximum instantaneous stress.
Ramp Then Spike
rampUsers(100).during(1.minute),
atOnceUsers(2000),
rampUsers(100).during(5.minutes)
Establish a baseline first, then spike. This tests whether the system state before the spike affects recovery.
Multiple Spikes
constantUsersPerSec(10).during(2.minutes),
atOnceUsers(500),
nothingFor(30.seconds),
atOnceUsers(500),
nothingFor(30.seconds),
atOnceUsers(500),
constantUsersPerSec(10).during(2.minutes)
Tests whether systems recover between spikes, or whether repeated spikes cause cumulative degradation.
Heaviside Step Function
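stressPeakUsers(1000).during(10.seconds)
Injects 1,000 users along a smooth approximation of the Heaviside step function stretched over 10 seconds, so arrivals follow an S-curve. Current Gatling releases name this step stressPeakUsers; heavisideUsers is the older name for the same profile. It is less brutal than atOnceUsers and more realistic than a pure ramp.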
Modeling Realistic Spike Scenarios
A spike rarely hits a single endpoint. Model realistic traffic distribution:
val browsing = scenario("Browse")
  .exec(http("Home").get("/").check(status.is(200)))
  .pause(2)
  .exec(http("Products").get("/products").check(status.is(200)))
  .pause(1)
val checkout = scenario("Checkout")
  .exec(http("Cart").get("/cart").check(status.is(200)))
  .pause(1)
  .exec(
    http("Place Order")
      .post("/orders")
      .body(StringBody("""{"product_id": 1, "qty": 1}"""))
      .check(status.is(201))
  )
setUp(
  browsing.inject(
    nothingFor(30.seconds),
    atOnceUsers(800) // 80% browse
  ),
  checkout.inject(
    nothingFor(30.seconds),
    atOnceUsers(200) // 20% checkout
  )
).protocols(httpProtocol)
This models a realistic spike where most users browse and a minority transact.
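The 800/200 split above is hard-coded. One way to keep the ratio explicit is to derive both counts from a total spike size and a browse share, as in this plain-Scala sketch. The names `SpikeSplit`, `spikeSize`, and `browseShare` are hypothetical and not part of Gatling; the resulting counts would be passed to `atOnceUsers` in the simulation.

```scala
object SpikeSplit {
  // Splits a spike of `spikeSize` users into (browse, checkout) counts.
  // Checkout receives the remainder, so the two counts always add up to spikeSize.
  def split(spikeSize: Int, browseShare: Double): (Int, Int) = {
    require(spikeSize >= 0, "spikeSize must be non-negative")
    require(browseShare >= 0.0 && browseShare <= 1.0, "browseShare must be in [0, 1]")
    val browse = math.round(spikeSize * browseShare).toInt
    (browse, spikeSize - browse)
  }

  def main(args: Array[String]): Unit = {
    val (browseUsers, checkoutUsers) = split(spikeSize = 1000, browseShare = 0.8)
    println(s"browse=$browseUsers checkout=$checkoutUsers")
  }
}
```

Changing the spike size for a different environment then means editing one number rather than keeping two literals in sync.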
Adding Assertions
Assertions define pass/fail criteria:
.assertions(
  // Global assertions
  global.responseTime.percentile(95).lt(2000), // p95 < 2s
  global.failedRequests.percent.lt(1), // < 1% errors
  // Per-request assertions: details() takes a request (or group) name, not a scenario name
  details("Place Order").responseTime.percentile(99).lt(5000),
  details("Products").failedRequests.percent.lt(0.1)
)
Gatling exits with a non-zero code if assertions fail, making CI integration straightforward.
Interpreting Gatling Reports
After each run, Gatling generates an HTML report: under results/ when run from the bundle, or under target/gatling/ when run through the Maven plugin. Key sections:
Statistics summary: shows min, max, mean, p50, p75, p95, p99 for each request. Focus on p95 and p99 — if these are orders of magnitude above p50, you have tail latency problems.
Response time over time: the graph should stabilize during the spike, not continuously increase. A continuously rising line indicates queuing — requests aren't completing fast enough.
Active users over time: compare this with response time. If response time spikes immediately when users arrive, you have a capacity problem. If it spikes with a delay, you have a buffering/queue problem.
Percentiles distribution: see what percentage of requests fall within each latency bucket.
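To make the tail-latency check concrete, here is a small plain-Scala sketch that computes nearest-rank percentiles from a list of response times and compares the tail to the median. It is independent of Gatling (which computes its own percentiles for the report); the `TailLatency` object and the sample values are hypothetical.

```scala
object TailLatency {
  // Nearest-rank percentile: the smallest sample such that at least p% of
  // all samples are less than or equal to it.
  def percentile(samplesMs: Seq[Long], p: Double): Long = {
    require(samplesMs.nonEmpty, "need at least one sample")
    require(p > 0 && p <= 100, "p must be in (0, 100]")
    val sorted = samplesMs.sorted
    val rank = math.ceil(p * sorted.size / 100.0).toInt
    sorted(rank - 1)
  }

  // How many times slower the tail percentile is than the median.
  def tailRatio(samplesMs: Seq[Long], tail: Double = 99): Double =
    percentile(samplesMs, tail).toDouble / percentile(samplesMs, 50)

  def main(args: Array[String]): Unit = {
    // Hypothetical run: 98 fast responses and 2 very slow ones.
    val samples = Seq.fill(98)(100L) ++ Seq(5000L, 5000L)
    println(s"p50=${percentile(samples, 50)} ms, p99=${percentile(samples, 99)} ms")
    println(s"p99/p50 ratio=${tailRatio(samples)}")
  }
}
```

In this made-up data set the median and p95 look healthy at 100 ms while p99 is 50 times higher, which is the pattern the statistics summary is meant to expose.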
Testing Auto-Scaling
If your infrastructure auto-scales (Kubernetes HPA, AWS ECS, etc.), spike tests validate the scaling response time:
- Start with baseline load
- Spike to 10x — watch for initial latency increase as new pods spin up
- Monitor how long until performance normalizes
- Verify the system handles load during the scaling window
The key question: is your scaling fast enough that users experience degradation for less than 30 seconds? Less than 5 minutes? Define the acceptable window before testing.
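Once the acceptable window is defined, it can be checked mechanically. The sketch below assumes you have exported a time series of p95 latency per interval (from your monitoring system or from Gatling's logs; the export step is not shown). The `Sample` type, `recoverySeconds` helper, and all numbers are hypothetical.

```scala
object RecoveryWindow {
  // One observation per interval: seconds since test start and the p95 latency in that interval.
  final case class Sample(second: Int, p95Ms: Long)

  // Seconds from spike start until p95 drops to the threshold and stays there
  // for the rest of the series. Some(0) means it never exceeded the threshold;
  // None means there is no data after the spike or it was still degraded at the end.
  def recoverySeconds(samples: Seq[Sample], spikeStart: Int, thresholdMs: Long): Option[Int] = {
    val afterSpike = samples.filter(_.second >= spikeStart).sortBy(_.second)
    if (afterSpike.isEmpty) None
    else {
      val lastBad = afterSpike.lastIndexWhere(_.p95Ms > thresholdMs)
      if (lastBad == -1) Some(0)
      else if (lastBad == afterSpike.size - 1) None
      else Some(afterSpike(lastBad + 1).second - spikeStart)
    }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical series: spike at t=20s, latency back under 500 ms from t=50s.
    val series = Seq(
      Sample(0, 200), Sample(10, 210), Sample(20, 3000), Sample(30, 2500),
      Sample(40, 900), Sample(50, 400), Sample(60, 380)
    )
    val recovered = recoverySeconds(series, spikeStart = 20, thresholdMs = 500)
    val acceptableWindowSeconds = 30
    println(s"recovery=$recovered withinWindow=${recovered.exists(_ <= acceptableWindowSeconds)}")
  }
}
```

Measuring from the last interval that violated the threshold, rather than the first one that met it, avoids declaring recovery during a brief dip while new instances are still coming online.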
CI/CD Integration
Run spike tests via Maven:
mvn gatling:test -Dgatling.simulationClass=SpikeTestSimulation
Or via the Gatling bundle:
$GATLING_HOME/bin/gatling.sh -s SpikeTestSimulation
In GitHub Actions:
- name: Run spike test
  run: mvn gatling:test -Dgatling.simulationClass=SpikeTestSimulation
- name: Upload Gatling report
  uses: actions/upload-artifact@v4
  if: always()
  with:
    name: gatling-report
    path: target/gatling/
Run spike tests on a schedule or before major releases. They are too slow and expensive to run on every commit.
Common Pitfalls
Testing in production without a kill switch: always have a way to abort the test immediately if something starts failing for real users.
Insufficient warm-up: if your system has no baseline load, JVM JIT and connection pools aren't initialized. Add a brief warm-up phase before the spike.
Ignoring the cooldown: recovery behavior is part of the test. Watch whether performance returns to baseline after the spike subsides, and how long it takes.
Not correlating with infrastructure metrics: Gatling shows client-side behavior. Combine it with server-side metrics (CPU, memory, DB connections) to identify the root cause of degradation.
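As a partial safety net for the kill-switch pitfall above, Gatling's `maxDuration` caps the total run time of a simulation. The sketch below reuses the endpoint from the earlier examples; the class name and the six-minute cap are assumptions for illustration. Note that this is a hard time limit, not an operator-triggered abort, so it complements a real kill switch rather than replacing one.

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

// Hypothetical simulation: a warm-up baseline, a spike, a cooldown, and a hard time cap.
class GuardedSpikeSimulation extends Simulation {
  val httpProtocol = http.baseUrl("https://api.example.com")

  val scn = scenario("Guarded Spike")
    .exec(http("Get Items").get("/items").check(status.is(200)))

  setUp(
    scn.inject(
      // Warm-up baseline so pools and JIT are initialized before the spike
      constantUsersPerSec(10).during(2.minutes),
      // The spike itself
      atOnceUsers(1000),
      // Cooldown, so recovery behavior is captured in the report
      constantUsersPerSec(10).during(2.minutes)
    ).protocols(httpProtocol)
  ).maxDuration(6.minutes) // stop the run after 6 minutes even if users are still active
}
```

The injection phases total four minutes, so the cap only takes effect if requests hang or queue far beyond what is expected.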
Conclusion
Spike testing with Gatling gives you precise control over traffic injection patterns and detailed reports for analysis. The combination of atOnceUsers for instantaneous spikes and Gatling's assertion framework makes it straightforward to validate auto-scaling behavior and define automated pass/fail criteria.
Pair spike test results with continuous monitoring. Tools like HelpMeTest verify that your application's features stay correct under and after load — functional and performance testing together provide complete production confidence.