Buildkite Testing: Parallel Test Runs and Test Analytics
Buildkite takes a different approach to CI/CD. Rather than running builds on Buildkite's infrastructure, you run Buildkite Agents on your own machines or cloud. Buildkite coordinates the work; your infrastructure executes it. This gives you full control over the build environment while keeping the pipeline management in Buildkite's cloud.
The other major feature that sets Buildkite apart is Test Analytics, a built-in dashboard that tracks test performance over time, surfaces flaky tests, and shows which tests are slowest. For teams with large test suites, that visibility shows where to spend effort: which tests to stabilize and which to speed up or split.
How Buildkite Works
Buildkite Agents are lightweight daemons that run on your machines. You install agents on your servers, cloud VMs, or Kubernetes pods. Agents poll Buildkite for work, execute pipeline steps, and report results back.
Pipelines are defined in .buildkite/pipeline.yml. They can also be dynamically generated using scripts — a powerful feature for generating parallelized test configurations programmatically.
Plugins extend pipeline steps with pre-built functionality, like Docker integration, test uploading, and caching.
Installing Buildkite Agents
On Ubuntu/Debian:
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://keys.openpgp.org/vks/v1/by-fingerprint/32A37959C2FA5C3C99EFBC32A79206BE8F8979C9 \
  | sudo gpg --dearmor -o /etc/apt/keyrings/buildkite-agent.gpg
# signed-by tells apt to verify this repository with the key saved above
echo "deb [signed-by=/etc/apt/keyrings/buildkite-agent.gpg] https://apt.buildkite.com/buildkite-agent stable main" \
  | sudo tee /etc/apt/sources.list.d/buildkite-agent.list
sudo apt-get update && sudo apt-get install buildkite-agent
# Configure with your agent token
sudo sed -i "s/xxx/$BUILDKITE_AGENT_TOKEN/g" /etc/buildkite-agent/buildkite-agent.cfg
sudo systemctl enable buildkite-agent && sudo systemctl start buildkite-agent

On macOS:

brew install buildkite/buildkite/buildkite-agent
buildkite-agent start

With Docker:

docker run -d \
  -e BUILDKITE_AGENT_TOKEN="$BUILDKITE_AGENT_TOKEN" \
  buildkite/agent

Basic Pipeline Configuration
# .buildkite/pipeline.yml
steps:
  - label: ":jest: Unit Tests"
    command: |
      npm ci
      npm test
    agents:
      queue: default
  - label: ":python: Python Tests"
    command: |
      pip install -r requirements.txt
      pytest --junitxml=junit.xml
    agents:
      queue: default

Pipeline files live in .buildkite/pipeline.yml. A build usually starts with a step that runs buildkite-agent pipeline upload to read that file from the repository; steps can also be defined through the API or in the Buildkite UI.
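As a rough sketch of what such an upload step can look like, the function below finds the pipeline file and hands it to `buildkite-agent pipeline upload`. The function name and the `AGENT_BIN` override are hypothetical and exist only so the sketch can be tried without an agent installed; they are not Buildkite settings.

```shell
#!/bin/sh
# Sketch of a first build step: locate the pipeline file and pass it to
# `buildkite-agent pipeline upload`.
# AGENT_BIN is a hypothetical override (not a Buildkite setting) that lets
# the function be tried without an agent installed.
# The body uses ( ... ) so its variables stay local to the call.
upload_pipeline() (
  agent="${AGENT_BIN:-buildkite-agent}"
  for candidate in .buildkite/pipeline.yml .buildkite/pipeline.yaml; do
    if [ -f "$candidate" ]; then
      "$agent" pipeline upload "$candidate"
      exit  # keep the agent's exit status
    fi
  done
  echo "no pipeline file found under .buildkite/" >&2
  exit 1
)
```

Running `AGENT_BIN=echo upload_pipeline` from a repository root prints the command that would be executed instead of contacting Buildkite.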
Running Tests with Docker
Using the Docker plugin for isolated environments:
steps:
  - label: "Unit Tests"
    plugins:
      - docker#v5.10.0:
          image: node:20
          command: ["sh", "-c", "npm ci && npm test"]
          environment:
            - NODE_ENV=test

Docker Compose plugin for services:
steps:
  - label: "Integration Tests"
    plugins:
      - docker-compose#v5.1.0:
          run: app
          config: docker-compose.test.yml

# docker-compose.test.yml
version: '3.8'
services:
  app:
    build: .
    command: npm run test:integration
    environment:
      DATABASE_URL: postgresql://test:test@postgres/testdb
    depends_on:
      - postgres
  postgres:
    image: postgres:15
    environment:
      POSTGRES_USER: test
      POSTGRES_PASSWORD: test
      POSTGRES_DB: testdb

Parallel Test Execution
Buildkite's parallel feature is clean and explicit:
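steps:
  - label: "Tests (shard %n)"
    command: |
      npm ci
      # Jest shards are 1-based, so add 1 to the 0-based job index.
      # Doubled dollar signs defer expansion to the job; a single one is
      # interpolated when the pipeline is uploaded.
      npx jest --shard=$$((BUILDKITE_PARALLEL_JOB_INDEX + 1))/$$BUILDKITE_PARALLEL_JOB_COUNT
    parallelism: 4

Setting parallelism: 4 creates 4 jobs from the one step, which run simultaneously when enough agents are free. Buildkite provides each job with:
- $BUILDKITE_PARALLEL_JOB_INDEX: 0-based index of the current job (0, 1, 2, 3)
- $BUILDKITE_PARALLEL_JOB_COUNT: total number of parallel jobs (4)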
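The two variables can also be read from a script that the step calls, which keeps the index arithmetic out of the pipeline YAML. The sketch below is one way to do that, with hypothetical function names: `shard_arg` builds the 1-based `index/count` string, and `select_shard_files` deals a list of test files out round-robin for runners that have no shard flag of their own. Both fall back to a single shard when the variables are unset, so the script also runs locally.

```shell
#!/bin/sh
# Sketch: turn Buildkite's parallel-job variables into shard selections.
# Function bodies use ( ... ) so their variables stay local to the call.

# Print "<1-based index>/<count>", the form `jest --shard` and
# `playwright test --shard` accept. Defaults to 1/1 outside Buildkite.
shard_arg() (
  index="${BUILDKITE_PARALLEL_JOB_INDEX:-0}"  # 0-based, set by Buildkite
  count="${BUILDKITE_PARALLEL_JOB_COUNT:-1}"
  echo "$((index + 1))/${count}"
)

# Read file paths on stdin and print the ones assigned to this job.
# Paths are sorted so every job sees the same order, then dealt out
# round-robin: job 0 takes lines 1, 1+count, ...; job 1 takes 2, 2+count, ...
select_shard_files() (
  index="${BUILDKITE_PARALLEL_JOB_INDEX:-0}"
  count="${BUILDKITE_PARALLEL_JOB_COUNT:-1}"
  LC_ALL=C sort | awk -v i="$index" -v n="$count" '(NR - 1) % n == i'
)
```

A step could then run `npx jest --shard="$(shard_arg)"`, or pass the output of `find tests -name 'test_*.py' | select_shard_files` to pytest. Round-robin balances file counts, not run times, so shards can still finish unevenly.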
For Playwright:
steps:
  - label: "E2E Tests (shard %n)"
    command: |
      npm ci
      npx playwright install --with-deps
      # Playwright shards are 1-based; doubled dollar signs defer expansion to the job
      npx playwright test --shard=$$((BUILDKITE_PARALLEL_JOB_INDEX + 1))/$$BUILDKITE_PARALLEL_JOB_COUNT
    parallelism: 4

Dynamic parallelism with pipeline upload
For truly dynamic test splitting, generate the pipeline at runtime:
#!/bin/bash
# scripts/generate-pipeline.sh
# Count test files (ignoring dependencies) and generate one step per shard
TEST_COUNT=$(find . -name "*.test.js" -not -path "./node_modules/*" | wc -l)
SHARD_COUNT=$(( (TEST_COUNT + 9) / 10 ))  # 10 test files per shard
if [ "$SHARD_COUNT" -lt 1 ]; then SHARD_COUNT=1; fi  # always emit at least one step
cat <<EOF
steps:
EOF
for i in $(seq 0 $((SHARD_COUNT - 1))); do
cat <<EOF
  - label: "Tests shard $i"
    command: npx jest --shard=$((i + 1))/$SHARD_COUNT
EOF
done

In your pipeline:
steps:
  - label: "Generate pipeline"
    command: "bash scripts/generate-pipeline.sh | buildkite-agent pipeline upload"

Buildkite Test Analytics
Test Analytics is Buildkite's built-in service for tracking test performance. It ingests test results and provides:
- Historical pass/fail rates per test
- Execution time trends
- Flaky test detection
- Slowest tests by duration
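To make "flaky test detection" concrete, the sketch below applies one common heuristic: a test is flagged when it has both passed and failed on the same commit. This illustrates the idea and is not a description of Buildkite's internal algorithm. The input format (`<commit> <test name> <passed|failed>`, one result per line, no spaces in test names) and the function name are hypothetical.

```shell
#!/bin/sh
# Sketch: flag tests that both passed and failed on the same commit.
# Input lines: "<commit> <test name> <passed|failed>" (hypothetical format).
find_flaky_tests() (
  awk '
    $3 == "passed" { passed[$1 " " $2] = 1 }
    $3 == "failed" { failed[$1 " " $2] = 1 }
    END {
      for (key in passed) {
        if (key in failed) {
          split(key, parts, " ")
          print parts[2]   # the test name
        }
      }
    }
  ' | LC_ALL=C sort -u
)
```

A test that fails on one commit and passes on a later one is not flagged, since the code changed between the two runs.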
Setting up Test Analytics
- Go to Buildkite → Test Analytics → Create a test suite
- Get your API token
- Upload test results from your pipeline
Upload JUnit XML results:
steps:
  - label: "Tests"
    command: |
      npm ci
      npx jest --ci --reporters=default --reporters=jest-junit
    env:
      # jest-junit writes ./junit.xml by default; send it where the plugin looks
      JEST_JUNIT_OUTPUT_DIR: test-results
    plugins:
      - test-collector#v1.10.2:
          files: "test-results/*.xml"
          format: "junit"
          api-token-env-var: BUILDKITE_ANALYTICS_TOKEN

The plugin reads your JUnit XML and sends the results to Buildkite Test Analytics.
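Because the upload depends on the report landing where the plugin looks, a quick check at the end of the test command can catch a missing or empty file early. The helper below is a hypothetical sketch that counts `<testcase>` and `<failure>` elements by text matching. It is a sanity check, not an XML parser.

```shell
#!/bin/sh
# Sketch: summarise a JUnit XML report with plain text matching.
# Fails when the report is missing or empty, which usually means the
# reporter wrote somewhere other than the path the plugin is given.
junit_summary() (
  report="$1"
  if [ ! -s "$report" ]; then
    echo "missing or empty report: $report" >&2
    exit 1
  fi
  # grep -o prints one line per match, so wc -l counts occurrences
  tests="$(grep -o '<testcase[ >]' "$report" | wc -l | tr -d ' ')"
  failed="$(grep -o '<failure[ >/]' "$report" | wc -l | tr -d ' ')"
  echo "tests=${tests} failed=${failed}"
)
```

For example, `junit_summary test-results/junit.xml` prints a line such as `tests=2 failed=1`, or exits non-zero when the file is absent.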
Ruby (RSpec)
steps:
  - label: "RSpec"
    command: |
      bundle install
      bundle exec rspec --format progress --format RspecJunitFormatter --out test-results/rspec.xml
    plugins:
      - test-collector#v1.10.2:
          files: "test-results/rspec.xml"
          format: "junit"
          api-token-env-var: BUILDKITE_ANALYTICS_TOKEN

Python (pytest)
steps:
  - label: "pytest"
    command: |
      pip install -r requirements.txt
      pytest --junitxml=test-results/junit.xml
    plugins:
      - test-collector#v1.10.2:
          files: "test-results/junit.xml"
          format: "junit"
          api-token-env-var: BUILDKITE_ANALYTICS_TOKEN

Go
steps:
  - label: "Go tests"
    command: |
      mkdir -p test-results
      # -set-exit-code makes the pipe fail when tests fail
      go test ./... -v 2>&1 | go-junit-report -set-exit-code > test-results/junit.xml
    plugins:
      - test-collector#v1.10.2:
          files: "test-results/junit.xml"
          format: "junit"
          api-token-env-var: BUILDKITE_ANALYTICS_TOKEN

Caching with the Cache Plugin
steps:
  - label: "Tests"
    plugins:
      - gencer/cache#v2.4.9:
          backend: s3
          bucket: your-buildkite-cache-bucket
          region: us-east-1
          key: "v1-npm-{{ checksum 'package-lock.json' }}"
          paths:
            - node_modules/
    command: |
      # npm ci deletes node_modules before installing, so skip it on a cache hit
      [ -d node_modules ] || npm ci
      npm test

Or using the file system on self-hosted agents:
steps:
  - label: "Tests"
    command: |
      # npm ci deletes node_modules before installing, so copying that
      # directory in beforehand saves nothing. Keep npm's download cache
      # on the agent's disk instead.
      npm ci --cache /cache/npm --prefer-offline
      npm test

Environment Variables and Secrets
Set environment variables at the pipeline level in Buildkite settings or pass them to steps:
steps:
  - label: "Integration Tests"
    command: npm run test:integration
    env:
      NODE_ENV: test
      LOG_LEVEL: error

For secrets, use the Buildkite Elastic CI Stack's secret store or configure environment hooks:
# ~/.buildkite/hooks/environment
export DATABASE_URL="$(aws secretsmanager get-secret-value \
  --secret-id prod/database/url \
  --query SecretString \
  --output text)"

Pipeline Step Dependencies
steps:
  - label: "Unit Tests"
    key: unit-tests
    command: npm run test:unit
  - label: "Build"
    key: build
    command: npm run build
    depends_on:
      - unit-tests  # only run after unit tests pass
  - label: "E2E Tests"
    depends_on:
      - build
    command: npx playwright test
  - wait: ~  # wait for all steps above to finish
  - label: "Deploy"
    command: npm run deploy
    branches: main

Soft Fails
Allow a step to fail without failing the entire build:
steps:
  - label: "Flaky E2E Tests"
    command: npx playwright test
    soft_fail: true  # build continues even if this fails
  - wait  # without a wait, Deploy would start in parallel with the tests
  - label: "Deploy"
    command: npm run deploy

Or soft-fail on specific exit codes:
steps:
  - label: "Linting"
    command: npm run lint
    soft_fail:
      - exit_status: 1

Agent Tags for Specialized Hardware
Route steps to specific agent types:
steps:
  - label: "Mac E2E Tests"
    command: npx playwright test --project=safari
    agents:
      os: macOS
  - label: "GPU Tests"
    command: python test_ml_model.py
    agents:
      gpu: "true"
  - label: "High Memory Tests"
    command: npm run test:large-dataset
    agents:
      memory: high

Configure agent tags when starting the agent:

buildkite-agent start --tags "os=macOS,memory=high"

Buildkite Secrets with AWS
The official Buildkite Elastic CI Stack uses AWS S3 for secret storage:
steps:
  - label: "Tests"
    command: |
      # Secrets are automatically available via the environment hook
      npm run test:integration

The environment hook fetches secrets from S3 and exports them before each step runs.
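The export half of such a hook can be sketched as below, assuming the secrets arrive as a file of `KEY=VALUE` lines. The function name and file layout are hypothetical, and values are taken literally rather than evaluated as shell, which is a deliberate simplification compared with sourcing the file.

```shell
#!/bin/sh
# Sketch: export every KEY=VALUE line read from stdin.
# Blank lines, comments, and lines without a valid variable name are skipped.
# Call it with a redirect (export_env_lines < file), not a pipe: a pipe would
# run the loop in a subshell and the exports would be lost.
export_env_lines() {
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      ''|'#'*) continue ;;   # blank line or comment
      *=*) ;;                # looks like an assignment
      *) continue ;;
    esac
    key="${line%%=*}"        # text before the first "="
    value="${line#*=}"       # everything after it, "=" signs included
    case "$key" in
      ''|[0-9]*|*[!A-Za-z0-9_]*) continue ;;  # not a valid variable name
    esac
    export "$key=$value"
  done
}
```

In a hook this could follow a download such as `aws s3 cp "s3://<bucket>/<pipeline>/env" "$tmpfile"`, with the bucket and path left as placeholders, and then `export_env_lines < "$tmpfile"`.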
Conclusion
Buildkite's agent architecture means your builds run on infrastructure you control — your network, your hardware, your security boundaries. The pipeline YAML is clean and the parallel test execution with built-in sharding variables makes distributing tests across agents straightforward.
Test Analytics is the standout feature for larger teams. Getting visibility into which tests are flaky, which are slow, and how test health trends over time is valuable for maintaining a reliable test suite.
Start with a basic pipeline, add the Docker plugin for environment isolation, configure parallelism for your largest test suites, and set up Test Analytics to gain visibility. The agent model requires more upfront infrastructure setup than fully-hosted CI, but the control and performance benefits pay off at scale.