Playwright Parallel Test Execution: Sharding, Workers, and CI Speed

A test suite that takes 45 minutes to run provides almost no value — developers stop waiting for it and ship without knowing whether tests pass. Playwright's parallel execution model can reduce that 45 minutes to under 10 by running tests concurrently across CPU cores and, with sharding, across multiple machines.

How Playwright Parallelism Works

Playwright parallelizes at two levels:

Worker-level parallelism: Multiple worker processes run tests simultaneously on a single machine. Each worker launches its own browser, and every test gets a fresh browser context, so tests are isolated from one another. By default, Playwright uses half the logical CPU cores.

Shard-level parallelism: You split the entire test suite into N shards and run each shard on a separate CI machine. Each shard runs its tests with worker-level parallelism, multiplying the speedup further.

Configuring Workers

Control the number of parallel workers in your Playwright config:

// playwright.config.ts
import { defineConfig } from '@playwright/test';
import os from 'os';

export default defineConfig({
  // Use all available CPU cores
  workers: os.cpus().length,
  
  // Or set a fixed number
  // workers: 4,
  
  // Or scale based on environment
  // workers: process.env.CI ? 2 : undefined, // undefined = default (half cores)
});

For local development, the default (half of the available cores) keeps your machine usable while tests run. In CI, where the machine is dedicated to tests, you can use the full core count as long as the runner has enough memory for that many browsers.

Run with a specific worker count from the CLI:

npx playwright test --workers=4
npx playwright test --workers=100%  # All available cores
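
The percentage form is resolved against the machine's logical core count. The following is a minimal sketch of that arithmetic, assuming the result is rounded down and never drops below one worker. The helper name `resolveWorkerCount` is hypothetical and is not part of Playwright's API.

```typescript
// Hypothetical helper: turns a `workers` setting (number or "NN%") into a count.
// Assumption: percentages round down and the result is always at least 1.
export function resolveWorkerCount(setting: number | string, cpuCount: number): number {
  if (typeof setting === 'number') return Math.max(1, Math.floor(setting));
  const match = /^(\d+)%$/.exec(setting.trim());
  if (!match) throw new Error(`Unsupported workers value: ${setting}`);
  const percent = Number(match[1]);
  return Math.max(1, Math.floor((cpuCount * percent) / 100));
}

console.log(resolveWorkerCount('50%', 8));  // 4
console.log(resolveWorkerCount('100%', 8)); // 8
console.log(resolveWorkerCount(4, 8));      // 4
```

On an 8-core runner, `--workers=50%` and `--workers=4` are therefore equivalent under this model.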

Understanding Test Isolation Modes

Playwright offers three execution modes, configured globally in the config or per file and describe block:

// File-level parallelism (default) — tests within a file run sequentially,
// files run in parallel across workers
// playwright.config.ts: fullyParallel: false (default)

// Full parallelism: tests within a file can also run in parallel, in any available worker
export default defineConfig({
  fullyParallel: true,
});

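// Serial: tests in a file or describe block run in order in one worker;
// if one fails, the rest are skipped and the group is retried together
// test.describe.configure({ mode: 'serial' });
// To debug with a single worker for the whole run, use --workers=1 instead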

File-level parallelism (default) is safest. Tests within a file run in order in the same worker process, which avoids conflicts when tests rely on state left behind by earlier ones. Each test still gets its own browser context.

Full parallelism maximizes speed but requires every test to be completely independent — no shared mutable state, no test order dependencies.
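
To see why the mode matters, it helps to model the run as a queue of units handed to idle workers. The sketch below is a simplified model and not Playwright's real scheduler; the type and function names and the file names are made up for illustration. A unit is a whole file in the default mode and a single test with fullyParallel.

```typescript
// Simplified model, not Playwright's real scheduler: whichever worker frees up
// first takes the next unit from the queue.
type TestFile = { name: string; testSeconds: number[] };

function wallClockSeconds(unitSeconds: number[], workers: number): number {
  const busyUntil: number[] = [];
  for (let i = 0; i < workers; i++) busyUntil.push(0);
  for (const duration of unitSeconds) {
    const next = busyUntil.indexOf(Math.min(...busyUntil));
    busyUntil[next] += duration;
  }
  return Math.max(...busyUntil);
}

// Default mode: all tests in a file run back to back in one worker.
export function fileLevelSeconds(files: TestFile[], workers: number): number {
  const units = files.map((f) => f.testSeconds.reduce((a, b) => a + b, 0));
  return wallClockSeconds(units, workers);
}

// fullyParallel: every test is its own schedulable unit.
export function fullyParallelSeconds(files: TestFile[], workers: number): number {
  const units: number[] = [];
  for (const f of files) units.push(...f.testSeconds);
  return wallClockSeconds(units, workers);
}

const suite: TestFile[] = [
  { name: 'checkout.spec.ts', testSeconds: [30, 30, 30, 30] },
  { name: 'login.spec.ts', testSeconds: [30] },
];
console.log(fileLevelSeconds(suite, 4));     // 120
console.log(fullyParallelSeconds(suite, 4)); // 60
```

With only two files, the default mode leaves two of the four workers idle, which is the situation where fullyParallel pays off most.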

Making Tests Parallelism-Safe

Tests that share state will fail unpredictably in parallel. Common issues to fix:

Database/storage conflicts: Two tests writing to the same record simultaneously cause race conditions. Use unique identifiers per test:

import { test } from '@playwright/test';
import { v4 as uuidv4 } from 'uuid';

test('creates a unique project', async ({ page }) => {
  const projectName = `Test Project ${uuidv4()}`;  // Unique per test run
  await page.goto('/projects/new');
  await page.fill('[data-testid="project-name"]', projectName);
  await page.click('[data-testid="create"]');
  await page.waitForURL(/\/projects\/[a-z0-9-]+/);
  // Cleanup — delete after test to avoid polluting state
});

Shared fixture state: Never mutate shared objects in fixtures. Each test should get a fresh copy:

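import { randomUUID } from 'node:crypto';

// BAD: one object shared (and mutated) by every test in the worker
const sharedUser = { id: 'user-1', preferences: {} };

// GOOD: fresh state per test via a factory; a UUID avoids the collisions
// Date.now() produces when two tests start in the same millisecond
const createTestUser = () => ({ id: `user-${randomUUID()}`, preferences: {} });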
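
A factory only helps if the IDs it generates are actually unique. The following is a minimal sketch comparing a timestamp-based ID with a UUID-based one; `timestampUser` and `countDistinctIds` are illustrative names, and `randomUUID` comes from Node's built-in crypto module.

```typescript
import { randomUUID } from 'node:crypto';

type TestUser = { id: string; preferences: Record<string, unknown> };

// Timestamp-based IDs: calls within the same millisecond get the same value.
export const timestampUser = (): TestUser => ({ id: `user-${Date.now()}`, preferences: {} });

// UUID-based IDs: unique across workers and machines for practical purposes.
export const createTestUser = (): TestUser => ({ id: `user-${randomUUID()}`, preferences: {} });

// Calls the factory repeatedly and counts how many different IDs came back.
export function countDistinctIds(factory: () => TestUser, howMany: number): number {
  const ids = new Set<string>();
  for (let i = 0; i < howMany; i++) ids.add(factory().id);
  return ids.size;
}

console.log(countDistinctIds(timestampUser, 1000));  // usually far fewer than 1000
console.log(countDistinctIds(createTestUser, 1000)); // 1000
```

Parallel workers hit the same problem as the tight loop here: several tests can ask for an ID within the same millisecond.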

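Port conflicts: Any server that a test starts itself needs a unique port. The application under test is different: start it once through the webServer option and let every worker share it: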

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  webServer: {
    command: 'PORT=3000 npm run start',
    port: 3000,
    reuseExistingServer: !process.env.CI,
  },
});

Playwright starts this server once before the run, and every worker connects to it. That is fine: browsers hit the same server concurrently, as real users do.
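
If a suite does need one server per worker (for example, a mock backend with per-test state), the port can be derived from the worker's parallel index. The sketch below assumes the code runs inside a Playwright worker process, where the `TEST_PARALLEL_INDEX` environment variable is set; `portForWorker` and the base port 4000 are hypothetical.

```typescript
// Hypothetical helper: derive a per-worker port from a base port.
// Assumption: TEST_PARALLEL_INDEX is set inside Playwright worker processes
// and ranges from 0 to workers - 1. Outside a worker it falls back to 0.
export function portForWorker(
  basePort: number,
  env: Record<string, string | undefined>,
): number {
  const raw = env.TEST_PARALLEL_INDEX ?? '0';
  const index = parseInt(raw, 10);
  if (isNaN(index) || index < 0) {
    throw new Error(`Unexpected TEST_PARALLEL_INDEX: ${raw}`);
  }
  return basePort + index;
}

console.log(portForWorker(4000, { TEST_PARALLEL_INDEX: '2' })); // 4002
console.log(portForWorker(4000, {}));                           // 4000
```

In a test or fixture this would be called as `portForWorker(4000, process.env)`; the same index is also available as `testInfo.parallelIndex`.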

Sharding Across CI Machines

Sharding splits your test suite across multiple CI runners, each running a subset:

# Run shard 1 of 4
npx playwright test --shard=1/4

# Run shard 2 of 4 (on a different machine simultaneously)
npx playwright test --shard=2/4

# Run shard 3 of 4
npx playwright test --shard=3/4

# Run shard 4 of 4
npx playwright test --shard=4/4

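Playwright gives each shard a roughly equal share of the suite. Without fullyParallel the split is made at file level, so a few files with many tests can leave shards uneven; with fullyParallel it is made at test level. Each shard runs independently and writes its own report, which is why the CI setups in this article save a blob report per shard and merge them afterwards.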
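
As a mental model, sharding cuts an ordered list into N contiguous slices whose sizes differ by at most one. The sketch below illustrates that idea only; it is not Playwright's actual implementation, and `shardSlice` and the file names are hypothetical.

```typescript
// Simplified model of sharding: contiguous slices of near-equal size.
// shardIndex is 1-based, matching the --shard=INDEX/TOTAL flag.
export function shardSlice<T>(items: T[], shardIndex: number, shardTotal: number): T[] {
  if (shardIndex < 1 || shardIndex > shardTotal) {
    throw new Error(`Shard index must be between 1 and ${shardTotal}`);
  }
  const base = Math.floor(items.length / shardTotal);
  const extra = items.length % shardTotal; // the first `extra` shards get one more item
  const zeroBased = shardIndex - 1;
  const start = zeroBased * base + Math.min(zeroBased, extra);
  const size = base + (zeroBased < extra ? 1 : 0);
  return items.slice(start, start + size);
}

const specFiles = ['a.spec.ts', 'b.spec.ts', 'c.spec.ts', 'd.spec.ts', 'e.spec.ts', 'f.spec.ts'];
console.log(shardSlice(specFiles, 1, 4)); // [ 'a.spec.ts', 'b.spec.ts' ]
console.log(shardSlice(specFiles, 4, 4)); // [ 'f.spec.ts' ]
```

Note that the split counts items, not seconds: a shard that happens to receive the slowest files will finish last.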

GitHub Actions Sharding Setup

# .github/workflows/playwright.yml
name: Playwright Tests

on: [push, pull_request]

jobs:
  test:
    name: Shard ${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shardIndex: [1, 2, 3, 4]
        shardTotal: [4]
    
    steps:
      - uses: actions/checkout@v4
      
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      
      - name: Install dependencies
        run: npm ci
      
      - name: Install Playwright browsers
        run: npx playwright install --with-deps chromium
      
      - name: Run Playwright tests (shard ${{ matrix.shardIndex }}/${{ matrix.shardTotal }})
        run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
      
      - name: Upload blob report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: blob-report-${{ matrix.shardIndex }}
          path: blob-report
          retention-days: 1

  merge-reports:
    name: Merge Reports
    needs: test
    if: always()
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      
      - name: Download all blob reports
        uses: actions/download-artifact@v4
        with:
          pattern: blob-report-*
          merge-multiple: true
          path: all-blob-reports
      
      - name: Merge reports
        run: npx playwright merge-reports --reporter=html ./all-blob-reports
      
      - name: Upload merged HTML report
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report
          retention-days: 30

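With 4 shards on GitHub Actions, the shards run in parallel, so a 40-minute suite takes roughly 10 minutes of wall-clock time if the shards are evenly balanced, plus the per-job setup cost (checkout, npm ci, browser install) and the short merge job.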

GitLab CI Sharding

# .gitlab-ci.yml
stages:
  - test
  - report

.playwright-base:
  image: mcr.microsoft.com/playwright:v1.44.0-jammy
  before_script:
    - npm ci

playwright-shard-1:
  extends: .playwright-base
  stage: test
  script:
    - npx playwright test --shard=1/4
  artifacts:
    when: always
    paths:
      - blob-report/

playwright-shard-2:
  extends: .playwright-base
  stage: test
  script:
    - npx playwright test --shard=2/4
  artifacts:
    when: always
    paths:
      - blob-report/

playwright-shard-3:
  extends: .playwright-base
  stage: test
  script:
    - npx playwright test --shard=3/4
  artifacts:
    when: always
    paths:
      - blob-report/

playwright-shard-4:
  extends: .playwright-base
  stage: test
  script:
    - npx playwright test --shard=4/4
  artifacts:
    when: always
    paths:
      - blob-report/

merge-reports:
  stage: report
  extends: .playwright-base
  needs: [playwright-shard-1, playwright-shard-2, playwright-shard-3, playwright-shard-4]
  when: always  # merge the reports even if a shard failed
  script:
    - npx playwright merge-reports --reporter=html ./blob-report
  artifacts:
    when: always
    paths:
      - playwright-report/

Blob Reporter for Shard Merging

When sharding, each shard produces partial results. The blob reporter saves raw test data that can be merged into a unified report:

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: process.env.CI
    ? [['blob', { outputDir: 'blob-report' }]]
    : [['html']],
});

After all shards complete, merge the blob reports:

npx playwright merge-reports --reporter=html ./all-blob-reports

This produces one unified HTML report covering all shards, with correct pass/fail counts and the full test list.

Measuring and Tuning Parallelism

Use Playwright's built-in timing to identify bottlenecks:

# Generate HTML report with timing data
npx playwright test --reporter=html

# Then open playwright-report/index.html
# Sort by duration to find the slowest tests

Look for:

  • Long-running outlier tests: A single 3-minute test sets a floor on the run, because no number of workers or shards can finish faster than the slowest test. Investigate why it is slow and break it into smaller tests.
  • Unbalanced shards: One shard takes twice as long as the others. Playwright doesn't consider test duration when distributing tests across shards, so move heavy test files manually to balance the load.
  • Worker idle time: Workers wait on shared resources (database writes, file locks). Give each test its own data, for example with unique identifiers, to eliminate contention.
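
For unbalanced shards, per-file durations from the report can be turned into a rough plan with a greedy "longest first" heuristic. This is a planning aid under stated assumptions, not something Playwright does for you; `balanceShards`, the file names, and the timings are hypothetical.

```typescript
// Greedy "longest first" balancing: place the slowest files first, each into
// the shard that currently has the least total time.
type FileTiming = { file: string; seconds: number };
type ShardPlan = { files: string[]; totalSeconds: number };

export function balanceShards(timings: FileTiming[], shardTotal: number): ShardPlan[] {
  const shards: ShardPlan[] = [];
  for (let i = 0; i < shardTotal; i++) shards.push({ files: [], totalSeconds: 0 });
  const slowestFirst = timings.slice().sort((a, b) => b.seconds - a.seconds);
  for (const timing of slowestFirst) {
    let lightest = shards[0];
    for (const shard of shards) {
      if (shard.totalSeconds < lightest.totalSeconds) lightest = shard;
    }
    lightest.files.push(timing.file);
    lightest.totalSeconds += timing.seconds;
  }
  return shards;
}

const plan = balanceShards(
  [
    { file: 'checkout.spec.ts', seconds: 180 },
    { file: 'search.spec.ts', seconds: 90 },
    { file: 'profile.spec.ts', seconds: 60 },
    { file: 'login.spec.ts', seconds: 30 },
  ],
  2,
);
console.log(plan.map((shard) => shard.totalSeconds)); // [ 180, 180 ]
```

Each resulting file list could then be passed as file arguments to `npx playwright test` in its own CI job, in place of the `--shard` flag.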

Retries and Flaky Test Handling

Parallel execution amplifies flaky tests — more concurrent tests means more opportunities for timing issues to surface. Configure retries for CI:

import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: process.env.CI ? 2 : 0, // Retry twice in CI, never locally
  workers: process.env.CI ? 4 : undefined,
});

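Playwright has no dedicated flaky-test reporter, but the JSON reporter records which tests passed only after a retry. Add it alongside the HTML reporter in your config: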

reporter: [
  ['html'],
  ['json', { outputFile: 'test-results/results.json' }],
],

Parse results.json to find tests with status: 'flaky' and prioritize fixing them — flaky tests undermine the reliability you're building parallelism to protect.
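
A minimal sketch of that parsing step follows. The types describe only an assumed subset of the JSON reporter's output (nested suites holding specs, each spec holding one test entry per project), so check them against a real results.json before relying on them; `findFlakyTests` and the sample data are hypothetical.

```typescript
// Assumed subset of the JSON report shape; verify against your own results.json.
type JsonTest = { status: 'expected' | 'unexpected' | 'flaky' | 'skipped' };
type JsonSpec = { title: string; file?: string; tests: JsonTest[] };
type JsonSuite = { title: string; specs?: JsonSpec[]; suites?: JsonSuite[] };
type JsonReport = { suites: JsonSuite[] };

// Walks the suite tree and returns "file > test title" for every flaky spec.
export function findFlakyTests(report: JsonReport): string[] {
  const flaky: string[] = [];
  const visit = (suite: JsonSuite): void => {
    for (const spec of suite.specs ?? []) {
      if (spec.tests.some((t) => t.status === 'flaky')) {
        flaky.push(`${spec.file ?? suite.title} > ${spec.title}`);
      }
    }
    for (const child of suite.suites ?? []) visit(child);
  };
  for (const suite of report.suites) visit(suite);
  return flaky;
}

// In practice the report would come from
// JSON.parse(fs.readFileSync('test-results/results.json', 'utf8')).
const sampleReport: JsonReport = {
  suites: [
    {
      title: 'cart.spec.ts',
      specs: [{ title: 'adds item', file: 'cart.spec.ts', tests: [{ status: 'expected' }] }],
      suites: [
        {
          title: 'coupons',
          specs: [{ title: 'applies coupon', file: 'cart.spec.ts', tests: [{ status: 'flaky' }] }],
        },
      ],
    },
  ],
};
console.log(findFlakyTests(sampleReport)); // [ 'cart.spec.ts > applies coupon' ]
```

Running this after every CI build and tracking the output over time shows which tests retry most often.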

Expected Speedups

Test count | Single worker | 4 workers (1 machine) | 4 shards × 4 workers
50 tests   | 25 min        | 7 min                 | 2 min
200 tests  | 100 min       | 27 min                | 7 min
500 tests  | 250 min       | 65 min                | 17 min

These are rough estimates assuming 30 seconds per test average. Real speedups depend on test isolation quality — poorly isolated tests that can't run in parallel won't benefit from more workers.
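
The arithmetic behind such estimates can be sketched as an idealized lower bound: equal-length tests, perfect balance, and no startup, retry, or report-merging overhead. The function name `estimateMinutes` is hypothetical, and real runs land somewhat above these numbers.

```typescript
// Idealized lower bound on wall-clock time, in minutes.
// Assumes every test takes the same time and work is spread perfectly.
export function estimateMinutes(
  testCount: number,
  secondsPerTest: number,
  workers: number,
  shards: number,
): number {
  const testsPerShard = Math.ceil(testCount / shards);
  const testsPerWorker = Math.ceil(testsPerShard / workers);
  return (testsPerWorker * secondsPerTest) / 60;
}

console.log(estimateMinutes(200, 30, 1, 1)); // 100
console.log(estimateMinutes(200, 30, 4, 1)); // 25
console.log(estimateMinutes(200, 30, 4, 4)); // 6.5
```

The gap between these ideal values and a measured run is the overhead worth investigating: setup time, uneven shards, or tests that cannot run concurrently.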

Playwright's parallel execution model scales from a single developer machine to dozens of CI runners. Start with worker-level parallelism (it requires zero configuration), fix isolation issues as they surface, then add sharding when the suite grows beyond what one machine can handle in your target time window.
