Accessibility Testing in CI: Automating a11y Checks in Your Pipeline

Catching accessibility regressions in CI is straightforward once you pick the right tools and decide what to block on. The hard part is not the tooling but the enforcement strategy: block on any violation, block on regressions from a baseline, or track and report without blocking.

This guide covers the tools, the configuration, and the tradeoffs for each approach.

Why CI Accessibility Testing

Manual accessibility testing is slow and happens late. By the time a screen reader tester catches a missing label on a new form, that form has been reviewed, merged, deployed, and is in front of users. CI accessibility checks move the catch point to where it's cheapest to fix: the pull request.

Automated CI checks catch roughly 30–40% of WCAG issues. That's not everything, but it's the consistent, repeatable, automatable subset. The goal is to own that 30–40% without manual effort, freeing human testers to focus on the remaining 60–70% that requires judgment.

Tool Selection

Three main tools fit CI well:

axe-core

Most flexible. Runs against any page you can reach with a browser. Deep integration with Playwright, Cypress, Jest. Detailed violation output including the exact DOM element that failed.

Best for: Integration into existing Playwright/Cypress test suites, component-level testing, testing authenticated flows.

pa11y-ci

URL-list based. Give it a list of URLs or a sitemap, and it loads each page and reports violations. Minimal configuration needed. Runs HTML_CodeSniffer by default, or axe when configured as a runner.

Best for: Static sites, quick full-site audits, teams that don't have existing E2E test infrastructure.

Lighthouse CI

Google's Lighthouse includes accessibility scoring. Returns a score (0–100) rather than a violation list. Less detailed than axe, but integrates well into GitHub Actions and has its own dashboard.

Best for: Teams that already use Lighthouse for performance testing and want a single tool.

axe-core + Playwright in CI

This is the most commonly used pattern for dynamic web applications.

Directory structure:

tests/
  a11y/
    public-pages.test.js
    authenticated-pages.test.js
    interactive-states.test.js
  auth.setup.js
playwright.config.js

playwright.config.js:

import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  use: {
    baseURL: process.env.BASE_URL || 'http://localhost:3000',
  },
  projects: [
    {
      name: 'setup',
      testMatch: /auth\.setup\.js/,
    },
    {
      name: 'a11y-public',
      testMatch: /a11y\/public-pages\.test\.js/,
    },
    {
      name: 'a11y-authenticated',
      testMatch: /a11y\/authenticated-pages\.test\.js/,
      dependencies: ['setup'],
      use: {
        storageState: 'playwright/.auth/user.json',
      },
    },
  ],
});

tests/a11y/public-pages.test.js:

import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

const PAGES = ['/', '/about', '/pricing', '/login', '/register'];

for (const path of PAGES) {
  test(`${path}: no WCAG 2.1 AA violations`, async ({ page }) => {
    await page.goto(path);
    await page.waitForLoadState('networkidle');

    const results = await new AxeBuilder({ page })
      .withTags(['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa'])
      .analyze();

    expect(results.violations).toEqual([]);
  });
}
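When `expect(results.violations).toEqual([])` fails, the diff prints entire violation objects, which can be hard to scan in CI logs. The following is a minimal sketch of a formatter, assuming the usual axe result fields (`id`, `impact`, `help`, `nodes[].target`). `formatViolations` is a hypothetical helper name, not part of axe or Playwright, and the sample data is made up for illustration.

```javascript
// Hypothetical helper: turn axe violations into one readable line per rule.
function formatViolations(violations) {
  return violations.map((v) => {
    // Each node's `target` is an array of selectors locating the element.
    const targets = v.nodes.map((n) => n.target.join(' ')).join(', ');
    return `[${v.impact}] ${v.id}: ${v.help} (${v.nodes.length} element(s): ${targets})`;
  });
}

// Sample input shaped like an axe result (illustrative data only).
const sample = [
  {
    id: 'label',
    impact: 'critical',
    help: 'Form elements must have labels',
    nodes: [{ target: ['#email'] }, { target: ['#password'] }],
  },
];

console.log(formatViolations(sample).join('\n'));
```

In a test, asserting `expect(formatViolations(results.violations)).toEqual([])` keeps the check just as strict while making the failure output one line per violated rule.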

GitHub Actions workflow:

name: Accessibility

on:
  pull_request:
  push:
    branches: [main]

jobs:
  a11y:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - run: npm ci

      - name: Install Playwright
        run: npx playwright install --with-deps chromium

      - name: Build
        run: npm run build

      - name: Start server
        run: npm run start &
        env:
          PORT: 3000
          NODE_ENV: test

      - name: Wait for server
        run: npx wait-on http://localhost:3000

      - name: Run a11y tests
        run: npx playwright test tests/a11y/

      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: a11y-report
          path: playwright-report/
          retention-days: 7

pa11y-ci for Static or Public Sites

pa11y-ci is simpler to configure when you want to scan a list of URLs without writing test code.

npm install --save-dev pa11y-ci

.pa11yci.json:

{
  "defaults": {
    "standard": "WCAG2AA",
    "timeout": 30000,
    "wait": 500,
    "runners": ["axe"],
    "ignore": [
      "color-contrast"
    ]
  },
  "urls": [
    "http://localhost:3000/",
    "http://localhost:3000/about",
    "http://localhost:3000/pricing",
    "http://localhost:3000/blog",
    {
      "url": "http://localhost:3000/contact",
      "actions": [
        "wait for element form to be visible"
      ]
    }
  ]
}

CI:

- name: Run pa11y-ci
  run: npx pa11y-ci --config .pa11yci.json

With sitemap:

npx pa11y-ci --sitemap http://localhost:3000/sitemap.xml

Lighthouse CI

Lighthouse CI runs Lighthouse audits on each commit and can compare scores against a baseline.

npm install --save-dev @lhci/cli

lighthouserc.js:

module.exports = {
  ci: {
    collect: {
      url: ['http://localhost:3000/', 'http://localhost:3000/about'],
      numberOfRuns: 1,
    },
    assert: {
      assertions: {
        'categories:accessibility': ['error', { minScore: 0.9 }],
      },
    },
    upload: {
      target: 'temporary-public-storage',
    },
  },
};

CI:

- name: Run Lighthouse CI
  run: |
    npx lhci autorun
  env:
    LHCI_GITHUB_APP_TOKEN: ${{ secrets.LHCI_GITHUB_APP_TOKEN }}

With the GitHub App token set, Lighthouse CI can post status checks on the pull request with links to the full reports.

Enforcement Strategy

Option 1: Fail on any violation (strict)

Best for new projects with no existing violations. Simple rule: zero violations allowed.

expect(results.violations).toEqual([]);

If you introduce a violation, the build fails. Violations must be fixed before merging.

Option 2: Baseline snapshot (practical for legacy projects)

If you have existing violations you can't fix immediately, snapshot the current violation count and fail only when it increases.

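// First run: generate the baseline snapshot
import fs from 'fs';
import { test } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

test('capture baseline', async ({ page }) => {
  await page.goto('/');
  const results = await new AxeBuilder({ page }).analyze();

  // Record each violated rule and how many elements it affects
  const baseline = results.violations.map(v => ({
    id: v.id,
    impact: v.impact,
    nodeCount: v.nodes.length,
  }));

  // Written relative to the working directory (the repo root in CI)
  fs.writeFileSync('a11y-baseline.json', JSON.stringify(baseline, null, 2));
});

// Subsequent runs: compare against the baseline
import fs from 'fs';
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

// Read from the same path the capture step wrote to
const baseline = JSON.parse(fs.readFileSync('a11y-baseline.json', 'utf8'));

test('no new violations vs baseline', async ({ page }) => {
  await page.goto('/');
  const results = await new AxeBuilder({ page }).analyze();

  // Map of rule id -> affected element count at baseline time
  const baselineCounts = new Map(baseline.map(v => [v.id, v.nodeCount]));

  // Fail on new rules and on known rules that now affect more elements
  const regressions = results.violations
    .filter(v => v.nodes.length > (baselineCounts.get(v.id) ?? 0))
    .map(v => v.id);
  expect(regressions).toEqual([]);
});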

Commit a11y-baseline.json. Update it as you fix violations. This prevents regression without requiring you to fix everything before getting CI working.
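Updating the baseline is easy to forget once a fix lands. As a rough sketch, a small helper can report which baseline entries are now better than recorded, so the file can be tightened. It assumes the baseline format shown above (`id`, `impact`, `nodeCount`); `baselineImprovements` is a hypothetical name and the sample data is illustrative.

```javascript
// Hypothetical helper: find baseline entries that can be tightened or removed.
function baselineImprovements(baseline, violations) {
  // Current affected-element count per rule id.
  const current = new Map(violations.map((v) => [v.id, v.nodes.length]));
  return baseline
    .filter((b) => (current.get(b.id) ?? 0) < b.nodeCount)
    .map((b) => ({ id: b.id, was: b.nodeCount, now: current.get(b.id) ?? 0 }));
}

// Illustrative data: 'label' was fixed, 'color-contrast' is unchanged.
const baseline = [
  { id: 'color-contrast', impact: 'serious', nodeCount: 4 },
  { id: 'label', impact: 'critical', nodeCount: 2 },
];
const violations = [{ id: 'color-contrast', nodes: [{}, {}, {}, {}] }];

console.log(baselineImprovements(baseline, violations));
```

Running this in CI and failing (or warning) when the result is non-empty is one way to keep the committed baseline from drifting above the real violation count.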

Option 3: Track and report without blocking

The least aggressive option. Capture violations and post them as PR comments or to a dashboard, but don't fail the build.

Use this only as a temporary measure while bootstrapping. Unblocked accessibility violations accumulate.

test('accessibility scan (non-blocking)', async ({ page }) => {
  await page.goto('/');
  const results = await new AxeBuilder({ page }).analyze();

  // Log violations as warnings
  for (const violation of results.violations) {
    console.warn(`a11y: [${violation.impact}] ${violation.id}: ${violation.description}`);
  }

  // Always passes, but data is in CI logs
  expect(true).toBe(true);
});
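For a dashboard, raw log lines are less useful than numbers that can be charted over time. A minimal sketch, assuming axe's four impact levels (`critical`, `serious`, `moderate`, `minor`); `countByImpact` is a hypothetical helper, and where the counts get sent is left to your setup.

```javascript
// Hypothetical helper: total affected elements per impact level.
function countByImpact(violations) {
  const counts = { critical: 0, serious: 0, moderate: 0, minor: 0 };
  for (const v of violations) {
    // Count affected elements, not just rules, so trends reflect real scope.
    counts[v.impact] = (counts[v.impact] ?? 0) + v.nodes.length;
  }
  return counts;
}

// Illustrative input shaped like results.violations.
const scan = [
  { id: 'label', impact: 'critical', nodes: [{}, {}] },
  { id: 'color-contrast', impact: 'serious', nodes: [{}, {}, {}] },
  { id: 'image-alt', impact: 'critical', nodes: [{}] },
];

console.log(countByImpact(scan));
```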

Handling False Positives

Some rules produce false positives in specific contexts. Disable them precisely:

const results = await new AxeBuilder({ page })
  .withTags(['wcag2a', 'wcag2aa'])
  .disableRules([
    'color-contrast', // server-side-rendered colors axe can't compute
  ])
  .analyze();

Document why each rule is disabled in a comment. Suppressions for false positives tend to accumulate, so review them quarterly.
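One way to make the reason and the review date hard to skip is to keep suppressions in a small registry rather than an inline array. This is a sketch under assumptions: `SUPPRESSIONS`, `disabledRuleIds`, and `overdueSuppressions` are hypothetical names, and the reason and date are placeholders.

```javascript
// Hypothetical registry: every disabled rule carries a reason and a review date.
const SUPPRESSIONS = [
  {
    rule: 'color-contrast',
    reason: 'Placeholder: describe the false positive here',
    reviewBy: '2025-03-31', // placeholder date
  },
];

// Rule ids to pass to AxeBuilder's disableRules().
function disabledRuleIds(suppressions) {
  return suppressions.map((s) => s.rule);
}

// Suppressions whose review date has passed as of `today` (a Date).
function overdueSuppressions(suppressions, today) {
  return suppressions.filter((s) => new Date(s.reviewBy) < today);
}

console.log(disabledRuleIds(SUPPRESSIONS));
```

The scan would then call `.disableRules(disabledRuleIds(SUPPRESSIONS))`, and a separate check could warn when `overdueSuppressions(SUPPRESSIONS, new Date())` is non-empty.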

Integrating Into Pull Request Feedback

Use the Playwright GitHub reporter to surface failures inline in PRs:

// playwright.config.js
export default defineConfig({
  reporter: [
    ['html'],
    ['github'], // posts annotations on PR files
  ],
});

Or use a custom reporter that posts a PR comment with the violation summary when the accessibility job fails.
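The comment body for such a reporter can be built from the axe results alone. The sketch below covers only that formatting step; `buildCommentBody` is a hypothetical helper, and actually posting the comment (for example through the GitHub API) is left out.

```javascript
// Hypothetical helper: build a Markdown comment body from axe violations.
function buildCommentBody(violations) {
  if (violations.length === 0) {
    return 'Accessibility scan: no violations found.';
  }
  // One table row per violated rule.
  const rows = violations.map(
    (v) => `| ${v.id} | ${v.impact} | ${v.nodes.length} | ${v.help} |`
  );
  return [
    `Accessibility scan: ${violations.length} rule(s) violated.`,
    '',
    '| Rule | Impact | Elements | Description |',
    '| --- | --- | --- | --- |',
    ...rows,
  ].join('\n');
}

// Illustrative input shaped like results.violations.
console.log(
  buildCommentBody([
    { id: 'label', impact: 'critical', help: 'Form elements must have labels', nodes: [{}, {}] },
  ])
);
```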

The Full Pipeline Pattern

A complete accessibility pipeline runs three things at different speeds:

  1. Per-PR: axe scan of changed pages (fast, blocks on regression)
  2. Daily: Full site scan of all routes (slower, catches drift)
  3. Monthly: Manual screen reader walkthrough of key user flows (slowest, catches what automation misses)
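Scanning only changed pages in the per-PR step needs some way to map changed files to routes. A minimal sketch follows, assuming a hand-maintained prefix map. `ROUTE_MAP`, its paths, and `routesForChangedFiles` are all hypothetical, and the list of changed files would come from your CI or `git diff`.

```javascript
// Hypothetical mapping from source path prefixes to the routes they render.
const ROUTE_MAP = [
  { prefix: 'src/pages/pricing', routes: ['/pricing'] },
  { prefix: 'src/pages/auth', routes: ['/login', '/register'] },
  { prefix: 'src/components/', routes: ['/', '/about', '/pricing'] },
];

// Collect the unique routes affected by a list of changed file paths.
function routesForChangedFiles(changedFiles, routeMap) {
  const routes = new Set();
  for (const file of changedFiles) {
    for (const entry of routeMap) {
      if (file.startsWith(entry.prefix)) {
        entry.routes.forEach((r) => routes.add(r));
      }
    }
  }
  return [...routes].sort();
}

console.log(routesForChangedFiles(['src/pages/auth/Login.jsx'], ROUTE_MAP));
```

Files that match no prefix produce an empty list, so it is worth deciding up front whether that means "scan nothing" or "fall back to the full page list".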

The automated layers keep regressions out. The manual layer finds the systemic issues that automation can't detect. Neither replaces the other.
