Pa11y CI Integration: Automated Accessibility Regression Testing

Pa11y CI is a command-line accessibility testing tool built for CI pipelines. Unlike browser-extension audits, it runs headlessly, accepts a list of URLs, and exits with a non-zero code when violations exceed your threshold. That makes it a straightforward gate in any pipeline.

This guide covers production-ready pa11y-ci configuration: threshold management, baseline tracking, multi-page scanning, and GitHub Actions integration.

Installation and Basic Config

npm install --save-dev pa11y-ci

Pa11y-ci reads from .pa11yci.json or a config file passed via --config. A minimal setup:

{
  "defaults": {
    "standard": "WCAG2AA",
    "runners": ["axe", "htmlcs"],
    "timeout": 30000,
    "wait": 500
  },
  "urls": [
    "https://staging.example.com/",
    "https://staging.example.com/about",
    "https://staging.example.com/contact"
  ]
}

The runners field determines which rule engines run. Pa11y uses htmlcs (HTML CodeSniffer) by default. Adding axe as a second runner gives you both rule sets, which catch different issues with some overlap. Recent Pa11y releases bundle the axe runner; on older releases, install it separately:

npm install --save-dev pa11y-runner-axe

Run it:

npx pa11y-ci --config .pa11yci.json

Threshold Configuration

The threshold option sets how many issues pa11y-ci tolerates before it fails the build. Without it, any reported violation fails the run. With it, the run fails only when the count exceeds the numeric ceiling you set.

{
  "defaults": {
    "threshold": 0,
    "standard": "WCAG2AA"
  }
}

threshold: 0 is the strict default — zero violations allowed. During a remediation sprint, you might temporarily raise it while working down your backlog:

{
  "defaults": {
    "threshold": 5
  }
}

Never use a high threshold as a permanent setting. Treat it as a ratchet: fix issues, lower the threshold, commit the new value. This prevents regression while acknowledging current technical debt.
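
The ratchet can be partly automated. The sketch below assumes a pa11y-ci JSON report whose results object maps each URL to an array of issues, and a hypothetical map of the thresholds you currently allow per URL. The ratchetThresholds name and the sample data are illustrative, not part of pa11y-ci.

```javascript
// Sketch: compute the tightest per-URL thresholds a report allows.
// A ratchet only moves down, so a value is never raised above its current setting.
function ratchetThresholds(report, currentThresholds = {}) {
  const next = {};
  for (const [url, issues] of Object.entries(report.results)) {
    const allowed = currentThresholds[url] ?? 0; // URLs without an entry stay strict
    next[url] = Math.min(allowed, issues.length);
  }
  return next;
}

// Hypothetical sample data for illustration.
const ratchetSampleReport = {
  results: {
    'https://staging.example.com/': [],
    'https://staging.example.com/legacy-form': [
      { code: 'color-contrast' },
      { code: 'label' }
    ]
  }
};
const ratchetSampleThresholds = { 'https://staging.example.com/legacy-form': 5 };

console.log(ratchetThresholds(ratchetSampleReport, ratchetSampleThresholds));
```

With the sample data, the legacy-form threshold drops from 5 to 2. Commit the lowered value so the next build cannot regress past it.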

Ignoring Known Issues

Some violations are known false positives or known deferred work. Pa11y supports per-URL ignore arrays:

{
  "defaults": {
    "standard": "WCAG2AA"
  },
  "urls": [
    {
      "url": "https://staging.example.com/legacy-form",
      "ignore": [
        "WCAG2AA.Principle1.Guideline1_3.1_3_1.H44.NotFormControl",
        "color-contrast"
      ],
      "threshold": 2
    },
    {
      "url": "https://staging.example.com/",
      "ignore": []
    }
  ]
}

Rule IDs in ignore must match the code pa11y reports. Get them by running pa11y-ci once and reading the output — each violation includes its code. For axe rules, use the rule ID (e.g. color-contrast, label). For htmlcs, use the full dotted code.
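
Reading codes out of console output gets tedious on large scans. As a minimal sketch, the helper below tallies codes from a JSON report. It assumes each issue carries code and runner fields, as Pa11y's issue objects normally do. The summarizeCodes name and the sample report are hypothetical.

```javascript
// Sketch: count how often each (runner, code) pair appears in a pa11y-ci JSON report.
function summarizeCodes(report) {
  const counts = new Map();
  for (const issues of Object.values(report.results)) {
    for (const issue of issues) {
      // Entries for pages that failed to load may lack these fields.
      const key = `${issue.runner || 'unknown'}\t${issue.code || 'unknown'}`;
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .map(([key, count]) => {
      const [runner, code] = key.split('\t');
      return { runner, code, count };
    })
    .sort((a, b) => b.count - a.count); // most frequent first
}

// Hypothetical sample data for illustration.
const codeSampleReport = {
  results: {
    'https://staging.example.com/': [
      { runner: 'axe', code: 'color-contrast' },
      { runner: 'axe', code: 'color-contrast' }
    ],
    'https://staging.example.com/contact': [
      { runner: 'htmlcs', code: 'WCAG2AA.Principle1.Guideline1_3.1_3_1.H44.NotFormControl' }
    ]
  }
};

for (const { runner, code, count } of summarizeCodes(codeSampleReport)) {
  console.log(`${count}\t${runner}\t${code}`);
}
```

Copy the code column verbatim into an ignore array. The runner column shows whether to expect a short axe rule ID or a dotted htmlcs code.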

Document every ignored rule using a convention your team agrees on. Since JSON doesn't support comments, keep the notes in a separate pa11y-ignore-notes.md:

# Pa11y Ignored Rules

## legacy-form
- `H44.NotFormControl`: Third-party widget renders outside our control. Ticket: ACC-47.
- `color-contrast`: Brand color exception approved by design. Re-evaluate at next design system update.

Scanning Multiple Pages at Scale

For sites with many pages, hardcoding URLs doesn't scale. Generate the URL list from a sitemap:

// scripts/generate-pa11y-urls.js
// Builds a pa11y-ci config from the <loc> entries of a local sitemap.xml.
const fs = require('fs');

// Origins to swap so that scans hit staging instead of production.
const PRODUCTION_ORIGIN = 'https://www.example.com';
const STAGING_ORIGIN = 'https://staging.example.com';

function extractUrlsFromSitemap(sitemapPath) {
  const content = fs.readFileSync(sitemapPath, 'utf-8');
  const urls = [];

  // A regex is enough for a flat sitemap; use an XML parser for sitemap indexes.
  for (const match of content.matchAll(/<loc>\s*(.*?)\s*<\/loc>/g)) {
    urls.push(match[1].replace(PRODUCTION_ORIGIN, STAGING_ORIGIN));
  }

  return urls;
}

function main() {
  const urls = extractUrlsFromSitemap('./sitemap.xml');
  const config = {
    defaults: {
      standard: 'WCAG2AA',
      runners: ['axe', 'htmlcs'],
      timeout: 30000
    },
    urls
  };

  fs.writeFileSync('.pa11yci-generated.json', JSON.stringify(config, null, 2));
  console.log(`Generated config with ${urls.length} URLs`);
}

main();
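
One caveat with regex extraction: sitemap URLs are XML-escaped, so a query string such as ?a=1&b=2 appears as ?a=1&amp;b=2 inside <loc>. The sketch below shows one way to decode the five predefined XML entities and drop duplicate URLs. The extractLocs and decodeXmlEntities names are illustrative, and numeric character references are deliberately left out of scope.

```javascript
// Sketch: extract <loc> values from sitemap XML, decode XML entities, and de-duplicate.
const XML_ENTITIES = {
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&apos;': "'"
};

function decodeXmlEntities(text) {
  // A single pass avoids double-decoding sequences such as "&amp;lt;".
  return text.replace(/&(amp|lt|gt|quot|apos);/g, (entity) => XML_ENTITIES[entity]);
}

function extractLocs(xml) {
  const urls = new Set(); // a Set keeps the first occurrence and preserves order
  for (const match of xml.matchAll(/<loc>\s*([\s\S]*?)\s*<\/loc>/g)) {
    urls.add(decodeXmlEntities(match[1]));
  }
  return [...urls];
}

// Hypothetical sitemap fragment for illustration.
const sitemapSample = `
  <urlset>
    <url><loc>https://www.example.com/search?a=1&amp;b=2</loc></url>
    <url><loc>
      https://www.example.com/about
    </loc></url>
    <url><loc>https://www.example.com/about</loc></url>
  </urlset>`;

console.log(extractLocs(sitemapSample));
```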

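For large sites, raise pa11y-ci's concurrency option to scan pages in parallel. It is set inside defaults and falls back to 1 when omitted:

{
  "defaults": {
    "concurrency": 4,
    "standard": "WCAG2AA"
  }
}

Setting concurrency too high can cause timeouts and false failures, because the target server and the CI runner both slow down under parallel load. Start at 2–4 and tune based on your server's response time under load.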

GitHub Actions Integration

A complete workflow that fails a pull request whenever the scan reports violations:

# .github/workflows/accessibility.yml
name: Accessibility

on:
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 6 * * 1'  # Weekly full scan on Monday

jobs:
  pa11y:
    runs-on: ubuntu-latest
    
    steps:
      - uses: actions/checkout@v4
      
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      
      - run: npm ci
      
      # Pa11y drives Chromium through Puppeteer, which normally downloads its own
      # browser during `npm ci`, so no separate browser install step is needed here.
      
      - name: Run accessibility scan
        run: npx pa11y-ci --config .pa11yci.json --reporter json > pa11y-report.json
        continue-on-error: true
      
      - name: Parse and display results
        run: |
          node -e "
            const report = require('./pa11y-report.json');
            const total = Object.values(report.results).flat().length;
            const pages = Object.keys(report.results).length;
            console.log('Pages scanned:', pages);
            console.log('Total violations:', total);
            
            if (total > 0) {
              Object.entries(report.results).forEach(([url, issues]) => {
                if (issues.length > 0) {
                  console.log('\n' + url + ':');
                  issues.forEach(i => console.log(' -', i.code, '|', i.message.slice(0, 80)));
                }
              });
              process.exit(1);
            }
          "
      
      - name: Upload pa11y report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: pa11y-report
          path: pa11y-report.json
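
To surface results on the workflow run page rather than only in logs, a script can append Markdown to the file named by the GITHUB_STEP_SUMMARY environment variable that GitHub Actions provides. This is a minimal sketch: the formatSummary and writeSummary names are hypothetical, and the report is assumed to use the same results shape as above.

```javascript
// Sketch: turn a pa11y-ci JSON report into a Markdown table for the job summary.
const fs = require('fs');

function formatSummary(report) {
  const lines = ['## Pa11y results', '', '| URL | Issues |', '| --- | --- |'];
  for (const [url, issues] of Object.entries(report.results)) {
    lines.push(`| ${url} | ${issues.length} |`);
  }
  return lines.join('\n') + '\n';
}

function writeSummary(report, summaryPath = process.env.GITHUB_STEP_SUMMARY) {
  const markdown = formatSummary(report);
  if (summaryPath) {
    fs.appendFileSync(summaryPath, markdown); // inside GitHub Actions
  } else {
    process.stdout.write(markdown); // local runs: print instead
  }
}

// Hypothetical sample data for illustration.
const summarySampleReport = {
  results: {
    'https://staging.example.com/': [],
    'https://staging.example.com/about': [{ code: 'image-alt' }]
  }
};

console.log(formatSummary(summarySampleReport));
```

In a workflow step, you would call writeSummary(require('./pa11y-report.json')) after the scan has written its report.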

Baseline Management

Ratcheting down violations over time requires tracking your baseline. Store it in version control:

# Generate baseline
npx pa11y-ci --config .pa11yci.json --reporter json > pa11y-baseline.json
git add pa11y-baseline.json
git commit -m "chore: update pa11y baseline to N violations"

In CI, compare current results against baseline to detect regressions:

// scripts/check-pa11y-regression.js
// require() resolves relative to this file, so step up from scripts/ to the repo root.
const current = require('../pa11y-report.json');
const baseline = require('../pa11y-baseline.json');

const currentCount = Object.values(current.results).flat().length;
const baselineCount = Object.values(baseline.results).flat().length;

if (currentCount > baselineCount) {
  console.error(`Regression detected: ${currentCount} violations (baseline: ${baselineCount})`);
  console.error(`New violations: ${currentCount - baselineCount}`);
  process.exit(1);
} else if (currentCount < baselineCount) {
  console.log(`Improvement: ${baselineCount - currentCount} violations fixed.`);
  console.log('Update baseline: npx pa11y-ci --reporter json > pa11y-baseline.json');
}

console.log(`Violations: ${currentCount} (baseline: ${baselineCount}) — OK`);

This catches regressions without requiring zero violations from day one. Teams in the middle of a remediation sprint can continue shipping while preventing new violations.
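
Comparing totals has a blind spot: fixing one issue while introducing another leaves the count unchanged. As a sketch of a stricter check, the helper below compares issues by identity, assuming a URL, code, and selector together identify an issue well enough. The findNewIssues name and the sample reports are hypothetical, and selectors can shift when markup changes, so expect some noise.

```javascript
// Sketch: list issues present in the current report but absent from the baseline.
function issueKey(url, issue) {
  return [url, issue.code, issue.selector].join(' | ');
}

function findNewIssues(currentReport, baselineReport) {
  const known = new Set();
  for (const [url, issues] of Object.entries(baselineReport.results)) {
    for (const issue of issues) known.add(issueKey(url, issue));
  }

  const added = [];
  for (const [url, issues] of Object.entries(currentReport.results)) {
    for (const issue of issues) {
      if (!known.has(issueKey(url, issue))) {
        added.push({ url, code: issue.code, selector: issue.selector });
      }
    }
  }
  return added;
}

// Hypothetical sample data: same total, but one issue was swapped for another.
const diffBaseline = {
  results: { 'https://staging.example.com/': [{ code: 'label', selector: '#email' }] }
};
const diffCurrent = {
  results: { 'https://staging.example.com/': [{ code: 'image-alt', selector: 'img.hero' }] }
};

console.log(findNewIssues(diffCurrent, diffBaseline));
```

A count-based check passes this sample at one violation each, while the identity-based check reports the new image-alt issue.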

Reporter Output Formats

Pa11y-ci's built-in reporters are cli (the default, human-readable) and json (machine-readable, for parsing in scripts). The csv reporter comes from the standalone pa11y CLI, so confirm that your pa11y-ci version accepts it before relying on it for spreadsheet-based accessibility audits.

# Human-readable
npx pa11y-ci --reporter cli

# JSON for CI parsing
npx pa11y-ci --reporter json > report.json

# CSV for stakeholder reports (verify support in your pa11y-ci version)
npx pa11y-ci --reporter csv > report.csv
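
If a CSV reporter is not available in your setup, the JSON report can be flattened into CSV with a few lines of Node. The sketch below assumes issues carry type, code, message, and selector fields. The reportToCsv name and the column choice are illustrative.

```javascript
// Sketch: flatten a pa11y-ci JSON report into CSV, one row per issue.
function csvCell(value) {
  const text = String(value ?? '');
  // Quote cells containing commas, quotes, or line breaks; double any embedded quotes.
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function reportToCsv(report) {
  const rows = [['url', 'type', 'code', 'message', 'selector']];
  for (const [url, issues] of Object.entries(report.results)) {
    for (const issue of issues) {
      rows.push([url, issue.type, issue.code, issue.message, issue.selector]);
    }
  }
  return rows.map((row) => row.map(csvCell).join(',')).join('\n');
}

// Hypothetical sample data for illustration.
const csvSampleReport = {
  results: {
    'https://staging.example.com/': [
      { type: 'error', code: 'image-alt', message: 'Missing "alt", please add', selector: 'img' }
    ]
  }
};

console.log(reportToCsv(csvSampleReport));
```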

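For teams that need HTML reports, a separate reporter package produces a browsable report. The commands below use pa11y-reporter-html, which was written for the standalone pa11y CLI; check that it loads under your pa11y-ci version (the community pa11y-ci-reporter-html package targets pa11y-ci directly):

npm install --save-dev pa11y-reporter-html
npx pa11y-ci --reporter html > report.html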

Pa11y-ci is most valuable as a regression gate, not a comprehensive audit. It catches structural WCAG violations automatically on every deployment. Pair it with manual screen reader testing and behavioral Playwright tests for complete coverage.
