Screenshot Comparison Tools Compared: Percy vs Applitools vs Chromatic vs BackstopJS
Visual regression testing tools have proliferated over the last five years, and the differences between them matter. Choosing the wrong tool for your team's context means either paying for features you don't need or missing capabilities that would save hours of manual QA.
This guide compares the four most commonly used screenshot comparison tools: Percy (BrowserStack), Applitools Eyes, Chromatic, and BackstopJS. For each, we cover how it works, what it costs, what it catches, and which team type it fits.
Quick Comparison
| Tool | Pricing model | Best for | Diff algorithm | Cloud/Self-hosted |
|---|---|---|---|---|
| Percy | Snapshot-based | Multi-page web apps | Pixel diff | Cloud only |
| Applitools Eyes | Checkpoint-based | Enterprise, cross-browser | Visual AI | Cloud only |
| Chromatic | Snapshot-based | Storybook component libraries | Pixel diff | Cloud only |
| BackstopJS | Open source | Self-hosted, any team | Pixel diff | Self-hosted |
Percy (BrowserStack)
How it works
Percy integrates with Cypress, Playwright, Selenium, and Storybook. In a Cypress test, for example, you call cy.percySnapshot('name'); Percy captures a DOM snapshot, renders it server-side in its cloud, and diffs the result against the approved baseline.
Because Percy re-renders the captured DOM on its own infrastructure with consistent fonts, results don't vary with differences between client machines. For cross-browser testing, the same snapshot is rendered in multiple browsers in parallel.
Pricing
- Free: 5,000 screenshots/month
- Pro: starts at $599/month (35,000 screenshots)
- Each snapshot at each width = one screenshot
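For a mid-size project running visual tests on every PR with 200 snapshots across 3 widths, that's 600 screenshots per run. At 20 PRs/week that comes to about 12,000 screenshots per week, or roughly 50,000 per month. The free tier's 5,000 screenshots cover only about eight runs, and that volume also exceeds the 35,000 screenshots included in the entry-level Pro plan.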
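The screenshot arithmetic is easy to get wrong, so it can help to script it. The sketch below is a rough estimator, assuming one full visual run per PR and an average of 52/12 weeks per month; the function names are hypothetical and not part of any Percy tooling.

```javascript
// Rough estimate of monthly screenshot usage for a snapshot-priced tool.
// Assumption: every PR triggers exactly one full visual test run.
const WEEKS_PER_MONTH = 52 / 12; // about 4.33

function screenshotsPerRun(snapshots, widths) {
  // Each snapshot at each width is billed as one screenshot.
  return snapshots * widths;
}

function screenshotsPerMonth(snapshots, widths, prsPerWeek) {
  const perWeek = screenshotsPerRun(snapshots, widths) * prsPerWeek;
  return Math.round(perWeek * WEEKS_PER_MONTH);
}

function runsCoveredByQuota(quota, snapshots, widths) {
  // How many complete runs fit inside a monthly screenshot quota.
  return Math.floor(quota / screenshotsPerRun(snapshots, widths));
}

console.log(screenshotsPerRun(200, 3));        // 600
console.log(screenshotsPerMonth(200, 3, 20));  // 52000
console.log(runsCoveredByQuota(5000, 200, 3)); // 8
```

Plugging in your own snapshot count, widths, and PR rate shows quickly which pricing tier is realistic before you commit to a tool.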
Strengths
- Easy integration — one npm package, one line of code per snapshot
- Clean GitHub PR workflow — Percy posts a status check, team reviews in the Percy dashboard
- DOM capture means consistent cross-browser rendering
- Good documentation, large community
Weaknesses
- No AI-powered noise reduction — pixel changes (even tiny font rendering differences) trigger reviews
- Can generate significant review noise when testing on multiple browsers
- Limited configurability for what counts as a "meaningful" visual change
- No self-hosting option
Best for
Teams running Cypress or Playwright for E2E tests who want to layer visual regression on top with minimal setup. Good fit for web apps with stable designs and low browser variance.
Applitools Eyes
How it works
Applitools uses Visual AI — a model trained on millions of UI screenshots — to distinguish real visual regressions from irrelevant rendering noise (anti-aliasing, sub-pixel font differences, shadow opacity). When comparing screenshots, the AI classifies each difference as meaningful or noise.
The Ultrafast Grid renders captured DOM snapshots across many browsers and devices in parallel, so you can test 10 browser configurations with one test run.
Pricing
- Free: 100 checkpoints/month
- Paid: per-checkpoint pricing, negotiated enterprise contracts
- The "checkpoint" model means each eyes.check() call is one checkpoint, multiplied by the number of browsers/viewports configured
Enterprise pricing is not public — contact sales. Mid-market teams typically report $20,000-$80,000/year for meaningful usage.
Strengths
- Visual AI dramatically reduces false positives
- Ultrafast Grid for true cross-browser visual testing at scale
- Match level configuration (Strict, Content, Layout) for different types of content
- SDKs for every major testing framework
- Floating regions and ignore regions for dynamic content
- Strong enterprise features: team management, SLAs, audit trails
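To make the ignore-region idea from the list above concrete, here is a conceptual sketch: differences that fall inside a declared rectangle are dropped before the comparison is judged. This is not the Applitools SDK; the function names, the region shape, and the sample coordinates are all hypothetical.

```javascript
// Conceptual sketch of "ignore regions". NOT the Applitools SDK:
// every name and data shape here is hypothetical.

function insideRegion(pixel, region) {
  return (
    pixel.x >= region.left &&
    pixel.x < region.left + region.width &&
    pixel.y >= region.top &&
    pixel.y < region.top + region.height
  );
}

function filterIgnored(diffPixels, ignoreRegions) {
  // Keep only the differences that no ignore region covers.
  return diffPixels.filter(
    (pixel) => !ignoreRegions.some((region) => insideRegion(pixel, region))
  );
}

// Differing pixels reported by some earlier comparison step.
const diffPixels = [
  { x: 12, y: 8 },    // inside a dynamic timestamp banner
  { x: 300, y: 420 }, // inside the main content
];

// One region covering the banner at the top of a 1280px-wide page.
const ignoreRegions = [{ left: 0, top: 0, width: 1280, height: 40 }];

const remaining = filterIgnored(diffPixels, ignoreRegions);
console.log(remaining.length); // 1
console.log(remaining[0]);     // the pixel at x=300, y=420
```

Only the content-area difference survives, which is the behavior you want for timestamps, ads, and other content that changes on every load.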
Weaknesses
- Significantly more expensive than alternatives
- Setup is more complex than Percy — configuration is more involved
- Cloud-only, no self-hosting
- Overkill for smaller teams or simple UIs
Best for
Enterprise teams testing complex UIs across many browsers, design system teams that need to catch subtle visual regressions in component libraries, or teams where false positives from simpler tools are creating significant review burden.
Chromatic
How it works
Chromatic is built specifically for Storybook. It captures every story in your Storybook, renders them in its cloud, and diffs against the approved baseline. Changes are reviewed in the Chromatic dashboard before the PR can merge.
TurboSnap (Chromatic's smart snapshot system) analyzes your git diff to identify which components changed, then only re-snapshots stories that could be affected — dramatically reducing snapshot count.
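The general idea of selecting snapshots from changed files can be sketched as a walk over an import graph. The example below is a simplified illustration, not Chromatic's implementation: the graph, the file names, and both functions are hypothetical.

```javascript
// Simplified illustration of dependency-based snapshot selection.
// NOT Chromatic's implementation; graph and file names are hypothetical.

// Each key is a file; the value lists the files it imports.
const importGraph = {
  'Button.stories.js': ['Button.js'],
  'Card.stories.js': ['Card.js'],
  'Card.js': ['Button.js', 'theme.js'],
  'Button.js': ['theme.js'],
  'Avatar.stories.js': ['Avatar.js'],
  'Avatar.js': [],
  'theme.js': [],
};

function dependsOnChanged(file, changed, graph, seen = new Set()) {
  if (changed.has(file)) return true;
  if (seen.has(file)) return false; // already explored, or an import cycle
  seen.add(file);
  return (graph[file] || []).some((dep) =>
    dependsOnChanged(dep, changed, graph, seen)
  );
}

function affectedStories(changedFiles, graph) {
  const changed = new Set(changedFiles);
  return Object.keys(graph)
    .filter((file) => file.endsWith('.stories.js'))
    .filter((story) => dependsOnChanged(story, changed, graph));
}

console.log(affectedStories(['Card.js'], importGraph));
// ['Card.stories.js']
console.log(affectedStories(['theme.js'], importGraph));
// ['Button.stories.js', 'Card.stories.js']
```

A change to a leaf component touches one story, while a change to a shared file such as a theme touches most of them, which is why savings vary so much between branches.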
Pricing
- Free: 5,000 snapshots/month
- Pro: $149/month (35,000 snapshots)
- Team: $349/month (100,000 snapshots)
TurboSnap makes the free tier go significantly further for teams making targeted changes.
Strengths
- Native Storybook integration — zero configuration beyond adding the npm package
- TurboSnap reduces snapshot usage by 70-90% on typical feature branches
- Interaction tests via Storybook's play function
- Clean PR review workflow
- Component-level granularity — changes are scoped to individual components
- Living documentation: Chromatic publishes your Storybook as a reference
Weaknesses
- Only works with Storybook (a hard requirement)
- Tests components in isolation — doesn't catch composition bugs at the page level
- No AI noise reduction — pixel-level diffs can be noisy
Best for
React, Vue, Angular, or Svelte teams using Storybook for component documentation. If you maintain a design system or component library, Chromatic is the clear choice. If you don't use Storybook, use a different tool.
BackstopJS
How it works
BackstopJS is open source and self-hosted. You configure URL/selector pairs in backstop.json, and BackstopJS uses headless Chrome (via Puppeteer) to capture screenshots and compare them against reference images stored in your repository.
```json
{
  "viewports": [
    { "label": "mobile", "width": 375, "height": 812 },
    { "label": "tablet", "width": 768, "height": 1024 },
    { "label": "desktop", "width": 1280, "height": 800 }
  ],
  "scenarios": [
    {
      "label": "Homepage",
      "url": "https://your-app.com/",
      "selectors": ["document"],
      "misMatchThreshold": 0.1
    },
    {
      "label": "Checkout Form",
      "url": "https://your-app.com/checkout",
      "selectors": [".checkout-form"],
      "delay": 1000,
      "misMatchThreshold": 0.05
    }
  ]
}
```

Reference images are committed to the repository. The diff engine is Resemble.js with configurable tolerance.
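To show what a percentage-based tolerance means in practice, here is a minimal sketch of a pixel comparison on raw RGBA buffers. It assumes both images are already decoded to same-size byte arrays, and it is a simplified stand-in rather than Resemble.js's actual algorithm; the function names are hypothetical.

```javascript
// Minimal sketch of percentage-based image comparison on raw RGBA buffers.
// A simplified stand-in, NOT Resemble.js's actual algorithm.

function mismatchPercentage(reference, candidate, channelTolerance = 0) {
  if (reference.length !== candidate.length) {
    throw new Error('Images must have identical dimensions');
  }
  const totalPixels = reference.length / 4;
  let differing = 0;
  for (let i = 0; i < reference.length; i += 4) {
    // A pixel differs if any of its R, G, B, A channels is off
    // by more than the tolerance.
    for (let c = 0; c < 4; c++) {
      if (Math.abs(reference[i + c] - candidate[i + c]) > channelTolerance) {
        differing++;
        break;
      }
    }
  }
  return (differing / totalPixels) * 100;
}

function passes(reference, candidate, misMatchThreshold) {
  // The threshold is the allowed percentage of differing pixels.
  return mismatchPercentage(reference, candidate) <= misMatchThreshold;
}

// Two 10x10 images: all white, with one pixel changed to black.
const reference = new Uint8ClampedArray(10 * 10 * 4).fill(255);
const candidate = Uint8ClampedArray.from(reference);
candidate.set([0, 0, 0, 255], 0);

console.log(mismatchPercentage(reference, candidate)); // 1
console.log(passes(reference, candidate, 0.1));        // false
console.log(passes(reference, candidate, 5));          // true
```

One changed pixel in a 100-pixel image is a 1% mismatch, so it fails a 0.1 threshold. On a full-page screenshot the same threshold tolerates thousands of changed pixels, which is why small selectors and full-document captures usually need different values.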
Pricing
Free and open source. You pay for the infrastructure to run it (a CI runner). For most teams, this is near-zero marginal cost.
Strengths
- Completely free and open source
- Self-hosted — data never leaves your infrastructure
- Configurable tolerance threshold per scenario
- Docker container available for CI
- No per-screenshot limits
- Good fit for compliance environments that can't use cloud tools
Weaknesses
- Reference images committed to the repo → merge conflicts when multiple branches update references simultaneously
- Setup and maintenance overhead vs. cloud tools
- No built-in review workflow — you look at the diff report locally or in CI artifacts
- No AI noise reduction
- No cross-browser rendering by default (uses Puppeteer/Chrome)
- Requires careful configuration to avoid flaky screenshots
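The flakiness point in the list above usually comes down to a handful of scenario settings. The sketch below shows one possible hardened configuration written as a JavaScript config module. The file name, URL, and selectors are placeholders, and the right values depend on your app.

```javascript
// backstop.config.js (hypothetical file name)
// Run with: backstop test --config=backstop.config.js
// The URL and selectors below are placeholders, not from a real app.
const config = {
  id: 'flake_hardening_example',
  viewports: [{ label: 'desktop', width: 1280, height: 800 }],
  scenarios: [
    {
      label: 'Checkout Form',
      url: 'https://your-app.com/checkout',
      selectors: ['.checkout-form'],
      // Wait for a concrete element instead of relying only on a fixed delay.
      readySelector: '.checkout-form',
      delay: 500,
      // Hide content that changes on every load (visibility: hidden keeps layout).
      hideSelectors: ['.timestamp', '.ad-slot'],
      // Remove elements entirely (display: none) when they overlap the target.
      removeSelectors: ['.cookie-banner'],
      misMatchThreshold: 0.05,
      requireSameDimensions: true,
    },
  ],
  engine: 'puppeteer',
  // Capture one scenario at a time to reduce resource contention on CI runners.
  asyncCaptureLimit: 1,
};

module.exports = config;
```

Waiting on a selector and hiding dynamic content tends to remove more flakiness than raising misMatchThreshold, which also hides real regressions.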
Self-hosted alternative: Playwright's built-in screenshot comparison
Playwright has built-in visual comparison without any third-party service:
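```javascript
// playwright.config.js
module.exports = {
  expect: {
    toHaveScreenshot: {
      // Per-pixel color difference tolerance, from 0 (strict) to 1 (lax)
      threshold: 0.2,
      // Maximum number of pixels allowed to differ
      maxDiffPixels: 100
    }
  }
};
```

```javascript
// In tests
const { test, expect } = require('@playwright/test');

test('homepage visual', async ({ page }) => {
  // Relative URL: assumes use.baseURL is set in playwright.config.js
  await page.goto('/');
  await expect(page).toHaveScreenshot('homepage.png');
});
```

Playwright stores reference screenshots in the repository (like BackstopJS) and supports updating them with --update-snapshots. It's effectively a simpler, built-in BackstopJS, suitable for teams that already use Playwright and want visual testing without a separate tool.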
How to Choose
Choose Percy if:
- You use Cypress or Playwright for E2E tests
- You want the simplest possible integration
- Your team makes 5-20 PRs/week with <200 snapshots each
- Budget: free tier to $599+/month
Choose Applitools if:
- You need true cross-browser visual testing at scale
- False positives from pixel-level diffing are costing hours of review time
- You test complex, data-heavy UIs where subtle visual changes matter
- Budget: enterprise pricing ($20k+/year)
Choose Chromatic if:
- Your team uses Storybook
- You maintain a component library or design system
- You want component-level visual testing, not page-level
- Budget: free to $349+/month
Choose BackstopJS or Playwright screenshots if:
- You cannot use cloud services (compliance, data residency)
- You have tight budget constraints
- You're willing to invest setup and maintenance time
- You want maximum control over the tooling
Running Multiple Tools
It's not unusual for teams to use both Chromatic (component level) and Percy (page level). This gives you:
- Chromatic catches regressions in individual components early (fast feedback, targeted)
- Percy catches regressions in how components compose at the page level (broader coverage)
The cost is additive, but the coverage is more comprehensive than either tool alone.
Wrapping Up
Visual regression testing catches a class of bugs that functional tests miss: layout shifts, color changes, font regressions, missing images. Picking the right tool depends on where in the stack you need coverage (component vs page), your budget, and whether cloud services are acceptable.
Start with what integrates least disruptively into your current workflow. For most teams:
- Storybook users → Chromatic
- Playwright users → Percy or Playwright's built-in screenshots
- Cypress users → Percy
- Enterprise with complex cross-browser needs → Applitools