Playwright Retry Strategies for Flaky Tests: retries, repeat-each, and test.fail()
Flaky tests cost CI time and erode confidence. Playwright gives you several tools to manage flakiness: automatic retries that reduce false negatives, --repeat-each for detecting flakiness before it hits CI, and test.fail() for acknowledging known issues without deleting coverage.
This post covers all three strategies — when to use each, how to configure them, and how to avoid the trap of masking real failures with excessive retries.
Understanding Playwright's Retry Mechanism
Playwright retries operate at the test level, not the assertion level. When a test fails, Playwright re-runs the entire test from the beginning — a fresh browser context, clean state, full setup/teardown cycle. This is different from retrying individual actions.
Retries are configured in playwright.config.ts:
import { defineConfig } from "@playwright/test";
export default defineConfig({
  retries: process.env.CI ? 2 : 0,
  // ...
});
The common pattern: no retries locally (failures should be investigated immediately) and 2 retries in CI (to handle infrastructure flakiness). A test that fails 3 consecutive times is reported as failed.
What Retries Catch (and What They Don't)
Retries help with transient failures caused by:
- Network timeouts from slow CI infrastructure
- Flaky third-party services during test setup
- Race conditions that occur rarely
Retries don't help with:
- Consistent bugs in your code
- Tests with broken assertions
- Tests that fail due to shared state (they'll fail the same way every time)
This distinction matters. If a test fails and then passes on a retry, the cause is transient, and Playwright reports the test as flaky rather than simply passed. If it fails all three attempts every time, there's a real bug, and retries only delay the failure report.
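The distinction can be made mechanical. The sketch below is a hypothetical helper, not part of Playwright's API: it takes the pass/fail result of each attempt (the initial run plus any retries) and labels the test along the lines described above.

```typescript
type Verdict = "passed" | "flaky" | "failed";

// attempts[0] is the initial run; later entries are retries.
// true means the attempt passed. Retrying stops at the first pass,
// so in practice only the last entry can be true.
function classifyAttempts(attempts: boolean[]): Verdict {
  if (attempts.length === 0) {
    throw new Error("at least one attempt is required");
  }
  if (attempts[0]) return "passed";
  // Failed first, then passed on some retry: transient flakiness.
  return attempts.some((passed) => passed) ? "flaky" : "failed";
}

console.log(classifyAttempts([true])); // passed
console.log(classifyAttempts([false, false, true])); // flaky
console.log(classifyAttempts([false, false, false])); // failed
```

A test that lands in the "flaky" bucket run after run is a candidate for investigation, not a success.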
Configuring Retries by Project
Playwright supports multiple projects (browser configurations). You can set different retry counts per project:
import { defineConfig, devices } from "@playwright/test";
export default defineConfig({
  projects: [
    {
      name: "chromium",
      use: { ...devices["Desktop Chrome"] },
      retries: 2,
    },
    {
      name: "firefox",
      use: { ...devices["Desktop Firefox"] },
      retries: 3, // More retries if Firefox is known to be slower in CI
    },
    {
      name: "mobile-safari",
      use: { ...devices["iPhone 14"] },
      retries: 2,
    },
  ],
});
Viewing Retry Information
Playwright's HTML reporter shows retry attempts with full traces. Run tests with the HTML reporter:
npx playwright test --reporter=html
After the run, open playwright-report/index.html. Each failed test shows:
- Which attempt number failed
- The full trace for each attempt
- Screenshots and videos if configured
For CI, enable the reporter in your config:
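import { defineConfig } from "@playwright/test";
export default defineConfig({
  reporter: [["html", { open: "never" }], ["github"]],
  retries: process.env.CI ? 2 : 0,
});
The github reporter surfaces test failures as annotations when the tests run in GitHub Actions, so the details show up on the pull request.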
Using --repeat-each for Flakiness Detection
--repeat-each runs every test N times in a single execution. This is your flakiness detection tool — run it before merging a PR to catch tests that fail intermittently.
npx playwright test --repeat-each=10
This runs each test 10 times. A test that passes 9/10 times is flaky. Catch it before it reaches CI and starts failing randomly.
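How many repeats are enough depends on how often the test fails. As a rough guide, assume each run fails independently with probability p (an idealization, since real flakiness often correlates with load or test order). The chance that at least one of N runs fails is then 1 - (1 - p)^N. The helpers below are illustrative and not part of Playwright:

```typescript
// Probability that at least one of `repeats` runs fails, assuming each run
// fails independently with probability `failureRate`.
function detectionProbability(failureRate: number, repeats: number): number {
  if (failureRate < 0 || failureRate > 1) {
    throw new RangeError("failureRate must be between 0 and 1");
  }
  if (!Number.isInteger(repeats) || repeats < 0) {
    throw new RangeError("repeats must be a non-negative integer");
  }
  return 1 - Math.pow(1 - failureRate, repeats);
}

// Smallest repeat count whose detection probability reaches `target`.
function repeatsNeeded(failureRate: number, target: number): number {
  if (failureRate <= 0 || failureRate > 1) {
    throw new RangeError("failureRate must be in (0, 1]");
  }
  if (target <= 0 || target >= 1) {
    throw new RangeError("target must be in (0, 1)");
  }
  if (failureRate === 1) return 1; // always fails, one run is enough
  return Math.ceil(Math.log(1 - target) / Math.log(1 - failureRate));
}

console.log(detectionProbability(0.1, 10).toFixed(3)); // 0.651
console.log(repeatsNeeded(0.1, 0.95)); // 29
```

Under that assumption, a test that fails 10% of the time is caught by --repeat-each=10 only about 65% of the time, and needs roughly 29 repeats for a 95% chance of showing at least one failure.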
Use it in a pre-merge check:
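# .github/workflows/flakiness-check.yml
name: Flakiness Check
on:
  pull_request:
    paths:
      - "tests/**"
      - "src/**"
jobs:
  flakiness:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # full history so git diff can see origin/main
      - uses: actions/setup-node@v4
      - run: npm ci && npx playwright install --with-deps
      - name: Run new/modified tests repeatedly
        run: |
          # Changed test files, excluding deleted ones
          CHANGED=$(git diff --name-only --diff-filter=d origin/main...HEAD -- tests/)
          if [ -z "$CHANGED" ]; then echo "No test files changed"; exit 0; fi
          # Files are passed as positional arguments (--grep filters by test title, not path).
          # --retries=0 so an intermittent failure fails the job instead of passing on retry.
          npx playwright test --repeat-each=5 --retries=0 $CHANGED
This runs only the test files that changed in the PR, 5 times each. If any run fails, the check fails, which blocks the PR when the check is required.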
test.fail(): Acknowledging Known Failures
test.fail() marks a test as expected to fail. Playwright still runs the test. If it fails, it's reported as an expected failure (green). If it passes, it's reported as an unexpected pass and the run goes red.
import { test, expect } from "@playwright/test";
test.fail("user can export data to CSV", async ({ page }) => {
  await page.goto("/settings");
  await page.click("[data-testid=export-csv]");
  await expect(page.locator(".download-started")).toBeVisible();
});
This is useful for:
- Known bugs that haven't been fixed yet — you keep the test in place, mark it expected-to-fail, and track the issue separately
- Features that are partially implemented — the test documents the expected behavior before it's complete
- Platform-specific failures — a feature that works in Chrome but has a known Firefox bug
The critical rule: test.fail() should be temporary. Link it to a bug ticket:
// TODO: Remove test.fail() when HEL-342 is fixed
// Issue: Export button unresponsive on Firefox 124
test.fail(
  ({ browserName }) => browserName === "firefox",
  "Export button broken on Firefox - tracked in HEL-342"
);
test("user can export data to CSV", async ({ page }) => {
  // test body
});
The conditional form of test.fail() takes a predicate and a reason. Called at the top level of a file or describe block, it applies to every test in that scope. The tests are expected to fail only when the predicate is true; in this case, only on Firefox.
test.fixme(): Skip with Intent to Fix
test.fixme() is similar to test.fail() but skips the test entirely:
test.fixme("user can delete account", async ({ page }) => {
  // This test will be skipped
});
Use test.fixme() when the test is broken and you're not sure yet whether to fix the test or the code. It signals "this needs attention" without cluttering CI output.
The semantic difference:
- test.fail(): the feature is broken, we know it, and the test documents the expected behavior
- test.fixme(): the test itself needs work before it can usefully run
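To make the difference concrete, here is a small model of how a single attempt ends up being reported under each annotation. The function and type names are illustrative, not Playwright's API; the outcome labels follow the expected/unexpected/skipped wording used in Playwright's reports.

```typescript
type Annotation = "none" | "fail" | "fixme";
type Outcome = "expected" | "unexpected" | "skipped";

// `bodyPassed` is whether the test body ran without errors.
// It is ignored for "fixme", because the body never runs.
function reportedOutcome(annotation: Annotation, bodyPassed: boolean): Outcome {
  switch (annotation) {
    case "fixme":
      return "skipped";
    case "fail":
      // Expected to fail: a failure is fine, a pass is the surprise.
      return bodyPassed ? "unexpected" : "expected";
    case "none":
      return bodyPassed ? "expected" : "unexpected";
  }
}

console.log(reportedOutcome("fail", false)); // expected
console.log(reportedOutcome("fail", true)); // unexpected
console.log(reportedOutcome("fixme", true)); // skipped
```

The practical consequence: a test.fail() test tells you when the bug gets fixed, because the unexpected pass turns the run red. A test.fixme() test stays silent until someone removes the annotation.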
Retry Hooks: Running Code on Each Retry Attempt
When a test is retried, you sometimes need to know which attempt you're on, whether to log extra debugging information or to skip expensive setup on retries:
import { test, expect } from "@playwright/test";
test("checkout flow", async ({ page }, testInfo) => {
  if (testInfo.retry > 0) {
    console.log(`Retry attempt ${testInfo.retry} for "${testInfo.title}"`);
  }
  await page.goto("/cart");
  await page.click("[data-testid=checkout]");
  await expect(page.locator(".order-confirmation")).toBeVisible();
});
testInfo.retry is 0 on the first attempt, 1 on the first retry, and so on. Use it to add extra logging or screenshots on retry attempts to help diagnose the root cause.
The Retry Abuse Trap
Retries are a mitigation, not a fix. Teams that set retries: 5 and move on are hiding flakiness, not solving it.
Signs you're abusing retries:
- Tests routinely need 2-3 retries to pass
- Your CI flakiness dashboard shows high retry rates but "green" results
- Developers don't investigate tests that pass on retry
A healthier approach: track retry counts. Alert when a test fails more than N% of its first-attempt runs over the past week. Treat retry-dependent tests as technical debt.
// playwright.config.ts
import { defineConfig } from "@playwright/test";
export default defineConfig({
  retries: 2,
  // Record a trace and video the first time a test is retried
  use: {
    trace: "on-first-retry",
    screenshot: "only-on-failure",
    video: "on-first-retry",
  },
});
With traces enabled on first retry, every test that needed a retry produces a full trace showing exactly what happened: network requests, DOM changes, console errors. Investigate those traces weekly. Fix the root cause. Reduce your retry rate over time.
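The first-attempt failure alert suggested above could be sketched roughly as follows. The RunRecord shape is hypothetical: it assumes you export one record per test per CI run from whatever reporter or results store you use, and it is not a Playwright type.

```typescript
// Hypothetical record: one entry per test per CI run over the chosen window.
interface RunRecord {
  testId: string;
  firstAttemptPassed: boolean;
}

interface FlakinessStat {
  testId: string;
  runs: number;
  firstAttemptFailures: number;
  failurePercent: number;
}

// Aggregate first-attempt failures per test.
function firstAttemptStats(records: RunRecord[]): FlakinessStat[] {
  const byTest = new Map<string, { runs: number; failures: number }>();
  for (const record of records) {
    const entry = byTest.get(record.testId) ?? { runs: 0, failures: 0 };
    entry.runs += 1;
    if (!record.firstAttemptPassed) entry.failures += 1;
    byTest.set(record.testId, entry);
  }
  return Array.from(byTest, ([testId, { runs, failures }]) => ({
    testId,
    runs,
    firstAttemptFailures: failures,
    failurePercent: (failures / runs) * 100,
  }));
}

// Tests whose first attempt fails more often than `thresholdPercent`,
// worst offenders first.
function testsOverThreshold(records: RunRecord[], thresholdPercent: number): FlakinessStat[] {
  return firstAttemptStats(records)
    .filter((stat) => stat.failurePercent > thresholdPercent)
    .sort((a, b) => b.failurePercent - a.failurePercent);
}

const lastWeek: RunRecord[] = [
  { testId: "checkout flow", firstAttemptPassed: false },
  { testId: "checkout flow", firstAttemptPassed: true },
  { testId: "checkout flow", firstAttemptPassed: false },
  { testId: "checkout flow", firstAttemptPassed: true },
  { testId: "login", firstAttemptPassed: true },
  { testId: "login", firstAttemptPassed: true },
];

console.log(testsOverThreshold(lastWeek, 20).map((stat) => stat.testId)); // [ 'checkout flow' ]
```

Counting first attempts, not final results, is the point of the design: a test that always goes green on retry still shows up here.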
For monitoring test reliability in production — catching regressions as they happen rather than after they've been flaky for weeks — HelpMeTest provides continuous test execution with AI-powered failure analysis, distinguishing infrastructure noise from real regressions.