Test Retry Patterns: When to Retry Tests and When to Fix Them

Retrying failing tests is a controversial topic. Critics say it hides flakiness and trains developers to ignore failures. Proponents say some level of retry is pragmatic in distributed test environments. Both are right — the answer is about when and how you retry.

The Case Against Retries

Automatic retries mask flaky tests. When CI reruns a test and it passes, the failure disappears from your metrics. Developers never see it. The flaky test stays flaky indefinitely.

Retries also slow down CI. If 5% of tests retry and each retry adds 30 seconds, a 1,000-test suite pays for 50 retries, or 25 minutes of extra runner time per run. Multiply across many teams and CI instances, and this adds up.

Most importantly, a test that passes on the second try might be hiding a race condition, an infrastructure problem, or a real but intermittent bug. Making that failure invisible is dangerous.

The Case For Limited Retries

Not all test failures are created equal. Some failures are caused by:

  • Flaky external services in a staging environment
  • Race conditions in third-party infrastructure (not your code)
  • Transient network issues between test runners and services
  • Garbage collection pauses in JVMs causing timeouts

For these infrastructure-related failures, retrying once is pragmatic. The alternative is constant false alarms that developers learn to ignore — which is arguably worse than masking flakiness.

The key is transparency: retried tests should be visible in your metrics, not silently hidden.

Retry Strategies

Strategy 1: Retry Once, Log Always

The most defensible approach. Retry at most once, but always log the initial failure:

# conftest.py (pytest)
import pytest

@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()

    # Only the "call" phase reflects the test body itself
    if report.when != "call":
        return

    if report.failed:
        # Remember that this test has failed at least once
        item._first_failure = True
    elif report.passed and getattr(item, '_first_failure', False):
        # Passed after an earlier failure: the retry succeeded, so mark as flaky
        report.flaky = True

# pytest-rerunfailures
pip install pytest-rerunfailures

# Retry once, with 1 second delay
pytest tests/ --reruns 1 --reruns-delay 1
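
The "retry once, log always" rule can also be shown independently of any plugin. The following is a minimal sketch, not the plugin's implementation: the function name run_with_single_retry and the plain list used as a metrics sink are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("flaky")

def run_with_single_retry(test_fn, name, flaky_log):
    """Run test_fn; on failure, record it and retry exactly once.

    flaky_log is a plain list standing in for a real metrics sink (assumption).
    """
    try:
        test_fn()
        return "passed"
    except Exception as first_error:
        # Always record the first failure, even if the retry passes
        flaky_log.append({"test": name, "error": repr(first_error)})
        logger.warning("first attempt of %s failed: %r", name, first_error)

    # Second and final attempt: a failure here propagates to the caller
    test_fn()
    return "passed_on_retry"
```

The first failure is written to the log before the retry starts, so a test that passes on its second attempt still leaves a record behind.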

Strategy 2: Category-Based Retries

Apply retries only to tests tagged as potentially flaky for infrastructure reasons:

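# test_payments.py
import pytest

# Only retry tests marked as infrastructure-dependent
@pytest.mark.retry(max_attempts=3)
@pytest.mark.integration
def test_external_payment_webhook():
    # This test calls a real webhook endpoint that's sometimes slow
    ...

# conftest.py (requires pytest-rerunfailures)
import pytest

def pytest_configure(config):
    # Register the custom marker so pytest does not warn about it
    config.addinivalue_line("markers", "retry(max_attempts): rerun on failure")

def pytest_collection_modifyitems(items):
    # Translate the custom marker into the plugin's own flaky marker
    for item in items:
        retry_mark = item.get_closest_marker('retry')
        if retry_mark:
            max_attempts = retry_mark.kwargs.get('max_attempts', 3)
            item.add_marker(pytest.mark.flaky(reruns=max_attempts - 1))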

Strategy 3: Retry with Exponential Backoff

For tests that fail due to rate limits or eventual consistency:

import time
import pytest
from functools import wraps

def retry_with_backoff(max_attempts=3, base_delay=1):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as e:  # AssertionError is a subclass of Exception
                    if attempt == max_attempts - 1:
                        raise
                    delay = base_delay * (2 ** attempt)
                    print(f"\nTest failed (attempt {attempt + 1}/{max_attempts}). "
                          f"Retrying in {delay}s... Error: {e}")
                    time.sleep(delay)
        return wrapper
    return decorator

@retry_with_backoff(max_attempts=3, base_delay=1)
def test_cache_invalidation_propagates():
    # Eventual consistency — may take 1-3 seconds to propagate
    update_cache_key('user-123', {'name': 'Alice Updated'})
    
    cache_value = read_from_replica('user-123')
    assert cache_value['name'] == 'Alice Updated'
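
Before adopting backoff, it helps to know the worst-case wait it adds to a failing test. The sketch below reproduces the delay formula from the decorator above; the helper name backoff_delays is hypothetical and exists only for this illustration.

```python
def backoff_delays(max_attempts=3, base_delay=1):
    """Delays slept between attempts: one fewer than the number of attempts."""
    return [base_delay * (2 ** attempt) for attempt in range(max_attempts - 1)]

delays = backoff_delays(max_attempts=3, base_delay=1)
print(delays, sum(delays))  # [1, 2] 3
```

With three attempts and a one-second base delay, a test that never passes spends three seconds sleeping before the final failure is raised. Each extra attempt doubles the last delay, so keep max_attempts small.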

Framework-Specific Retry Configuration

Jest (JavaScript)

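// jest.config.js
module.exports = {
  testTimeout: 10000,
  // Jest has no retry option in its config file. Call jest.retryTimes()
  // from a setup file instead (requires the jest-circus runner, the default since Jest 27)
  setupFilesAfterEnv: ['<rootDir>/jest.setup.js'],
};

// jest.setup.js: applies to every test file
jest.retryTimes(1);

// Per-file retry: call at the top of a test file
jest.retryTimes(3);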

test('async operation completes', async () => {
  // This test will retry up to 3 times
  await expect(asyncOperation()).resolves.toBe('success');
});

Cypress

// cypress.config.js
module.exports = {
  retries: {
    runMode: 2,    // Retry up to 2 times in CI
    openMode: 0    // No retries in interactive mode (you're watching)
  }
};

// Per-test
it('loads the dashboard', { retries: 3 }, () => {
  cy.visit('/dashboard');
  cy.get('[data-testid="dashboard"]').should('be.visible');
});

Playwright

// playwright.config.js
module.exports = {
  retries: process.env.CI ? 2 : 0,
  // Only retry in CI; locally you want to see failures immediately
};

GitHub Actions

# Retry the whole test command (crude but sometimes necessary)
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        uses: nick-fields/retry@v2
        with:
          timeout_minutes: 30
          max_attempts: 3
          command: npm test

Measuring Retry Impact

Track retry metrics to understand your flakiness burden:

# Generate retry report from test results
import xml.etree.ElementTree as ET

def analyze_retries(junit_xml_path):
    """Parse JUnit XML and find retried tests."""
    tree = ET.parse(junit_xml_path)
    root = tree.getroot()

    retried = []
    total_tests = 0
    for testcase in root.iter('testcase'):
        total_tests += 1
        # Assumes the report records a rerun count in a 'reruns' attribute;
        # check what your reporter actually emits and adjust the lookup
        reruns = int(testcase.get('reruns', 0))
        if reruns > 0:
            failed = (testcase.find('failure') is not None
                      or testcase.find('error') is not None)
            retried.append({
                'name': testcase.get('name'),
                'reruns': reruns,
                'final_status': 'failed' if failed else 'passed'
            })

    print(f"Tests that required retries: {len(retried)}")
    for t in retried:
        print(f"  {t['reruns']} retry(ies), final={t['final_status']}: {t['name']}")

    # Alert if retry rate exceeds threshold. Count <testcase> elements directly,
    # because the root may be <testsuites>, which need not carry a 'tests' attribute
    retry_rate = len(retried) / total_tests if total_tests else 0
    if retry_rate > 0.05:
        print(f"WARNING: Retry rate {retry_rate:.1%} exceeds 5% threshold")

Set up a dashboard showing retry trends over time. A rising retry rate suggests that your test infrastructure, or the tests themselves, are degrading.
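
A trend needs retry counts grouped by time period. The sketch below is one possible aggregation, assuming each CI run is summarized as a (run_date, total_tests, retried_tests) tuple. That record shape and the function name weekly_retry_rates are illustrative assumptions, not part of any tool.

```python
from collections import defaultdict
from datetime import date

def weekly_retry_rates(runs):
    """Aggregate (run_date, total_tests, retried_tests) tuples by ISO week."""
    totals = defaultdict(lambda: [0, 0])  # (year, week) -> [retried, total]
    for run_date, total_tests, retried_tests in runs:
        iso = run_date.isocalendar()
        week = (iso[0], iso[1])
        totals[week][0] += retried_tests
        totals[week][1] += total_tests
    # Skip weeks with no tests to avoid dividing by zero
    return {
        week: retried / total
        for week, (retried, total) in sorted(totals.items())
        if total
    }

runs = [
    (date(2024, 1, 1), 1000, 10),
    (date(2024, 1, 3), 1000, 20),
    (date(2024, 1, 8), 1000, 40),
]
for week, rate in weekly_retry_rates(runs).items():
    print(week, f"{rate:.1%}")
```

Plotting these weekly rates gives the trend line. A week-over-week increase is the signal to investigate.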

When Retries Are Never Acceptable

Some failures should never be retried:

Deterministic business logic tests. If calculate_tax(100.00) returns 8.00 instead of 7.50, retrying won't help. The code is wrong.

Authorization failures. If a user can access a resource they shouldn't, retrying will just confirm the security hole.

Data integrity violations. If an operation creates a duplicate record, retrying might create another one.

Any test in a local development environment. Retries locally mask bugs you should be fixing immediately. Disable retries in development mode.

// Never retry locally
const retries = process.env.CI ? 2 : 0;
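
The same rules can be collected into a single policy function that a test harness consults. This is a sketch under assumptions: the category labels and the name allowed_retries are hypothetical, and the CI environment variable is assumed to be set by your CI system, as in the snippet above.

```python
import os

# Hypothetical category labels; map them to your own markers or tags
NEVER_RETRY = {"business_logic", "authorization", "data_integrity"}

def allowed_retries(category, ci_retries=1, environ=None):
    """Return how many retries a test in this category may use."""
    environ = os.environ if environ is None else environ
    if not environ.get("CI"):
        return 0  # local runs: surface every failure immediately
    if category in NEVER_RETRY:
        return 0  # retrying cannot make these failures go away legitimately
    return ci_retries
```

Keeping the policy in one place makes the exclusions explicit and easy to review, rather than scattering them across per-test configuration.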

The Right Mental Model

Think of retries as a safety valve, not a solution. They're for the gap between "we know this is flaky" and "we've fixed the root cause."

The correct process:

  1. Test fails → retry once automatically
  2. Retry logged → flaky test tracking system records this failure
  3. Threshold exceeded → alert fires (e.g., 3 retries in a week for one test)
  4. Team investigates → root cause found, test fixed or quarantined
  5. Fix verified → test no longer retries

Without steps 2 through 4, retries are just hiding problems. With them, retries are a reasonable buffer while you work through your flakiness backlog.
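
Steps 2 and 3 of the process above can be sketched as a small in-memory tracker. The class name FlakyTracker and its storage are assumptions for illustration. A real system would persist events and send the alert somewhere visible.

```python
from collections import defaultdict
from datetime import datetime, timedelta

class FlakyTracker:
    """Record retries per test and flag tests that exceed a weekly threshold."""

    def __init__(self, threshold=3, window=timedelta(days=7)):
        self.threshold = threshold
        self.window = window
        self._events = defaultdict(list)  # test name -> retry timestamps

    def record_retry(self, test_name, when):
        """Record one retry; return True when the alert threshold is reached."""
        cutoff = when - self.window
        # Keep only retries that fall inside the sliding window
        recent = [t for t in self._events[test_name] if t > cutoff]
        recent.append(when)
        self._events[test_name] = recent
        return len(recent) >= self.threshold

tracker = FlakyTracker()
start = datetime(2024, 1, 1)
for day in (0, 2, 4):
    alert = tracker.record_retry("test_checkout", start + timedelta(days=day))
print(alert)  # True: third retry within seven days
```

The defaults match the example threshold of three retries in a week for one test. Retries older than the window drop out, so a test that was fixed stops alerting on its own.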

The goal is a test suite with zero retries — not because you disabled them, but because your tests are stable enough not to need them.
