Testing Webhook Retry Logic: Failures, Backoffs, and Idempotency

Webhook providers retry failed deliveries. Your handler will receive duplicate events. Testing retry behavior — and ensuring your handler survives it — is as important as testing the happy path.

This guide covers testing webhook retry logic from both sides: testing your handler's idempotency under retries, and testing your own retry delivery system if you send webhooks.

Understanding Provider Retry Behavior

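Most major providers retry when your endpoint returns a non-2xx response or times out. Schedules differ and change over time, so confirm the details in each provider's current documentation:

Provider   Retries                                   Backoff strategy
Stripe     Up to 3 days (live mode)                  Exponential
GitHub     None automatic; redeliver manually/API    None
Shopify    Until 19th attempt                        Exponential over 48h
Svix       Configurable                              Exponential with jitter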

The key implication: if your handler returns 500, it will get called again. If it's not idempotent, you get double fulfillments, duplicate emails, or corrupted data.
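
To make the backoff column concrete, here is a minimal sketch of how an exponential schedule with optional jitter can be computed. The name `backoffDelayMs` and its options are illustrative assumptions, not any provider's actual algorithm.

```javascript
// Hypothetical helper: delay in ms before retry number `attempt` (1-based).
// The nominal delay is baseMs * factor^(attempt - 1), capped at maxMs.
function backoffDelayMs(
  attempt,
  { baseMs = 1000, factor = 2, maxMs = 60_000, jitter = 0, random = Math.random } = {}
) {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError('attempt must be a positive integer');
  }
  const nominal = Math.min(maxMs, baseMs * factor ** (attempt - 1));
  // jitter = 0.2 spreads the result uniformly over 80%..120% of the nominal
  // delay, so many failing deliveries do not all retry at the same instant.
  const spread = nominal * jitter;
  return Math.round(nominal - spread + 2 * spread * random());
}

// Nominal schedule without jitter
const schedule = [1, 2, 3, 4].map((n) => backoffDelayMs(n, { baseMs: 100 }));
console.log(schedule); // [ 100, 200, 400, 800 ]
```

Passing `random` as an option keeps the jittered path deterministic in tests.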

Testing Idempotent Handlers

The most important webhook test category: send the same payload twice, assert the outcome is identical to sending once.

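const request = require('supertest');
// `app` (the HTTP app under test) and `db` (a test database helper) are
// assumed to be imported from your own project.

describe('payment webhook idempotency', () => {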
  beforeEach(async () => {
    await db.clear();
  });

  it('processes duplicate webhook only once', async () => {
    const payload = {
      id: 'evt_1234567890',  // Stripe event ID — unique per event
      type: 'payment_intent.succeeded',
      data: { object: { id: 'pi_abc123', amount: 5000 } }
    };

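    // First delivery. Set the content type explicitly: supertest does not
    // treat a pre-serialized string body as JSON by default.
    const res1 = await request(app)
      .post('/webhooks/stripe')
      .set('Content-Type', 'application/json')
      .send(JSON.stringify(payload));

    // Second delivery (retry simulation)
    const res2 = await request(app)
      .post('/webhooks/stripe')
      .set('Content-Type', 'application/json')
      .send(JSON.stringify(payload));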

    expect(res1.status).toBe(200);
    expect(res2.status).toBe(200);

    // Side effect should happen exactly once
    const orders = await db.query(
      'SELECT * FROM fulfilled_orders WHERE payment_id = $1',
      ['pi_abc123']
    );
    expect(orders.rows).toHaveLength(1);
  });
});

The standard approach is a deduplication table (processed_events in the example below): store the ID of every processed event and skip any event whose ID is already recorded:

async function handleStripeWebhook(event) {
  // Check if already processed
  const exists = await db.query(
    'SELECT id FROM processed_events WHERE event_id = $1',
    [event.id]
  );
  if (exists.rows.length > 0) {
    return { status: 'duplicate', skipped: true };
  }

  // Process event
  await processPayment(event.data.object);

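  // Mark as processed. In production, run the check, the processing, and this
  // insert in one transaction, with a unique constraint on event_id, so a
  // crash or a concurrent duplicate cannot leave them out of sync.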
  await db.query(
    'INSERT INTO processed_events (event_id, processed_at) VALUES ($1, NOW())',
    [event.id]
  );

  return { status: 'processed' };
}
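
A check followed by a later insert leaves a window in which two concurrent deliveries of the same event can both pass the check. One way to close it is to claim the event ID atomically before processing and release the claim if processing fails. The sketch below shows the idea with an in-memory stand-in for the table; `InMemoryEventStore`, `createHandler`, and `demo` are hypothetical names, not part of the handler above.

```javascript
// In-memory stand-in for a processed_events table with a unique event_id.
class InMemoryEventStore {
  constructor() {
    this.claimed = new Set();
  }

  // Returns true only for the first caller per event ID, similar in spirit to
  // an insert that is rejected by a unique constraint.
  claim(eventId) {
    if (this.claimed.has(eventId)) return false;
    this.claimed.add(eventId);
    return true;
  }

  release(eventId) {
    this.claimed.delete(eventId);
  }
}

function createHandler({ store, fulfill }) {
  return async function handle(event) {
    if (!store.claim(event.id)) {
      return { status: 'duplicate', skipped: true };
    }
    try {
      await fulfill(event.data.object);
      return { status: 'processed' };
    } catch (err) {
      // Release the claim so the provider's retry is processed normally
      store.release(event.id);
      throw err;
    }
  };
}

// Two deliveries of the same event arrive while the first is still in flight
async function demo() {
  let fulfillCount = 0;
  const handle = createHandler({
    store: new InMemoryEventStore(),
    fulfill: async () => {
      await new Promise((resolve) => setTimeout(resolve, 10));
      fulfillCount++;
    },
  });
  const event = { id: 'evt_race', data: { object: { id: 'pi_race' } } };
  const results = await Promise.all([handle(event), handle(event)]);
  return { fulfillCount, statuses: results.map((r) => r.status) };
}

demo().then(console.log);
```

With a real database the usual equivalent is a unique constraint on event_id plus an insert inside the same transaction as the processing, so that a rollback removes the claim automatically.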

Testing the Deduplication Table

Test the deduplication mechanism directly:

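describe('event deduplication', () => {
  beforeEach(async () => {
    await db.clear();
  });

  // Restore spies so call counts do not leak from one test into the next
  afterEach(() => {
    jest.restoreAllMocks();
  });
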
  it('inserts event ID on first call', async () => {
    await handleStripeWebhook({ id: 'evt_new', type: 'payment_intent.succeeded', data: {} });

    const row = await db.query('SELECT * FROM processed_events WHERE event_id = $1', ['evt_new']);
    expect(row.rows).toHaveLength(1);
  });

  it('does not re-process on second call', async () => {
    const processSpy = jest.spyOn(paymentService, 'fulfill');

    await handleStripeWebhook({ id: 'evt_dup', type: 'payment_intent.succeeded', data: {} });
    await handleStripeWebhook({ id: 'evt_dup', type: 'payment_intent.succeeded', data: {} });

    expect(processSpy).toHaveBeenCalledTimes(1);
  });

  it('processes different event IDs independently', async () => {
    const processSpy = jest.spyOn(paymentService, 'fulfill');

    await handleStripeWebhook({ id: 'evt_001', type: 'payment_intent.succeeded', data: {} });
    await handleStripeWebhook({ id: 'evt_002', type: 'payment_intent.succeeded', data: {} });

    expect(processSpy).toHaveBeenCalledTimes(2);
  });
});

Testing Partial Failure and Recovery

What happens if your handler crashes halfway through? The provider retries. Test that partial state is handled:

it('handles partial failure correctly', async () => {
  // Simulate a DB failure on the first delivery only; later calls reach the
  // real implementation.
  const fulfillSpy = jest
    .spyOn(paymentService, 'fulfill')
    .mockImplementationOnce(async () => {
      throw new Error('Database connection lost');
    });

  const payload = {
    id: 'evt_partial',
    type: 'payment_intent.succeeded',
    data: { object: { id: 'pi_partial', amount: 5000 } }
  };
  const deliver = () =>
    request(app)
      .post('/webhooks/stripe')
      .set('Content-Type', 'application/json')
      .send(JSON.stringify(payload));

  try {
    // First call fails and returns 500, so the provider will retry
    const res1 = await deliver();
    expect(res1.status).toBe(500);

    // Second call (retry) should succeed
    const res2 = await deliver();
    expect(res2.status).toBe(200);

    // Order fulfilled exactly once despite the failure + retry.
    // Query by payment ID (not event ID), as in the idempotency test.
    const orders = await db.query(
      'SELECT * FROM fulfilled_orders WHERE payment_id = $1',
      ['pi_partial']
    );
    expect(orders.rows).toHaveLength(1);
  } finally {
    fulfillSpy.mockRestore();
  }
});

Testing Your Own Retry Logic

If your service sends webhooks, test the retry behavior itself.
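
For orientation, here is a minimal sketch of the kind of sender such tests target. It is an assumption about the shape of a delivery module, not a reference implementation: `createWebhookDelivery`, the injected `post(url, payload)` transport (which must resolve to `{ status }` for every HTTP status), and the injected `sleep(ms)` are all hypothetical. Here `maxRetries` counts retries after the first attempt.

```javascript
// Hypothetical sender with exponential backoff and no retries on 4xx.
function createWebhookDelivery({
  post,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
}) {
  return {
    async send({ url, payload, maxRetries = 3, initialDelay = 1000 }) {
      let attempts = 0;
      let lastStatus = null;
      while (attempts <= maxRetries) {
        attempts++;
        const res = await post(url, payload);
        lastStatus = res.status;
        if (res.status >= 200 && res.status < 300) {
          return { status: 'delivered', attempts };
        }
        // A 4xx means the request itself was rejected; retrying will not help
        if (res.status >= 400 && res.status < 500) break;
        // Wait initialDelay, then 2x, 4x, ... before the next attempt
        if (attempts <= maxRetries) {
          await sleep(initialDelay * 2 ** (attempts - 1));
        }
      }
      return { status: 'failed', attempts, lastStatus };
    },
  };
}

// Usage: the receiver fails twice, then accepts. Recording the requested
// delays instead of sleeping keeps the check fast and deterministic.
async function demo() {
  const delays = [];
  const statuses = [500, 500, 200];
  const delivery = createWebhookDelivery({
    post: async () => ({ status: statuses.shift() }),
    sleep: async (ms) => {
      delays.push(ms);
    },
  });
  const result = await delivery.send({
    url: 'https://customer.example.com/webhook',
    payload: { event: 'order.created', id: 'ord_123' },
    initialDelay: 100,
  });
  return { result, delays };
}

demo().then(console.log);
```

Injecting `sleep` is a design choice that lets a test assert on the exact delays requested, which avoids the timing sensitivity of measuring wall-clock gaps. The Jest tests that follow take the other common route and mock the HTTP layer with axios-mock-adapter.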

const axios = require('axios');
const MockAdapter = require('axios-mock-adapter');
// `webhookDelivery` is your own sending module, assumed here to use axios

describe('webhook delivery with retries', () => {
  let mock;

  beforeEach(() => {
    mock = new MockAdapter(axios);
  });

  afterEach(() => mock.restore());

  it('retries on 500 response', async () => {
    let attempts = 0;
    mock.onPost('https://customer.example.com/webhook').reply(() => {
      attempts++;
      return attempts < 3 ? [500, 'Server Error'] : [200, 'OK'];
    });

    await webhookDelivery.send({
      url: 'https://customer.example.com/webhook',
      payload: { event: 'order.created', id: 'ord_123' },
      maxRetries: 3
    });

    expect(attempts).toBe(3);
  });

  it('does not retry on 400 (client error)', async () => {
    let attempts = 0;
    mock.onPost('https://customer.example.com/webhook').reply(() => {
      attempts++;
      return [400, 'Bad Request'];
    });

    await webhookDelivery.send({
      url: 'https://customer.example.com/webhook',
      payload: { event: 'order.created' },
      maxRetries: 5
    });

    // 4xx = client error, don't retry
    expect(attempts).toBe(1);
  });

  it('respects exponential backoff timing', async () => {
    const deliveryTimes = [];
    mock.onPost('https://customer.example.com/webhook').reply(() => {
      deliveryTimes.push(Date.now());
      return deliveryTimes.length < 3 ? [500] : [200];
    });

    await webhookDelivery.send({
      url: 'https://customer.example.com/webhook',
      payload: {},
      maxRetries: 3,
      initialDelay: 100  // 100ms for testing (not real 1h)
    });

    const gap1 = deliveryTimes[1] - deliveryTimes[0];
    const gap2 = deliveryTimes[2] - deliveryTimes[1];

    // Second gap should be ~2x first gap (exponential)
    expect(gap2).toBeGreaterThan(gap1 * 1.5);
  });
});

Testing Timeout Handling

Webhooks should have a delivery timeout. Test that slow endpoints are handled:

it('marks delivery failed on timeout', async () => {
  mock.onPost('https://slow.example.com/webhook').timeout();

  const result = await webhookDelivery.send({
    url: 'https://slow.example.com/webhook',
    payload: { event: 'test' },
    timeout: 5000
  });

  expect(result.status).toBe('failed');
  expect(result.error).toMatch(/timeout/i);
});
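
On the implementation side, one way to enforce a delivery timeout is to race the request against a timer. The sketch below is an assumption about how that could look; `withTimeout`, `sendWithTimeout`, and the injected `post` transport are hypothetical names.

```javascript
// Hypothetical helper: reject if `work` has not settled within `ms`.
function withTimeout(work, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timeout of ${ms}ms exceeded`)), ms);
  });
  // Clear the timer either way so it does not keep the process alive
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Turns a timeout (or any transport error) into a failed delivery result
async function sendWithTimeout(post, { url, payload, timeout }) {
  try {
    const res = await withTimeout(post(url, payload), timeout);
    return { status: 'delivered', httpStatus: res.status };
  } catch (err) {
    return { status: 'failed', error: err.message };
  }
}

// Usage: an endpoint that never answers is reported as failed after 20 ms
const neverAnswers = () => new Promise(() => {});
sendWithTimeout(neverAnswers, {
  url: 'https://slow.example.com/webhook',
  payload: { event: 'test' },
  timeout: 20,
}).then(console.log);
```

Racing against a timer stops waiting but does not cancel the underlying request. If the HTTP client supports cancellation or a built-in timeout option, prefer that so the connection is actually released.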

End-to-End Retry Testing with HelpMeTest

For full integration testing, use HelpMeTest to verify retry behavior against a staging environment:

*** Test Cases ***
Webhook Retry Delivers After Initial Failure
    # Trigger webhook that will initially return 500
    Set Test Config    webhook_fail_count=2
    ${result}=    POST    ${BASE_URL}/api/trigger-test-webhook
    Sleep    5s    # Wait for retry attempts

    # Verify eventually delivered
    ${events}=    GET    ${BASE_URL}/api/delivered-events
    ${count}=     Get Length    ${events.json()}
    Should Be Equal As Integers    ${count}    1

The Retry Testing Checklist

  • Same event ID processed exactly once (idempotency)
  • Handler returns 500 on failure (so provider retries)
  • Partial failures don't leave corrupt state
  • Retry delivery uses exponential backoff
  • 4xx responses don't trigger retries (the receiver rejected the request itself, so resending it won't help)
  • Delivery timeout is enforced
  • Retry exhaustion sends alert or queues for manual review
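
The last checklist item can be sketched as follows: once every attempt has failed, the event is parked for manual review and an alert is raised. All names here (`deliverOrDeadLetter`, `attemptDelivery`, `deadLetters`, `alert`) are hypothetical, the array stands in for a real queue or table, and backoff between attempts is left out to keep the example short.

```javascript
// Hypothetical sketch: after maxAttempts failed deliveries, record the event
// in a dead-letter list and send an alert.
async function deliverOrDeadLetter({ event, attemptDelivery, maxAttempts, deadLetters, alert }) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // attemptDelivery resolves to true when the receiver accepted the event
    if (await attemptDelivery(event, attempt)) {
      return { status: 'delivered', attempts: attempt };
    }
  }
  deadLetters.push({ event, attempts: maxAttempts });
  await alert(`Webhook ${event.id} undelivered after ${maxAttempts} attempts`);
  return { status: 'exhausted', attempts: maxAttempts };
}

// Usage: the receiver is down for every attempt
async function demo() {
  const deadLetters = [];
  const alerts = [];
  const result = await deliverOrDeadLetter({
    event: { id: 'ord_123', type: 'order.created' },
    attemptDelivery: async () => false,
    maxAttempts: 3,
    deadLetters,
    alert: async (message) => {
      alerts.push(message);
    },
  });
  return { result, deadLetters, alerts };
}

demo().then(console.log);
```

A test for this path asserts three things: the result reports exhaustion, exactly one dead-letter entry exists, and exactly one alert was sent.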

Summary

Retry logic is where most webhook bugs hide. A handler that works perfectly on first delivery can cause serious problems when retried: double charges, duplicate notifications, or inconsistent state. Test idempotency explicitly, test failure and recovery paths, and test your own retry delivery if you're on the sending side.

The pattern is simple: store processed event IDs, check before processing, skip duplicates. Test this rigorously and your webhook handler will survive any retry storm.
