Debugging Webhook Failures in CI: A Systematic Approach
Webhook failures in CI are some of the most frustrating bugs to debug. The test passes locally but fails in CI. The webhook arrives but nothing happens. Logs show 200 OK but the feature doesn't work.
This guide gives you a systematic approach for diagnosing and fixing webhook failures in CI pipelines.
The Debugging Hierarchy
Start with the simplest explanation and work up:
- Is the webhook arriving? — Check request logs at your endpoint
- Is the signature valid? — Log verification result before any processing
- Is the payload correct? — Log the deserialized payload
- Did the handler run? — Add entry/exit logs to your handler
- Did side effects complete? — Log database writes, queue publishes, API calls
- Did the response go out? — Log the status code returned
If you can answer each question from your logs, you can pinpoint the stage at which a failure occurs.
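For the signature question, it helps if verification returns a reason rather than a bare boolean, so the log line says why it failed. The following is a minimal sketch, assuming a GitHub-style x-hub-signature-256 header of the form sha256=<hex HMAC-SHA256 of the raw body>. The function name verifySignature and its reason strings are hypothetical, not part of any library.

```javascript
const crypto = require('crypto');

// Hypothetical helper: checks a "sha256=<hex>" header against the raw body
// and reports why verification failed, so the result can be logged as-is.
function verifySignature(rawBody, header, secret) {
  if (!secret) return { valid: false, reason: 'secret_missing' };
  if (!header) return { valid: false, reason: 'header_missing' };

  const expected =
    'sha256=' + crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
  const received = Buffer.from(header);
  const computed = Buffer.from(expected);

  // timingSafeEqual throws when lengths differ, so compare lengths first.
  const valid =
    received.length === computed.length &&
    crypto.timingSafeEqual(received, computed);

  return { valid, reason: valid ? 'ok' : 'digest_mismatch' };
}
```

Logging the returned object before any processing separates a missing secret or header from a genuine digest mismatch, which are different problems with different fixes.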
Setting Up Webhook Request Logging
Add structured logging at the entry point of every webhook handler:
const crypto = require('crypto');
app.post('/webhooks/:provider', async (req, res) => {
  const requestId = crypto.randomUUID();
  const logger = createLogger({ requestId, provider: req.params.provider });
  logger.info('webhook_received', {
    headers: {
      'content-type': req.headers['content-type'],
      'x-request-id': req.headers['x-request-id'],
      // Log only whether provider-specific signature headers are present, never their values
      'stripe-signature': req.headers['stripe-signature'] ? '[present]' : '[missing]',
      'x-hub-signature-256': req.headers['x-hub-signature-256'] ? '[present]' : '[missing]',
    },
    // req.body is a Buffer only with a raw body parser; otherwise fall back to Content-Length
    body_size: Buffer.isBuffer(req.body) ? req.body.length : Number(req.headers['content-length'] || 0),
    timestamp: new Date().toISOString()
  });
  try {
    const result = await handleWebhook(req.params.provider, req);
    logger.info('webhook_processed', { result });
    res.json({ received: true, requestId });
  } catch (error) {
    logger.error('webhook_failed', {
      error: error.message,
      stack: error.stack
    });
    res.status(500).json({ error: 'Processing failed', requestId });
  }
});
In CI, always print these logs, even on success. They are your debug trail if a later stage fails.
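The handler above calls createLogger without defining it. Any structured logger would do; as a rough sketch, assuming you only need info and error levels and one JSON object per line, it could look like this. The second parameter is an assumption added here so that output can be redirected in tests.

```javascript
// Hypothetical minimal JSON-lines logger. Every entry carries the shared
// context (for example requestId and provider) plus per-call fields.
function createLogger(context, write = (line) => process.stdout.write(line + '\n')) {
  const emit = (level) => (event, fields = {}) => {
    write(JSON.stringify({
      level,
      event,
      ...context,
      ...fields,
      time: new Date().toISOString(),
    }));
  };
  return { info: emit('info'), error: emit('error') };
}
```

One JSON object per line keeps the CI log searchable: filtering on a single requestId shows the full trail for one webhook delivery.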
Capturing Webhook Payloads in CI
The hardest part of debugging webhook CI failures is that you can't easily replay them. Solve this by persisting captured payloads:
const fs = require('fs');
// Capture middleware: write each webhook request to a file in CI
app.post('/webhooks/*', (req, res, next) => {
  if (process.env.CI) {
    // Create the capture directory on first use; writeFileSync does not create it
    fs.mkdirSync('webhook-captures', { recursive: true });
    const filename = `webhook-captures/${Date.now()}-${req.path.replace(/\//g, '_')}.json`;
    fs.writeFileSync(filename, JSON.stringify({
      path: req.path,
      headers: req.headers,
      body: req.body
    }, null, 2));
  }
  next();
});
In your GitHub Actions workflow, upload captured webhooks as artifacts:
- name: Run webhook integration tests
  run: npm test -- --grep "webhook"
- name: Upload webhook captures on failure
  if: failure()
  uses: actions/upload-artifact@v4
  with:
    name: webhook-captures
    path: webhook-captures/
    retention-days: 7
When a CI run fails, download the artifacts and replay the exact payload locally.
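A replay is only exact if the capture holds the bytes that were actually sent. As a sketch, assuming your framework exposes the raw body as a Buffer (for example through express.raw or the verify option of express.json), the capture can store those bytes as base64. The helper names toCaptureRecord and fromCaptureRecord and the bodyBase64 field are hypothetical.

```javascript
// Hypothetical helpers: store the raw request bytes as base64 so that a
// capture survives the JSON file round trip without any byte changing.
function toCaptureRecord({ path, headers, rawBody }) {
  return {
    path,
    headers,
    bodyBase64: Buffer.from(rawBody).toString('base64'),
  };
}

// Rebuild the original bytes from a record read back from disk.
function fromCaptureRecord(record) {
  return {
    path: record.path,
    headers: record.headers,
    rawBody: Buffer.from(record.bodyBase64, 'base64'),
  };
}
```

Sending rawBody unchanged during replay keeps the captured signature header valid, subject to any timestamp tolerance the provider's verification enforces.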
Replaying Captured Payloads
Once you have a captured payload, replay it against your local server:
# Start your server
node server.js &
# Replay the captured webhook
cat webhook-captures/1234567890-_webhooks_stripe.json | \
  node scripts/replay-webhook.js
// scripts/replay-webhook.js
const fs = require('fs');
const http = require('http');
// Read the capture from stdin (file descriptor 0)
const capture = JSON.parse(fs.readFileSync(0, 'utf8'));
// Re-serializing may not reproduce the original bytes, so a signed payload can fail verification on replay
const body = JSON.stringify(capture.body);
const req = http.request({
  hostname: 'localhost',
  port: 3000,
  path: capture.path,
  method: 'POST',
  headers: {
    ...capture.headers,
    host: 'localhost:3000',
    // Recompute the length for the re-serialized body; the captured value may no longer match
    'content-length': Buffer.byteLength(body)
  }
}, (res) => {
  console.log(`Status: ${res.statusCode}`);
  res.pipe(process.stdout);
});
req.on('error', (err) => console.error('Replay failed:', err.message));
req.write(body);
req.end();
Common CI-Specific Failure Patterns
1. Signature Verification Fails Due to Body Parsing
Symptom: Signature valid locally, invalid in CI.
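Cause: The signature is computed over the exact bytes the provider sent. If a JSON body parser runs before verification, or a reverse proxy in the CI environment rewrites the request body, the bytes your handler reads no longer match the bytes that were signed.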
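To see why a byte-level difference matters, here is a small demonstration. It assumes an HMAC-SHA256 signature over the body; the secret and payload are made up for illustration.

```javascript
const crypto = require('crypto');

// HMAC-SHA256 of the given bytes, hex encoded.
const sign = (secret, bytes) =>
  crypto.createHmac('sha256', secret).update(bytes).digest('hex');

// Bytes as a provider might send them: note the spaces after ":" and ",".
const rawBody = '{"id": 42, "type": "payment.succeeded"}';

// What a handler ends up with if it parses the JSON and serializes it again.
const reserialized = JSON.stringify(JSON.parse(rawBody));

const secret = 'whsec_example'; // made-up secret for illustration

console.log(rawBody === reserialized);                             // false
console.log(sign(secret, rawBody) === sign(secret, reserialized)); // false
```

Both payloads parse to the same object, yet the signatures differ, which is why verification has to run against the raw bytes.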
Fix: Log the raw body bytes in CI and compare with the signature input:
app.post('/webhooks/stripe', express.raw({ type: '*/*' }), (req, res) => {
  if (process.env.CI) {
    console.log('Raw body length:', req.body.length);
    console.log('Raw body hash:', crypto.createHash('md5').update(req.body).digest('hex'));
  }
  // ... verify signature
});
2. Environment Variables Missing
Symptom: The webhook secret is undefined in CI, so signature verification always fails.
Fix: Add an env var check at startup and in your test setup:
// In your webhook handler
const webhookSecret = process.env.WEBHOOK_SECRET;
if (!webhookSecret) {
  throw new Error('WEBHOOK_SECRET environment variable is required');
}
# In GitHub Actions
- name: Run webhook tests
  run: npm test
  env:
    WEBHOOK_SECRET: ${{ secrets.WEBHOOK_SECRET }}
    # Fail loudly if secret is missing
3. Database Not Ready When Webhook Arrives
Symptom: Handler returns 500, logs show connection refused.
Fix: Add a readiness check before starting tests:
- name: Wait for database
  run: |
    until pg_isready -h localhost -p 5432; do
      echo "Waiting for database..."
      sleep 1
    done
- name: Run webhook tests
  run: npm test
4. Webhook URL Unreachable in CI
Symptom: Integration tests that wait for webhook callback never complete.
Cause: CI environment has no public URL for providers to call back.
Fix: Use smee.io as a proxy to forward webhook events to CI:
- name: Start smee proxy
  run: npx smee-client --url https://smee.io/YOUR_CHANNEL --path /webhooks/github --port 3000 &
- name: Run integration tests
  run: npm run test:integration
  env:
    WEBHOOK_PROXY_URL: https://smee.io/YOUR_CHANNEL
Or use HelpMeTest's proxy to expose your local test server:
helpmetest proxy start localhost:3000
# Use the provided public URL as your webhook endpoint
5. Race Condition: Test Asserts Before Webhook Arrives
Symptom: Test passes locally but fails intermittently in CI (flaky).
Cause: CI is slower; webhook processing takes longer than expected.
Fix: Never use fixed sleeps. Use polling with a timeout:
// Wrong: a fixed sleep is too short on a slow runner and wasted time on a fast one
await new Promise(resolve => setTimeout(resolve, 2000));
const result = await db.getResult();
expect(result).toBeDefined();
// Right: poll until the result exists, and fail only after the timeout
const result = await waitForCondition(
  () => db.getResult(),
  { timeout: 10000, interval: 200 }
);
expect(result).toBeDefined();
Structured CI Test Output for Webhook Tests
Make webhook test failures self-documenting:
afterEach(async () => {
  // testFailed and capturedWebhooks are assumed to be maintained by your test setup
  if (testFailed) {
    console.log('=== WEBHOOK DEBUG INFO ===');
    console.log('Received webhooks:', JSON.stringify(capturedWebhooks, null, 2));
    console.log('Database state:', JSON.stringify(await db.dump(), null, 2));
    console.log('Queue messages:', JSON.stringify(await queue.drain(), null, 2));
    console.log('=========================');
  }
});
End-to-End Monitoring with HelpMeTest
For ongoing monitoring of webhook endpoints, HelpMeTest can run continuous tests against your staging environment:
*** Test Cases ***
Webhook Endpoint Returns 400 For Invalid Signature
    ${payload}=    Set Variable    {"type": "test.event"}
    ${response}=    POST    ${WEBHOOK_URL}/stripe
    ...    headers={"stripe-signature": "v1=invalid"}
    ...    data=${payload}
    Should Be Equal As Integers    ${response.status_code}    400
Webhook Endpoint Returns 200 For Valid Payload
    ${payload}=    Get File    fixtures/stripe_payment_succeeded.json
    ${sig}=    Build Stripe Signature    ${payload}    ${STRIPE_WEBHOOK_SECRET}
    ${response}=    POST    ${WEBHOOK_URL}/stripe
    ...    headers={"stripe-signature": "${sig}"}
    ...    data=${payload}
    Should Be Equal As Integers    ${response.status_code}    200
Run these as health checks on a schedule to detect regressions before they reach production.
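Build Stripe Signature in the test above is a custom keyword, not a built-in one, and its implementation is not shown. As a sketch of the logic it would need, assuming Stripe's documented scheme (an HMAC-SHA256 over the timestamp, a dot, and the raw payload, sent as t=<timestamp>,v1=<hex digest>), the equivalent in JavaScript could look like this. The function name buildStripeSignature is hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical helper: builds a Stripe-style signature header for a test
// payload. The timestamp defaults to the current time in seconds.
function buildStripeSignature(payload, secret, timestamp = Math.floor(Date.now() / 1000)) {
  const signedPayload = `${timestamp}.${payload}`;
  const digest = crypto
    .createHmac('sha256', secret)
    .update(signedPayload, 'utf8')
    .digest('hex');
  return `t=${timestamp},v1=${digest}`;
}
```

Generate the header at request time rather than storing it with the fixture, since verification typically rejects timestamps outside a tolerance window.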
The Webhook CI Debugging Checklist
When a webhook test fails in CI, work through this list:
- Is the webhook request actually being made? (Check request logs)
- Does the signature header exist? (Log header presence)
- Is the secret available in CI? (Check env vars)
- Is the raw body untampered? (Log body hash)
- Is the database ready? (Add readiness check)
- Is the webhook URL reachable? (Use smee.io or HelpMeTest proxy)
- Are you waiting long enough? (Use polling, not fixed sleeps)
- Did the handler complete? (Log entry/exit)
- Are captured payloads available? (Upload as CI artifacts)
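The polling item relies on a helper such as waitForCondition, which is not part of Node.js or the common test frameworks. A minimal sketch, assuming the condition counts as met once the check returns anything other than undefined, null, or false, and that errors thrown while polling should be retried until the timeout:

```javascript
// Hypothetical polling helper: calls check() every `interval` ms until it
// returns a usable value, or throws once `timeout` ms have elapsed.
async function waitForCondition(check, { timeout = 10000, interval = 200 } = {}) {
  const deadline = Date.now() + timeout;
  let lastError;

  while (true) {
    try {
      const value = await check();
      if (value !== undefined && value !== null && value !== false) {
        return value;
      }
    } catch (error) {
      lastError = error; // keep polling; the resource may not exist yet
    }

    if (Date.now() >= deadline) {
      const detail = lastError ? `: ${lastError.message}` : '';
      throw new Error(`Condition not met within ${timeout}ms${detail}`);
    }
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}
```

Unlike a fixed sleep, this returns as soon as the webhook has been processed, and when it does fail, the timeout error carries the message of the last error the check threw, if any.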
Summary
Webhook failures in CI almost always fall into a few categories: signature verification issues from body parsing, missing environment variables, race conditions from fixed sleeps, or the webhook URL being unreachable. The fix for all of them is the same: add structured logging, capture payloads as CI artifacts, and use polling instead of sleeps.
Once you can replay captured payloads locally and have structured debug output in CI, webhook failures stop being mysterious and start being straightforward to fix.