Sanity Testing After Production Deployments: A Practical Strategy

HelpMeTest

22 May 2026 — 6 min read

You've deployed to production. The deployment logs say success. The pod is running. The load balancer shows healthy. Everything looks fine.

But "looks fine" and "works correctly" are different things. Production deployments fail in ways that staging doesn't predict: environment variable differences, CDN caching, database migration side effects, third-party service configuration, infrastructure that wasn't fully replicated in staging.

Sanity testing after deployment is the practice of deliberately verifying that the specific things you changed actually work in production before declaring the deployment successful. It's fast, targeted, and different from both smoke testing and regression testing.

What Makes Post-Deployment Sanity Testing Different

Smoke tests, as the term is used here, are broad, shallow checks that run before the production deployment to decide whether to proceed. They're automated gates.

Sanity tests run after production deployment to confirm the change landed correctly. They're targeted verification, focused on exactly what was modified.

The distinction matters because:

You know what changed. A deployment contains specific commits. Sanity testing verifies those specific changes work in production — not the whole application.
Production is different from staging. Infrastructure configurations, secrets, CDN behavior, database state — all of these can cause a working staging build to fail in production. Sanity testing catches this.
Rollback windows are short. Most teams have a brief window after deployment where rollback is fast and safe. After that, data changes, caches warm, and rolling back gets complicated. Sanity testing within that window lets you catch problems while rollback is still clean.

The Sanity Testing Window

Define a sanity testing window for every deployment: a fixed period (typically 5–15 minutes) immediately after deployment completes, during which someone or something actively verifies the deployment.

If verification passes within that window: the deployment is confirmed and whoever was standing by for it can stand down.

If verification fails within that window: rollback is initiated while it's still fast.

If the window passes without verification: treat the deployment as unverified and increase monitoring sensitivity.

This structure prevents the common failure mode of "the deployment is done, everyone moves on, and a silent failure goes unnoticed for hours."
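The three outcomes above can be sketched as a small status function. This is a minimal illustration, not part of any tool: the function name, the status strings, and the 15-minute default are assumptions chosen to mirror the text.

```javascript
// Hypothetical helper: classify a deployment using the sanity testing window.
// `result` is the verification outcome recorded so far: 'passed', 'failed', or null.
function sanityWindowStatus({ deployedAt, now, windowMinutes = 15, result = null }) {
  const windowEndsAt = deployedAt.getTime() + windowMinutes * 60_000;

  if (result === 'passed') return 'confirmed'; // verification succeeded
  if (result === 'failed') return 'rollback';  // roll back while it is still fast

  // No result yet: still inside the window, or the window has lapsed.
  return now.getTime() <= windowEndsAt ? 'verifying' : 'unverified';
}

const deployedAt = new Date('2026-05-22T10:00:00Z');
const twentyMinutesLater = new Date('2026-05-22T10:20:00Z');
console.log(sanityWindowStatus({ deployedAt, now: twentyMinutesLater }));
```

An `unverified` status is the signal to raise monitoring sensitivity rather than assume the deployment is good.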

What to Verify

The content of your sanity test depends entirely on what the deployment changed. For each deployment, identify:

1. The changed features. If the deployment fixed a bug in the checkout flow, verify the checkout flow works.

2. The affected data. If a migration ran, verify the affected tables look correct and the application is reading from them correctly.

3. The touched integrations. If the deployment changed how you interact with a third-party API, verify that integration is working.

4. The performance characteristics. If a query was optimized, spot-check response times.

A sanity test for a deployment that fixed a coupon code bug might verify:

Apply a valid coupon at checkout — discount is correct
Apply an expired coupon — appropriate error message appears
Apply a coupon to an order below the minimum order amount — appropriate error message appears
Complete an order with a coupon — order total in the database reflects the discount
Coupon usage count increments correctly after redemption

This takes 5–10 minutes. It's focused on the change, not the entire application.
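One way to keep the four questions honest is to write the plan down as data before deploying. The sketch below is hypothetical: the category keys and the `summarizePlan` helper are illustrative names, and the check descriptions simply restate the coupon example.

```javascript
// Categories mirror the four questions: features, data, integrations, performance.
const CATEGORIES = ['features', 'data', 'integrations', 'performance'];

// Reports which categories have checks and which were left empty.
function summarizePlan(plan) {
  const covered = CATEGORIES.filter((c) => (plan[c] || []).length > 0);
  const empty = CATEGORIES.filter((c) => !covered.includes(c));
  const totalChecks = covered.reduce((n, c) => n + plan[c].length, 0);
  return { covered, empty, totalChecks };
}

// Hypothetical plan for the coupon bug fix described above.
const couponFixPlan = {
  features: [
    'valid coupon applies the correct discount',
    'expired coupon shows an error',
    'order below the minimum shows an error',
  ],
  data: [
    'order total in the database reflects the discount',
    'coupon usage count increments after redemption',
  ],
  integrations: [],
  performance: [],
};

console.log(summarizePlan(couponFixPlan));
```

An empty category is fine when the deployment did not touch that area; the point is that leaving it empty becomes a decision instead of an oversight.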

Structuring Your Sanity Checks

Automated Checks (Run Immediately)

Some verification can run automatically right after deployment:

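// tests/sanity/post-deploy.spec.js
// These run automatically after every production deploy.
// Assumes baseURL in the Playwright config points at production,
// so the relative /api/... paths below resolve against it.
const { test, expect } = require('@playwright/test');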

test('coupon API endpoint accepts valid codes', async ({ request }) => {
  const response = await request.post('/api/coupons/validate', {
    data: { code: process.env.SANITY_TEST_COUPON_CODE, cart_total: 100 }
  });
  expect(response.status()).toBe(200);
  const body = await response.json();
  expect(body.discount_type).toBe('percentage');
  expect(body.discount_value).toBe(10);
});

test('coupon API rejects expired codes', async ({ request }) => {
  const response = await request.post('/api/coupons/validate', {
    data: { code: process.env.SANITY_TEST_EXPIRED_COUPON, cart_total: 100 }
  });
  expect(response.status()).toBe(400);
  const body = await response.json();
  expect(body.error).toBe('coupon_expired');
});

These run in under a minute and give immediate signal.

Manual Spot Checks (Run Within 5 Minutes)

Some things are harder to automate correctly or are high-stakes enough that a human should verify them:

Critical financial transactions (verify with a test credit card)
Email sending (trigger an email and check delivery)
Complex UI interactions that depend on visual correctness
Admin flows used by your team

Create a deployment verification checklist that a developer or QA engineer walks through after high-risk deployments:

## Deployment Verification: Checkout Coupon Fix

Verify in production after deployment:

- [ ] Log in with test account
- [ ] Add item to cart (total > $50)
- [ ] Apply coupon code TESTDISCOUNT10
- [ ] Confirm 10% discount applied to subtotal
- [ ] Apply coupon code EXPIREDTEST
- [ ] Confirm "Coupon has expired" error message
- [ ] Complete checkout with TESTDISCOUNT10
- [ ] Verify order in admin shows discounted total
- [ ] Check Stripe dashboard — charge matches discounted total

Rollback if: any of the above fail, or if error rates spike in Datadog.

Monitoring-Based Verification (Ongoing)

Beyond active testing, watch your observability tools in the minutes after deployment:

Error rate on affected endpoints (should not spike)
Latency on affected endpoints (should not degrade)
Specific business metrics (e.g., successful checkout rate should hold steady)
Database query performance (migrations can cause unexpected slowdowns)

Set up a dashboard view that shows these metrics for the 30 minutes surrounding any production deployment.
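If your observability tool can export the same metrics for a window before and after the deployment, the comparison itself is simple. The following is a rough sketch: the metric field names and the tolerance values are assumptions for illustration and should be replaced with your own baselines.

```javascript
// Illustrative tolerances; tune these to your own service's normal variation.
const DEFAULT_TOLERANCE = {
  errorRateDelta: 0.005,  // absolute increase in error rate
  latencyRatio: 1.2,      // p95 latency may grow by at most 20%
  checkoutRateDrop: 0.02, // absolute drop in successful checkout rate
};

// Compares metric snapshots taken before and after a deployment.
function compareDeployMetrics(before, after, tolerance = DEFAULT_TOLERANCE) {
  const findings = [];
  if (after.errorRate - before.errorRate > tolerance.errorRateDelta) {
    findings.push('error rate spiked');
  }
  if (after.p95LatencyMs > before.p95LatencyMs * tolerance.latencyRatio) {
    findings.push('latency degraded');
  }
  if (before.checkoutSuccessRate - after.checkoutSuccessRate > tolerance.checkoutRateDrop) {
    findings.push('checkout success rate dropped');
  }
  return { healthy: findings.length === 0, findings };
}

const before = { errorRate: 0.002, p95LatencyMs: 300, checkoutSuccessRate: 0.97 };
const after = { errorRate: 0.02, p95LatencyMs: 500, checkoutSuccessRate: 0.9 };
console.log(compareDeployMetrics(before, after));
```

Database query timings could be added as a fourth comparison in the same way.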

Sanity Testing for Different Deployment Types

Bug Fixes

Focus on the specific bug that was fixed. Verify:

The bug no longer reproduces
The fix didn't break adjacent behavior
Edge cases around the bug still work correctly

Feature Flags

If the deployment enables a feature flag, verify:

The feature is accessible to users with the flag enabled
The feature is not accessible to users without the flag
The feature works correctly end-to-end
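The first two checks are two sides of the same gate, so it helps to evaluate them together. A minimal sketch, assuming you have already recorded what each test account could see (by logging in manually or through a browser test); the observation shape and the function name are hypothetical.

```javascript
// Each observation records one test account: whether the flag is enabled for it
// and whether the feature was actually visible in production.
function evaluateFlagGating(observations) {
  const problems = [];
  for (const { user, flagEnabled, featureVisible } of observations) {
    if (flagEnabled && !featureVisible) {
      problems.push(`${user}: flag on, feature missing`);
    }
    if (!flagEnabled && featureVisible) {
      problems.push(`${user}: flag off, feature exposed`);
    }
  }
  return { ok: problems.length === 0, problems };
}

console.log(evaluateFlagGating([
  { user: 'beta-tester', flagEnabled: true, featureVisible: true },
  { user: 'regular-user', flagEnabled: false, featureVisible: true },
]));
```

A feature exposed to unflagged users is usually the more urgent of the two failures, since it reaches people who were never meant to see it.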

Database Migrations

Migrations require careful sanity testing:

-- Before deployment: record row counts for affected tables
SELECT COUNT(*) FROM orders WHERE coupon_id IS NOT NULL;
-- Result: 4821

-- After deployment: verify data integrity
SELECT COUNT(*) FROM orders WHERE coupon_id IS NOT NULL;
-- Should still be 4821 (or higher if new orders came in, never lower)

-- Verify the migrated data looks correct
SELECT coupon_id, discount_amount FROM orders
WHERE coupon_id IS NOT NULL
LIMIT 10;
-- Spot-check that discount_amount values look right
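The before/after comparison can be automated once the counts are captured. This sketch assumes you store each count under a label of your choosing; the label `orders_with_coupon` and the helper name are made up for the example, and running the queries themselves is left to whatever database client you already use.

```javascript
// Flags any recorded count that is missing or lower after the migration.
// Counts may legitimately grow if new rows arrived during the deployment.
function compareRowCounts(before, after) {
  const problems = [];
  for (const [label, countBefore] of Object.entries(before)) {
    const countAfter = after[label];
    if (countAfter === undefined) {
      problems.push(`${label}: no post-deploy count recorded`);
    } else if (countAfter < countBefore) {
      problems.push(`${label}: dropped from ${countBefore} to ${countAfter}`);
    }
  }
  return { ok: problems.length === 0, problems };
}

console.log(compareRowCounts({ orders_with_coupon: 4821 }, { orders_with_coupon: 4800 }));
```

A passing count check only shows that rows were not lost; the spot-check of actual values is still needed to confirm they were migrated correctly.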

Infrastructure Changes

Infrastructure deployments (new nodes, updated configs, certificate rotations) require verifying:

Application can still connect to all dependencies (database, cache, message queue)
SSL certificates are valid and not expired
DNS resolution is working correctly
Load balancing is distributing traffic correctly
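The DNS and certificate checks are easy to script with Node's built-in `dns` and `tls` modules. The sketch below is one possible shape, not a complete infrastructure check: the function names are hypothetical, and dependency connectivity and load-balancer distribution are not covered.

```javascript
const tls = require('node:tls');
const dns = require('node:dns').promises;

// Pure helper: whole days between `now` and the certificate's expiry date.
function daysUntilExpiry(validTo, now = new Date()) {
  return Math.floor((new Date(validTo).getTime() - now.getTime()) / 86_400_000);
}

// Resolves the host, then reads the expiry of the certificate it serves.
// With default TLS settings an invalid or expired certificate rejects
// the promise instead of returning a result.
async function checkHost(host, port = 443) {
  const { address } = await dns.lookup(host);
  const validTo = await new Promise((resolve, reject) => {
    const socket = tls.connect({ host, port, servername: host }, () => {
      const cert = socket.getPeerCertificate();
      socket.end();
      resolve(cert.valid_to);
    });
    socket.on('error', reject);
  });
  return { host, address, daysLeft: daysUntilExpiry(validTo) };
}
```

Calling `checkHost` with your production hostname right after a certificate rotation gives a quick read on whether the new certificate is the one being served and how long it has left.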

Rollback Triggers

Define explicit criteria for rolling back before you deploy. Don't decide in the heat of the moment.

Roll back immediately if any of these is true:

Error rate on any core endpoint exceeds 1% for more than 2 minutes
The specific bug that was fixed is still reproducible
Data integrity issues are found (rows missing, values incorrect)
Payment processing is broken in any way

Monitor and decide within 15 minutes if:

Latency increased but is still within acceptable range
Error rate is slightly elevated but trending down
A non-critical feature is broken

No rollback needed if all of these hold:

All sanity checks pass
Error rates are within normal baseline
No user-reported issues
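Writing the criteria as code before deploying is one way to avoid deciding in the moment. The sketch below encodes the three tiers above; the signal names are hypothetical, and treating every situation that is neither a hard failure nor fully clean as "monitor" is an assumption of this example.

```javascript
// Returns 'rollback', 'monitor', or 'no-rollback' from a set of observed signals.
function rollbackDecision(signals) {
  const {
    minutesAboveOnePercentErrors = 0, // minutes a core endpoint spent above 1% errors
    bugStillReproduces = false,
    dataIntegrityIssue = false,
    paymentsBroken = false,
    sanityChecksPassed = false,
    errorRateWithinBaseline = false,
    userReportedIssues = false,
  } = signals;

  // Any hard failure means an immediate rollback.
  if (minutesAboveOnePercentErrors > 2 || bugStillReproduces || dataIntegrityIssue || paymentsBroken) {
    return 'rollback';
  }
  // Everything clean: the deployment stands.
  if (sanityChecksPassed && errorRateWithinBaseline && !userReportedIssues) {
    return 'no-rollback';
  }
  // Anything in between gets watched, with a decision due within 15 minutes.
  return 'monitor';
}

console.log(rollbackDecision({ sanityChecksPassed: true, errorRateWithinBaseline: false }));
```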

Using HelpMeTest for Post-Deployment Sanity

HelpMeTest supports running targeted test suites triggered by deployment events. After a production deployment completes, your CI/CD pipeline can trigger a specific sanity suite:

# Trigger sanity tests after deployment
curl -X POST https://api.helpmetest.com/v1/runs \
  -H "Authorization: Bearer $HELPMETEST_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "suite": "sanity-coupon-fix",
    "environment": "production",
    "notify": "slack:#deployments",
    "timeout": 300
  }'

Tests written in plain English are fast to update when the behavior changes. The example below assumes "Test Item" costs $100.00, so the 10% coupon shows as $10.00 off:

Navigate to the checkout page
Add product "Test Item" to cart
Enter coupon code "TESTDISCOUNT10" in the coupon field
Click "Apply Coupon"
Verify the discount amount shows "-$10.00"
Verify the order total is correct

The self-healing capabilities handle minor UI changes — if the coupon input field's selector changes, HelpMeTest updates it automatically rather than breaking the test.

Making Sanity Testing a Team Habit

The biggest challenge with post-deployment sanity testing isn't tooling — it's making it a habit that actually happens on every deployment.

Build it into your deployment process so it's not optional:

Deployment checklist. Every deployment has a written checklist. No deployment is "done" until the checklist is signed off.
Dedicated time. Block 10–15 minutes after every production deployment for verification. Don't schedule calls or context-switches during this window.
Ownership. The person who wrote the code is responsible for verifying it in production. This creates the right incentive.
Incident post-mortems. When something slips through, document what the sanity test should have caught and add it to the checklist.

Done consistently, post-deployment sanity testing changes the failure mode from "we found out from users something was broken" to "we caught it within 10 minutes and rolled back before anyone noticed."