A/B Testing Guide: How to Test Experiments Without Breaking Your Application
A/B testing (split testing) lets you run controlled experiments — showing variant A to one group and variant B to another — to measure which performs better. From an engineering perspective, the challenge is not the statistics but the testability: experiments must be deterministic, isolated, and not interfere with your automated test suite.
The Core Concepts
Experiment: A controlled change with defined variants (A = control, B = treatment).
Assignment: Deterministically mapping a user/session to a variant. Usually hash-based:
function assignVariant(userId, experimentKey, variants = ['control', 'treatment']) {
const hash = murmurhash(`${userId}:${experimentKey}`)
const index = hash % variants.length
return variants[index]
}Exposure: The moment a user sees the variant. Record this to calculate denominators.
Conversion: The event you're measuring (click, signup, purchase).
Testing Challenges
A/B testing introduces non-determinism into your application. The same user action can have different outcomes depending on which variant they're assigned. This breaks tests that don't control for variant assignment.
Solutions:
- Force a specific variant in tests via override API or cookie
- Mock the assignment function to always return a known variant
- Test each variant independently — one test file per variant
Testing with Forced Variants
Most A/B testing frameworks support forcing a variant via URL parameter, cookie, or API.
// GrowthBook: force variant via cookie
document.cookie = 'gb_force_exp_button_color=treatment'
// Optimizely: force variation via URL
// ?optimizely_x=123456
// Statsig: override in test
client.overrideGate('new_checkout', true)
client.overrideExperiment('button_color', { color: 'green' })Unit Testing Experiment Logic
// experiment.js
import { GrowthBook } from '@growthbook/growthbook'
export function getCheckoutButtonColor(gb) {
const result = gb.run({ key: 'button-color', variations: ['blue', 'green'] })
return result.value
}
// experiment.test.js
import { GrowthBook } from '@growthbook/growthbook'
import { getCheckoutButtonColor } from './experiment'
function createGrowthBook(forcedVariations = {}) {
const gb = new GrowthBook({ forcedVariations })
gb.setFeatures({ 'button-color': { defaultValue: 'blue' } })
return gb
}
test('returns blue for control variant', () => {
const gb = createGrowthBook({ 'button-color': 0 })
expect(getCheckoutButtonColor(gb)).toBe('blue')
})
test('returns green for treatment variant', () => {
const gb = createGrowthBook({ 'button-color': 1 })
expect(getCheckoutButtonColor(gb)).toBe('green')
})Integration Testing Both Variants
// checkout.test.js
describe('Checkout page', () => {
describe('control variant (blue button)', () => {
beforeEach(() => {
// Force control variant
cy.setCookie('gb_force_button-color', '0')
})
it('shows blue CTA button', () => {
cy.visit('/checkout')
cy.get('[data-testid="cta-button"]').should('have.css', 'background-color', 'rgb(0, 0, 255)')
})
it('completes purchase flow with blue button', () => {
cy.visit('/checkout')
cy.get('[data-testid="cta-button"]').click()
cy.url().should('include', '/confirmation')
})
})
describe('treatment variant (green button)', () => {
beforeEach(() => {
cy.setCookie('gb_force_button-color', '1')
})
it('shows green CTA button', () => {
cy.visit('/checkout')
cy.get('[data-testid="cta-button"]').should('have.css', 'background-color', 'rgb(0, 255, 0)')
})
it('completes purchase flow with green button', () => {
cy.visit('/checkout')
cy.get('[data-testid="cta-button"]').click()
cy.url().should('include', '/confirmation')
})
})
})Testing Experiment Exposure Tracking
Experiments are useless without exposure tracking. Test that exposures are recorded correctly.
test('records exposure when experiment is evaluated', async () => {
const trackMock = vi.fn()
const gb = new GrowthBook({
trackingCallback: trackMock,
forcedVariations: { 'button-color': 1 }
})
renderCheckout({ gb })
expect(trackMock).toHaveBeenCalledWith(
expect.objectContaining({ key: 'button-color' }),
expect.objectContaining({ value: 'green' })
)
})Testing Statistical Significance
For server-side A/B tests, validate that your significance calculation is correct:
// stats.js
export function calculateZScore(controlConversions, controlVisitors, treatmentConversions, treatmentVisitors) {
const p1 = controlConversions / controlVisitors
const p2 = treatmentConversions / treatmentVisitors
const p = (controlConversions + treatmentConversions) / (controlVisitors + treatmentVisitors)
const se = Math.sqrt(p * (1 - p) * (1/controlVisitors + 1/treatmentVisitors))
return (p2 - p1) / se
}
// stats.test.js
test('calculates correct z-score for significant result', () => {
// 5% vs 7% conversion rate, 1000 visitors each
const z = calculateZScore(50, 1000, 70, 1000)
expect(z).toBeCloseTo(1.96, 1) // ~95% confidence
})
test('calculates correct z-score for non-significant result', () => {
const z = calculateZScore(50, 100, 52, 100)
expect(Math.abs(z)).toBeLessThan(1.65) // below 90% confidence
})Preventing Experiment Interference
Multiple simultaneous experiments can interfere. Test for:
test('user in experiment A does not bleed into experiment B', () => {
const userId = 'user-123'
const expA = assignVariant(userId, 'experiment-a')
const expB = assignVariant(userId, 'experiment-b')
// Assignments should be independent
// Run 1000 users and verify distribution in each experiment
const results = { 'experiment-a': {}, 'experiment-b': {} }
for (let i = 0; i < 1000; i++) {
const id = `user-${i}`
const a = assignVariant(id, 'experiment-a')
const b = assignVariant(id, 'experiment-b')
results['experiment-a'][a] = (results['experiment-a'][a] || 0) + 1
results['experiment-b'][b] = (results['experiment-b'][b] || 0) + 1
}
// Both experiments should have ~50/50 split
const aRatio = results['experiment-a']['control'] / 1000
const bRatio = results['experiment-b']['control'] / 1000
expect(aRatio).toBeCloseTo(0.5, 1)
expect(bRatio).toBeCloseTo(0.5, 1)
})CI/CD Considerations
- Never run A/B tests in CI without explicit variant forcing — flaky tests result
- Add a
TEST_VARIANT_OVERRIDEenv var or cookie mechanism to force variants - Document which variant each test targets at the top of the test file
Summary
The key engineering principle: A/B tests must be deterministic in your test suite. Force variants explicitly in every test. Test each variant as a separate scenario. Verify exposure tracking. The statistics layer is secondary — the application behavior under each variant is what your automated tests protect.