Flagsmith SDK Testing: Unit, Integration, and E2E Patterns

Flagsmith is an open-source feature flag and remote configuration platform supporting both cloud and self-hosted deployments. Testing Flagsmith integrations well means covering the SDK initialization, flag evaluation, identity trait propagation, and remote config retrieval — at every layer of your test pyramid. This guide walks through concrete patterns for each.

Flagsmith's Testing Surface

Flagsmith adds these testable behaviors to your application:

  • Flag checks: isFeatureEnabled('my-flag') returns true or false
  • Remote config: getFeatureValue('button-color') returns a string/JSON value
  • Identity traits: user attributes that influence flag targeting
  • Multi-environment: development, staging, production environments with different flag states
  • SDK initialization: synchronous vs. asynchronous initialization and fallback behavior

Each deserves dedicated tests.

Unit Testing with Mocked Flagsmith SDK

Python: Mocking the Flagsmith Client

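from unittest.mock import MagicMock, patch

import pytest

# Assumed location of the service, matching the patch target used below
from myapp.services import DashboardService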

@pytest.fixture
def mock_flagsmith():
    with patch('myapp.services.flagsmith_client') as mock_client:
        mock_flags = MagicMock()
        mock_client.get_environment_flags.return_value = mock_flags
        yield mock_flags

def test_shows_beta_dashboard_when_flag_enabled(mock_flagsmith):
    mock_flagsmith.is_feature_enabled.return_value = True
    
    service = DashboardService()
    result = service.get_dashboard_type()
    
    assert result == 'beta-dashboard'

def test_shows_standard_dashboard_when_flag_disabled(mock_flagsmith):
    mock_flagsmith.is_feature_enabled.return_value = False
    
    service = DashboardService()
    result = service.get_dashboard_type()
    
    assert result == 'standard-dashboard'
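
The tests above exercise a DashboardService that the article does not show. A minimal sketch of what such a service might look like, assuming a module-level `flagsmith_client` and a flag named `beta-dashboard` (both hypothetical names chosen to match the tests):

```python
from unittest.mock import MagicMock

# Stand-in for the real client so the sketch runs without network access.
# In an application this would be a configured Flagsmith client instance.
flagsmith_client = MagicMock()


class DashboardService:
    """Hypothetical service under test; the flag name is an assumption."""

    FLAG_NAME = "beta-dashboard"

    def get_dashboard_type(self) -> str:
        # Fetch environment-level flags, then branch on a single flag.
        flags = flagsmith_client.get_environment_flags()
        if flags.is_feature_enabled(self.FLAG_NAME):
            return "beta-dashboard"
        return "standard-dashboard"
```

Because the method looks up `flagsmith_client` as a module global at call time, patching `myapp.services.flagsmith_client` in the fixture is enough to swap it out for each test.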

JavaScript: Mocking the Flagsmith Module

jest.mock('flagsmith', () => ({
  init: jest.fn(),
  hasFeature: jest.fn(),
  getValue: jest.fn(),
  identify: jest.fn(),
}));

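import flagsmith from 'flagsmith';
// Assumed module path for the class under test; adjust to your project layout
import { FeatureService } from './FeatureService';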

describe('FeatureService', () => {
  beforeEach(() => {
    jest.clearAllMocks();
  });

  it('enables dark mode when flag is on', async () => {
    flagsmith.hasFeature.mockReturnValue(true);

    const service = new FeatureService();
    const theme = service.getTheme();

    expect(theme).toBe('dark');
    expect(flagsmith.hasFeature).toHaveBeenCalledWith('dark-mode');
  });

  it('defaults to light mode when flag is off', () => {
    flagsmith.hasFeature.mockReturnValue(false);

    const service = new FeatureService();
    const theme = service.getTheme();

    expect(theme).toBe('light');
  });
});

Always test both variations — flag on and flag off. Missing either means half the toggle lifecycle is untested.
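
One way to make the on/off pairing hard to forget is to drive both states from a single table of cases. The sketch below is an illustration only: `get_theme` and the `dark-mode` flag name are hypothetical stand-ins for whatever code branches on the flag.

```python
from unittest.mock import MagicMock


def get_theme(flags) -> str:
    """Hypothetical function under test: picks a theme from one flag."""
    return "dark" if flags.is_feature_enabled("dark-mode") else "light"


# Each row pairs a flag state with the behavior expected in that state.
CASES = [(True, "dark"), (False, "light")]


def check_case(flag_enabled: bool, expected_theme: str) -> None:
    flags = MagicMock()
    flags.is_feature_enabled.return_value = flag_enabled

    assert get_theme(flags) == expected_theme
    flags.is_feature_enabled.assert_called_once_with("dark-mode")


def test_theme_follows_flag():
    for flag_enabled, expected_theme in CASES:
        check_case(flag_enabled, expected_theme)
```

With pytest, the same `CASES` list can be passed to `pytest.mark.parametrize` so each flag state is reported as its own test.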

Testing Remote Configuration

Remote config retrieval needs tests for both the success path and degraded states.

Testing Config Values

def test_button_text_from_remote_config(mock_flagsmith):
    mock_flagsmith.get_feature_value.return_value = 'Get Started Free'
    
    component = CTAComponent()
    text = component.get_button_text()
    
    assert text == 'Get Started Free'
    mock_flagsmith.get_feature_value.assert_called_once_with('cta-button-text')

def test_button_text_falls_back_to_default_when_config_missing(mock_flagsmith):
    mock_flagsmith.get_feature_value.return_value = None
    
    component = CTAComponent()
    text = component.get_button_text()
    
    assert text == 'Sign Up'  # default value
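
The fallback test implies that the component treats a missing value as "not configured". A possible shape for that logic is sketched below, using constructor injection instead of the patched module global; `CTAComponent`'s constructor signature and the `DEFAULT_BUTTON_TEXT` name are assumptions for illustration.

```python
from unittest.mock import MagicMock

DEFAULT_BUTTON_TEXT = "Sign Up"


class CTAComponent:
    """Hypothetical component; receives the flags object as a dependency."""

    def __init__(self, flags):
        self._flags = flags

    def get_button_text(self) -> str:
        value = self._flags.get_feature_value("cta-button-text")
        # Treat None, non-string values, and blank strings as "not configured".
        if isinstance(value, str) and value.strip():
            return value
        return DEFAULT_BUTTON_TEXT


# Usage with a mocked flags object:
flags = MagicMock()
flags.get_feature_value.return_value = None
fallback_text = CTAComponent(flags).get_button_text()
```

Checking for blank strings as well as None is a design choice: an empty value saved by accident in the dashboard then degrades to the default instead of rendering an empty button.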

Testing JSON Remote Config

Flagsmith supports JSON values in remote config. Test JSON parsing:

it('parses JSON remote config correctly', () => {
  flagsmith.getValue.mockReturnValue(JSON.stringify({
    primaryColor: '#5aff28',
    secondaryColor: '#000000',
    buttonRadius: 8,
  }));

  const theme = ThemeService.loadFromFlagsmith();

  expect(theme.primaryColor).toBe('#5aff28');
  expect(theme.buttonRadius).toBe(8);
});

it('handles malformed JSON gracefully', () => {
  flagsmith.getValue.mockReturnValue('{invalid json}');

  const theme = ThemeService.loadFromFlagsmith();

  // Should return safe defaults, not throw
  expect(theme).toEqual(ThemeService.defaults);
});
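
The same "parse or fall back" behavior can be sketched on the Python side. This is an illustrative helper, not part of any SDK; `load_theme` and the values in `THEME_DEFAULTS` are hypothetical.

```python
import json

# Hypothetical defaults; real values depend on your design system.
THEME_DEFAULTS = {
    "primaryColor": "#000000",
    "secondaryColor": "#ffffff",
    "buttonRadius": 4,
}


def load_theme(raw_value) -> dict:
    """Parse a JSON remote config value, falling back to defaults."""
    try:
        parsed = json.loads(raw_value)
    except (TypeError, ValueError):
        # TypeError: the value was None or another non-string type.
        # ValueError: malformed JSON (json.JSONDecodeError subclasses it).
        return dict(THEME_DEFAULTS)
    if not isinstance(parsed, dict):
        # Valid JSON, but not the object shape the theme expects.
        return dict(THEME_DEFAULTS)
    # Overlay configured keys on the defaults so missing keys stay safe.
    return {**THEME_DEFAULTS, **parsed}
```

Merging over the defaults means a config that sets only `primaryColor` still yields a complete theme.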

Testing Identity and Trait Propagation

Flagsmith's identity-based targeting depends on traits being passed correctly to the SDK. If traits aren't propagated, targeting rules silently fail.

describe('Identity propagation', () => {
  it('identifies user before checking flags', async () => {
    const user = { id: 'user-123', plan: 'pro', country: 'US' };

    await FeatureService.initForUser(user);

    expect(flagsmith.identify).toHaveBeenCalledWith('user-123', {
      plan: 'pro',
      country: 'US',
    });
  });

  it('checks flags after identification', async () => {
    const callOrder = [];
    flagsmith.identify.mockImplementation(() => callOrder.push('identify'));
    flagsmith.hasFeature.mockImplementation(() => {
      callOrder.push('hasFeature');
      return true;
    });

    await FeatureService.initForUser({ id: 'user-123' });
    FeatureService.isProDashboardEnabled();

    expect(callOrder).toEqual(['identify', 'hasFeature']);
  });
});
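
A Python counterpart to the trait test might look like the sketch below. It assumes the Python SDK exposes `get_identity_flags(identifier, traits=...)`; verify that signature against the SDK version you use. `flags_for_user` is a hypothetical helper.

```python
from unittest.mock import MagicMock


def flags_for_user(client, user: dict):
    """Request identity flags, forwarding every attribute except id as a trait."""
    traits = {key: value for key, value in user.items() if key != "id"}
    return client.get_identity_flags(user["id"], traits=traits)


def test_traits_are_forwarded():
    client = MagicMock()
    user = {"id": "user-123", "plan": "pro", "country": "US"}

    flags_for_user(client, user)

    # The assertion fails if a trait is dropped or renamed on the way to the SDK.
    client.get_identity_flags.assert_called_once_with(
        "user-123", traits={"plan": "pro", "country": "US"}
    )
```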

Integration Testing Against a Local Flagsmith Instance

For integration tests, run Flagsmith locally:

# docker-compose.test.yml
services:
  flagsmith-api:
    image: flagsmith/flagsmith:latest
    ports:
      - "8000:8000"
    environment:
      DJANGO_ALLOWED_HOSTS: "*"
      DATABASE_URL: postgres://flagsmith:password@db/flagsmith
    depends_on:
      - db  # start Postgres before the API container

  db:
    image: postgres:15
    environment:
      POSTGRES_DB: flagsmith
      POSTGRES_USER: flagsmith
      POSTGRES_PASSWORD: password
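
Containers take a while to accept requests, so integration tests usually need to wait before talking to the API. A small polling helper is sketched below; `wait_until_ready` is a hypothetical name, and the readiness check is passed in so the helper makes no assumption about which endpoint you probe.

```python
import time


def wait_until_ready(check, timeout=60.0, interval=1.0,
                     sleep=time.sleep, clock=time.monotonic) -> bool:
    """Poll check() until it returns a truthy value or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        try:
            if check():
                return True
        except Exception:
            # Connection errors are expected while the container boots.
            pass
        if clock() >= deadline:
            return False
        sleep(interval)
```

A typical call might pass something like `lambda: requests.get("http://localhost:8000/health").ok`, assuming your Flagsmith version serves a health endpoint at that path.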

Seeding Test Flags

# tests/conftest.py
import pytest
import requests

FLAGSMITH_API = "http://localhost:8000/api/v1"
ADMIN_KEY = "test-admin-key"  # placeholder; use a token valid for your local instance
HEADERS = {"Authorization": f"Token {ADMIN_KEY}"}

@pytest.fixture(scope='session')
def setup_test_flags():
    # Create test project (assumes organisation 1 already exists)
    response = requests.post(
        f"{FLAGSMITH_API}/projects/",
        headers=HEADERS,
        json={"name": "Test Project", "organisation": 1},
        timeout=10,
    )
    response.raise_for_status()  # fail fast instead of hitting a KeyError below
    project = response.json()

    # Create flag
    response = requests.post(
        f"{FLAGSMITH_API}/projects/{project['id']}/features/",
        headers=HEADERS,
        json={"name": "new-checkout", "initial_value": "false", "default_enabled": False},
        timeout=10,
    )
    response.raise_for_status()

    yield project['id']

Testing SDK Initialization Edge Cases

Initialization failures are often the source of production incidents. Test them explicitly.

Testing Async Initialization

describe('SDK initialization', () => {
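  it('handles slow network gracefully', () => {
    // A promise that never settles simulates a stalled request
    // without leaving a real 5-second timer running after the test.
    flagsmith.init.mockImplementation(() => new Promise(() => {}));

    const service = new FeatureService();
    service.init(); // intentionally not awaited: initialization is still in flight

    // Should not throw while waiting
    const theme = service.getTheme(); // returns default before init completes
    expect(theme).toBe('light'); // safe default
  });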

  it('handles network failure gracefully', async () => {
    flagsmith.init.mockRejectedValue(new Error('Network error'));

    const service = new FeatureService();
    await service.init(); // should catch the error

    // Should still work with defaults
    const isEnabled = service.isFeatureEnabled('dark-mode');
    expect(isEnabled).toBe(false); // safe default
  });
});
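
The failure test above expects the service to swallow initialization errors and answer from defaults. A Python sketch of that wrapper follows; `FeatureService`, its constructor, and `SAFE_DEFAULTS` are hypothetical, and `get_environment_flags` is the client call used in the earlier Python examples.

```python
import logging
from unittest.mock import MagicMock

logger = logging.getLogger(__name__)

# Hypothetical per-flag defaults used whenever Flagsmith is unreachable.
SAFE_DEFAULTS = {"dark-mode": False}


class FeatureService:
    """Hypothetical wrapper that never lets a flag lookup raise."""

    def __init__(self, client):
        self._client = client
        self._flags = None  # populated by init(); None means "use defaults"

    def init(self) -> None:
        try:
            self._flags = self._client.get_environment_flags()
        except Exception:
            logger.warning("Flagsmith unavailable; using defaults", exc_info=True)
            self._flags = None

    def is_feature_enabled(self, name: str) -> bool:
        if self._flags is None:
            return SAFE_DEFAULTS.get(name, False)
        return bool(self._flags.is_feature_enabled(name))


# Usage: a client whose fetch fails still yields a working service.
failing_client = MagicMock()
failing_client.get_environment_flags.side_effect = ConnectionError("Network error")
service = FeatureService(failing_client)
service.init()
```

Logging the failure matters as much as catching it: a service that silently runs on defaults can hide an outage for a long time.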

Testing Cache Behavior

it('uses cached flags when SDK is offline', async () => {
  // Simulate cached state from previous session.
  // The flagsmith module is mocked, so FeatureService itself must read this key.
  localStorage.setItem('flagsmith', JSON.stringify({
    flags: { 'dark-mode': { enabled: true, value: null } }
  }));

  // SDK fails to connect
  flagsmith.init.mockRejectedValue(new Error('Offline'));

  const service = new FeatureService();
  await service.init(); // should catch the error and fall back to the cache
  expect(service.isFeatureEnabled('dark-mode')).toBe(true); // uses cache
});

End-to-End Testing with HelpMeTest

E2E tests validate that flag changes reach users correctly. HelpMeTest's browser-based testing works well here:

As a logged-in pro user
Navigate to /dashboard
Verify the pro features panel is visible
Click on advanced analytics
Verify the chart loads with data

Set up parallel test runs, one with the flag enabled and one with it disabled, to verify both paths work before each release.

Health check pattern: run your critical E2E flows on a 5-minute schedule. If a Flagsmith API outage causes flag evaluation to fail, your defaults take over. The scheduled run then shows whether the application is still usable on those defaults or whether the outage needs a monitoring alert.

CI/CD Integration

Per-Environment SDK Keys

Store SDK keys per environment in CI secrets:

env:
  FLAGSMITH_ENVIRONMENT_KEY: ${{ secrets.FLAGSMITH_CI_KEY }}

Use a dedicated CI environment in Flagsmith where flags are set to known states.
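
On the application side, it helps to fail fast when the key is missing instead of starting with an empty one. A minimal sketch, assuming the `FLAGSMITH_ENVIRONMENT_KEY` variable from the snippet above; `load_environment_key` is a hypothetical helper.

```python
import os


def load_environment_key(environ=os.environ) -> str:
    """Return the Flagsmith environment key, or raise if it is not configured."""
    key = environ.get("FLAGSMITH_ENVIRONMENT_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FLAGSMITH_ENVIRONMENT_KEY is not set; "
            "configure it in CI secrets or your local environment."
        )
    return key
```

Accepting the mapping as a parameter keeps the helper easy to test without mutating the real process environment.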

Flag State Validation Script

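# scripts/validate-flags.py
"""Ensure all flags referenced in code exist in Flagsmith"""
import ast
import glob
import os
import sys

import requests

FLAG_METHODS = ('is_feature_enabled', 'get_feature_value')

def extract_flag_refs(directory):
    flags = set()
    for path in glob.glob(f"{directory}/**/*.py", recursive=True):
        with open(path, encoding="utf-8") as f:
            tree = ast.parse(f.read(), filename=path)
        for node in ast.walk(tree):
            if not isinstance(node, ast.Call):
                continue
            if not (isinstance(node.func, ast.Attribute) and node.func.attr in FLAG_METHODS):
                continue
            # Only string literals can be checked statically; skip variables and expressions
            first_arg = node.args[0] if node.args else None
            if isinstance(first_arg, ast.Constant) and isinstance(first_arg.value, str):
                flags.add(first_arg.value)
    return flags

def get_flagsmith_flags(env_key):
    response = requests.get(
        "https://edge.api.flagsmith.com/api/v1/flags/",
        headers={"X-Environment-Key": env_key},
        timeout=10,
    )
    response.raise_for_status()
    return {f['feature']['name'] for f in response.json()}

if __name__ == "__main__":
    # Read the key from the environment (set from CI secrets) instead of an undefined constant
    env_key = os.environ["FLAGSMITH_ENVIRONMENT_KEY"]
    code_flags = extract_flag_refs('./src')
    flagsmith_flags = get_flagsmith_flags(env_key)
    missing = code_flags - flagsmith_flags

    if missing:
        print(f"ERROR: Flags in code but not in Flagsmith: {sorted(missing)}")
        sys.exit(1)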

Multi-Environment Testing Strategy

Flagsmith shines when you have distinct flag configurations per environment. Structure your testing accordingly:

Environment   Flagsmith Env   Flag Strategy
Local dev     development     All flags on
CI            ci              Explicit per-test
Staging       staging         Mirrors planned prod state
Production    production      Gradual rollout

Test staging before every release with the exact flag state you intend to use in production.

Summary

Testing Flagsmith integrations effectively means: mocking the SDK for unit tests covering both flag states, testing remote config retrieval including JSON parsing and fallback, explicitly testing identity and trait propagation, running integration tests against a local Flagsmith instance with seeded state, covering SDK initialization failure scenarios, and monitoring E2E flows in production. Build validation scripts into CI to catch missing flag definitions before they reach users.
