OpenFeature Specification Testing: Vendor-Neutral Feature Flag Quality

OpenFeature is a CNCF specification that standardizes how applications interact with feature flag systems. Instead of coupling your code to LaunchDarkly, Unleash, or Flagsmith APIs, you code against the OpenFeature API — and swap providers without changing application code. But this abstraction layer introduces new testing requirements. You must test the OpenFeature API layer, your provider implementation, hooks, and context propagation — not just flag values.

Why OpenFeature Changes Your Testing Approach

With a direct SDK integration, you mock one specific client. With OpenFeature, you have a layered architecture:

Application Code
      ↓
OpenFeature Client (vendor-neutral API)
      ↓
Provider (LaunchDarkly / Unleash / Flagsmith adapter)
      ↓
Flag Management System

Each layer needs testing:

  1. Application layer — does your code respond correctly to flag values?
  2. Provider layer — does your provider implementation correctly translate OpenFeature API calls to vendor SDK calls?
  3. Hook layer — do your hooks run in the right order with correct data?
  4. Context layer — does evaluation context propagate correctly through the chain?
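For the application layer, one option is to have the code under test depend on a narrow flag-reading interface so a stub can drive both branches. The sketch below is illustrative only: `FlagReader`, `checkoutPath`, and `stubFlags` are hypothetical names, not part of the OpenFeature SDK, and the single method merely mirrors the shape of the client's `getBooleanValue`.

```typescript
// Hypothetical narrow interface: mirrors the shape of the OpenFeature
// client's getBooleanValue, but is NOT imported from the SDK.
interface FlagReader {
  getBooleanValue(flagKey: string, defaultValue: boolean): Promise<boolean>;
}

// Application code under test: picks a route based on the flag.
async function checkoutPath(flags: FlagReader): Promise<string> {
  const useNew = await flags.getBooleanValue('new-checkout', false);
  return useNew ? '/checkout/v2' : '/checkout';
}

// Test stub: returns fixed values, falling back to the caller's default.
function stubFlags(values: Record<string, boolean>): FlagReader {
  return {
    async getBooleanValue(flagKey: string, defaultValue: boolean): Promise<boolean> {
      return values[flagKey] ?? defaultValue;
    },
  };
}
```

Calling `checkoutPath(stubFlags({ 'new-checkout': true }))` exercises the flag-on branch, and passing an empty stub exercises the default branch, without any provider involved.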

Unit Testing with the OpenFeature In-Memory Provider

OpenFeature's official SDKs include an in-memory provider for testing:

JavaScript/TypeScript

import { OpenFeature, InMemoryProvider } from '@openfeature/server-sdk';

describe('Feature flag evaluation', () => {
  beforeAll(async () => {
    const provider = new InMemoryProvider({
      'new-checkout': {
        defaultVariant: 'off',
        variants: {
          on: true,
          off: false,
        },
        disabled: false,
      },
    });

    await OpenFeature.setProviderAndWait(provider);
  });

  it('evaluates boolean flag', async () => {
    const client = OpenFeature.getClient();
    const value = await client.getBooleanValue('new-checkout', false);
    expect(value).toBe(false);
  });
});

Java SDK

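import java.util.Map;

import dev.openfeature.sdk.*;
// The in-memory provider and its Flag type ship with the core SDK
import dev.openfeature.sdk.providers.memory.Flag;
import dev.openfeature.sdk.providers.memory.InMemoryProvider;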

@BeforeEach
void setUp() {
    Map<String, Flag<?>> flags = Map.of(
        "new-checkout", Flag.<Boolean>builder()
            .variant("on", true)
            .variant("off", false)
            .defaultVariant("off")
            .build()
    );
    InMemoryProvider provider = new InMemoryProvider(flags);
    // Block until the provider is initialized so the first test cannot race it
    OpenFeatureAPI.getInstance().setProviderAndWait(provider);
}

@Test
void testBooleanFlagDefaultFalse() {
    Client client = OpenFeatureAPI.getInstance().getClient();
    boolean result = client.getBooleanValue("new-checkout", false);
    assertFalse(result);
}

The in-memory provider gives you full control over flag state without network connections.

Testing All Flag Types

OpenFeature supports boolean, string, number, and object flags. Test all of them.

describe('All flag types', () => {
  beforeAll(async () => {
    const flags = {
      'dark-mode': {
        defaultVariant: 'enabled',
        variants: { enabled: true, disabled: false },
      },
      'button-color': {
        defaultVariant: 'primary',
        variants: { primary: '#5aff28', secondary: '#ffffff' },
      },
      'max-items': {
        defaultVariant: 'standard',
        variants: { standard: 10, premium: 50 },
      },
      'theme-config': {
        defaultVariant: 'default',
        variants: {
          default: { borderRadius: 8, fontScale: 1.0 },
          compact: { borderRadius: 4, fontScale: 0.9 },
        },
      },
    };

    await OpenFeature.setProviderAndWait(new InMemoryProvider(flags));
  });

  it('evaluates boolean flag', async () => {
    const client = OpenFeature.getClient();
    expect(await client.getBooleanValue('dark-mode', false)).toBe(true);
  });

  it('evaluates string flag', async () => {
    const client = OpenFeature.getClient();
    expect(await client.getStringValue('button-color', '#000000')).toBe('#5aff28');
  });

  it('evaluates number flag', async () => {
    const client = OpenFeature.getClient();
    expect(await client.getNumberValue('max-items', 5)).toBe(10);
  });

  it('evaluates object flag', async () => {
    const client = OpenFeature.getClient();
    const config = await client.getObjectValue('theme-config', {});
    expect(config).toMatchObject({ borderRadius: 8, fontScale: 1.0 });
  });
});

Testing Evaluation Details

OpenFeature's evaluation detail methods return rich metadata about how a flag was resolved. Test this for observability and debugging.

it('returns evaluation details with reason and variant', async () => {
  const client = OpenFeature.getClient();
  const details = await client.getBooleanDetails('dark-mode', false);

  expect(details.value).toBe(true);
  expect(details.variant).toBe('enabled');
  expect(details.reason).toBe('STATIC'); // InMemoryProvider returns STATIC
  expect(details.flagKey).toBe('dark-mode');
});

it('returns ERROR reason for unknown flag', async () => {
  const client = OpenFeature.getClient();
  const details = await client.getBooleanDetails('nonexistent-flag', false);

  expect(details.reason).toBe('ERROR');
  expect(details.errorCode).toBe('FLAG_NOT_FOUND');
  expect(details.value).toBe(false); // returns default
});
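To make the observability angle concrete, here is a minimal sketch of a helper that turns evaluation details into a log line. `BooleanDetails` is a local stand-in that mirrors only the fields asserted in the tests above, and `describeEvaluation` is a hypothetical helper, not an SDK function.

```typescript
// Local type mirroring only the detail fields used in the tests above;
// it is a stand-in, not the SDK's EvaluationDetails type.
interface BooleanDetails {
  flagKey: string;
  value: boolean;
  variant?: string;
  reason?: string;
  errorCode?: string;
}

// Hypothetical helper: flattens details into a single log line.
function describeEvaluation(d: BooleanDetails): string {
  const outcome = d.errorCode
    ? `fell back to default (${d.errorCode})`
    : `resolved variant "${d.variant ?? 'unknown'}"`;
  return `${d.flagKey}=${d.value} ${outcome} [reason=${d.reason ?? 'UNKNOWN'}]`;
}
```

Asserting on a string like this in tests is one way to catch a provider that silently stops reporting variants or reasons.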

Testing Custom Provider Implementations

If you're writing an OpenFeature provider for an internal flag system, test it against the behavior the OpenFeature specification requires of providers: metadata, typed flag resolution, error codes, and lifecycle events.

Provider Conformance Test Structure

// my-provider.spec.ts
import { MyProvider } from './my-provider';
import { OpenFeature, ProviderEvents } from '@openfeature/server-sdk';

describe('MyProvider conformance', () => {
  let provider: MyProvider;

  beforeEach(async () => {
    provider = new MyProvider({ serverUrl: 'http://localhost:9001' });
    await OpenFeature.setProviderAndWait(provider);
  });

  afterEach(async () => {
    await OpenFeature.clearProviders();
  });

  it('implements provider metadata', () => {
    expect(provider.metadata.name).toBeDefined();
    expect(typeof provider.metadata.name).toBe('string');
  });

  it('resolves boolean flags', async () => {
    const client = OpenFeature.getClient();
    const result = await client.getBooleanValue('test-flag', false);
    expect(typeof result).toBe('boolean');
  });

  it('returns FLAG_NOT_FOUND error for unknown flags', async () => {
    const client = OpenFeature.getClient();
    const details = await client.getBooleanDetails('unknown-flag-xyz', false);
    expect(details.errorCode).toBe('FLAG_NOT_FOUND');
  });

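  it('emits READY event on successful initialization', async () => {
    const events: ProviderEvents[] = [];
    OpenFeature.addHandler(ProviderEvents.Ready, () => events.push(ProviderEvents.Ready));

    // Register a fresh provider through the SDK: the SDK emits READY once
    // initialize() resolves, so calling provider.initialize() directly would bypass it.
    await OpenFeature.setProviderAndWait(new MyProvider({ serverUrl: 'http://localhost:9001' }));
    expect(events).toContain(ProviderEvents.Ready);
  });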

  it('emits ERROR event on initialization failure', async () => {
    const failingProvider = new MyProvider({ serverUrl: 'http://nonexistent:9001' });
    const events: ProviderEvents[] = [];

    OpenFeature.addHandler(ProviderEvents.Error, () => events.push(ProviderEvents.Error));
    
    try {
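      // setProviderAndWait rejects when initialization fails; any connection
      // timeout belongs in the provider's own configuration
      await OpenFeature.setProviderAndWait(failingProvider);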
    } catch {
      // expected
    }

    expect(events).toContain(ProviderEvents.Error);
  });
});

Testing Hooks

OpenFeature hooks let you add cross-cutting concerns to flag evaluation — logging, telemetry, error reporting. Hooks run at four lifecycle points: before, after, error, and finally.

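import { OpenFeature } from '@openfeature/server-sdk';
import type { BeforeHookContext, EvaluationDetails, Hook, HookContext, Provider } from '@openfeature/server-sdk';

// Records each lifecycle stage so tests can assert on call order
class AuditHook implements Hook {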
  readonly calls: string[] = [];

  before(hookContext: BeforeHookContext): void {
    this.calls.push(`before:${hookContext.flagKey}`);
  }

  after(hookContext: HookContext, details: EvaluationDetails<boolean>): void {
    this.calls.push(`after:${hookContext.flagKey}:${details.value}`);
  }

  error(hookContext: HookContext, error: unknown): void {
    this.calls.push(`error:${hookContext.flagKey}`);
  }
}

describe('Hook lifecycle', () => {
  it('calls hooks in correct order', async () => {
    const hook = new AuditHook();
    const client = OpenFeature.getClient();
    client.addHooks(hook);

    await client.getBooleanValue('dark-mode', false);

    expect(hook.calls).toEqual([
      'before:dark-mode',
      'after:dark-mode:true',
    ]);
  });

  it('calls error hook on provider failure', async () => {
    const hook = new AuditHook();
    const failingProvider = {
      metadata: { name: 'failing' },
      resolveBooleanEvaluation: () => { throw new Error('Provider error'); },
    } as unknown as Provider;

    await OpenFeature.setProviderAndWait(failingProvider);
    const client = OpenFeature.getClient();
    client.addHooks(hook);

    await client.getBooleanValue('any-flag', false);

    expect(hook.calls).toContain('error:any-flag');
  });
});
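The `AuditHook` above records three of the four stages. As a rough model of how all four relate, the sketch below runs a toy evaluation loop in which `finally` fires on both the success and the failure path. `StageHook`, `evaluateWithHook`, and `recordingHook` are hypothetical names; this is a simplified illustration, not the SDK's implementation.

```typescript
// Simplified stand-in for a hook: all four stages are optional.
interface StageHook {
  before?(flagKey: string): void;
  after?(flagKey: string, value: boolean): void;
  error?(flagKey: string, err: unknown): void;
  finally?(flagKey: string): void;
}

// Toy evaluation loop (not the SDK's implementation): runs the stages
// around a resolver and returns the default value when anything throws.
function evaluateWithHook(
  flagKey: string,
  defaultValue: boolean,
  resolve: (flagKey: string) => boolean,
  hook: StageHook,
): boolean {
  try {
    hook.before?.(flagKey);
    const value = resolve(flagKey);
    hook.after?.(flagKey, value);
    return value;
  } catch (err) {
    hook.error?.(flagKey, err);
    return defaultValue;
  } finally {
    hook.finally?.(flagKey);
  }
}

// Records the stage names so a test can assert on ordering.
function recordingHook(calls: string[]): StageHook {
  return {
    before: () => { calls.push('before'); },
    after: () => { calls.push('after'); },
    error: () => { calls.push('error'); },
    finally: () => { calls.push('finally'); },
  };
}
```

If your real hooks implement `finally`, the ordering assertions in the tests above would need a trailing `finally` entry on both paths.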

Testing Evaluation Context

Evaluation context carries user attributes that providers use for targeting. Test context propagation explicitly.

describe('Context propagation', () => {
  // Shared across tests so both cases can inspect what the provider received
  let capturedContexts: EvaluationContext[];

  beforeEach(async () => {
    capturedContexts = [];
    const spyProvider = {
      metadata: { name: 'spy' },
      resolveBooleanEvaluation: (flagKey: string, defaultValue: boolean, context: EvaluationContext) => {
        capturedContexts.push(context);
        return { value: defaultValue };
      },
    } as unknown as Provider;

    await OpenFeature.setProviderAndWait(spyProvider);
  });

  afterEach(() => {
    OpenFeature.setContext({}); // reset global context so tests stay independent
  });

  it('passes context to provider', async () => {
    const client = OpenFeature.getClient();
    await client.getBooleanValue('my-flag', false, {
      targetingKey: 'user-123',
      plan: 'pro',
      country: 'US',
    });

    expect(capturedContexts[0]).toMatchObject({
      targetingKey: 'user-123',
      plan: 'pro',
      country: 'US',
    });
  });

  it('merges global and invocation context', async () => {
    OpenFeature.setContext({ environment: 'test' });

    const client = OpenFeature.getClient();
    // The SDK merges global and invocation context before calling the provider
    await client.getBooleanValue('my-flag', false, { userId: 'user-456' });

    expect(capturedContexts[0]).toMatchObject({
      environment: 'test',
      userId: 'user-456',
    });
  });
});
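The merge test above only checks that keys from both levels arrive. The specification also defines precedence: context is merged in the order API (global), transaction, client, invocation, then before hooks, with later levels overwriting duplicate keys. A minimal sketch of that rule, using a hypothetical `mergeContexts` helper and a simplified `Context` type rather than the SDK's own merging code:

```typescript
// Simplified context type for this sketch (the SDK's type is richer).
type Context = Record<string, string | number | boolean>;

// Merges context layers from lowest to highest precedence;
// later layers overwrite duplicate keys from earlier ones.
function mergeContexts(...layers: Context[]): Context {
  return Object.assign({}, ...layers);
}

// Example: the invocation-level "plan" wins over the global one.
const merged = mergeContexts(
  { environment: 'test', plan: 'free' }, // API (global) level
  { userId: 'user-456', plan: 'pro' },   // invocation level
);
```

A test that sets the same key at two levels and asserts which value the provider receives will catch precedence regressions that a plain key-presence check misses.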

Testing Provider Switching

OpenFeature's key value proposition is provider portability. Test that your application works correctly after switching providers.

it('maintains correct behavior after provider switch', async () => {
  // Start with ProviderA
  await OpenFeature.setProviderAndWait(new InMemoryProvider({
    'feature-x': { defaultVariant: 'on', variants: { on: true, off: false } }
  }));

  const client = OpenFeature.getClient();
  expect(await client.getBooleanValue('feature-x', false)).toBe(true);

  // Switch to ProviderB
  await OpenFeature.setProviderAndWait(new InMemoryProvider({
    'feature-x': { defaultVariant: 'off', variants: { on: true, off: false } }
  }));

  expect(await client.getBooleanValue('feature-x', true)).toBe(false);
});
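One way to generalize the switching test is to run a single expectation table against every provider you intend to support and report where they disagree. The sketch below models that idea with a hypothetical `BooleanSource` shape; `mapSource` and `findMismatches` are illustrative names, not SDK APIs.

```typescript
// Hypothetical minimal provider shape for this sketch only.
interface BooleanSource {
  name: string;
  resolve(flagKey: string, defaultValue: boolean): boolean;
}

// Builds a map-backed source; unknown keys fall back to the default.
function mapSource(name: string, flags: Record<string, boolean>): BooleanSource {
  return {
    name,
    resolve: (flagKey: string, defaultValue: boolean) => flags[flagKey] ?? defaultValue,
  };
}

// Runs one expectation table against every source and collects mismatches.
function findMismatches(
  sources: BooleanSource[],
  expected: Record<string, boolean>,
): string[] {
  const mismatches: string[] = [];
  for (const source of sources) {
    for (const key of Object.keys(expected)) {
      const want = expected[key];
      // Pass the opposite value as the default so a missing flag shows up as a mismatch
      const got = source.resolve(key, !want);
      if (got !== want) mismatches.push(`${source.name}:${key}`);
    }
  }
  return mismatches;
}
```

An empty result means every source agrees with the table; any entry names the provider and flag that diverged.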

Integration with HelpMeTest

For E2E validation of OpenFeature-driven features, HelpMeTest's monitoring tests the user-visible outcome regardless of which provider is active underneath:

As a logged-in user
Navigate to /dashboard
Verify the beta analytics panel is visible
Interact with the chart
Verify data loads correctly

This test validates the full stack — OpenFeature API → provider → flag system → UI — without being coupled to vendor-specific implementation details.

CI/CD Patterns

# Test with multiple providers in CI
strategy:
  matrix:
    provider: [in-memory, flagd]
    
steps:
  - name: Start flagd (if needed)
    if: matrix.provider == 'flagd'
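    # flagd needs the start command and a flag source; ./flags is a placeholder directory containing flags.flagd.json
    run: docker run -d -p 8013:8013 -v "$PWD/flags:/etc/flagd" ghcr.io/open-feature/flagd:latest start --uri file:./etc/flagd/flags.flagd.json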

  - name: Run tests
    env:
      OPENFEATURE_PROVIDER: ${{ matrix.provider }}
    run: npm test
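The workflow sets `OPENFEATURE_PROVIDER`, but the test setup still has to read it. A minimal sketch of that step, assuming a hypothetical `chooseProvider` helper that returns a plain descriptor your setup code would turn into a real provider instance. The `localhost` host is an assumption; port 8013 matches the docker port mapping above.

```typescript
// Hypothetical descriptor the test setup could turn into a real provider.
type ProviderChoice =
  | { kind: 'in-memory' }
  | { kind: 'flagd'; host: string; port: number };

// Reads OPENFEATURE_PROVIDER (set by the CI matrix) from an env map.
// Defaults to the in-memory provider and rejects unknown names loudly.
function chooseProvider(env: Record<string, string | undefined>): ProviderChoice {
  const name = env['OPENFEATURE_PROVIDER'] ?? 'in-memory';
  switch (name) {
    case 'in-memory':
      return { kind: 'in-memory' };
    case 'flagd':
      // Port 8013 matches the workflow's port mapping;
      // localhost is an assumption about where the container runs.
      return { kind: 'flagd', host: 'localhost', port: 8013 };
    default:
      throw new Error(`Unknown OPENFEATURE_PROVIDER: ${name}`);
  }
}
```

In a test setup file you would typically call `chooseProvider(process.env)`. Failing on unknown names keeps a typo in the matrix from silently falling back to the in-memory provider.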

Summary

Testing OpenFeature requires: application-level tests using the in-memory provider to cover both flag states and all flag types, provider conformance tests for custom implementations, hook lifecycle tests covering before/after/error paths, explicit context propagation tests, and provider-switching tests to validate portability. The abstraction OpenFeature provides is only valuable if your tests verify that the abstraction holds — a broken provider should surface in tests, not production.
