Test Fixtures vs Factories: Which Approach to Test Data Is Right for You?
Every test needs data. The question isn't whether to manage test data — it's how. Two approaches dominate the industry: fixtures and factories. Both work. Both have sharp edges. Choosing the wrong one at the wrong scale is one of the most common sources of test suite rot.
This guide walks through both approaches with real code, side-by-side comparisons, and a framework for deciding which to reach for in a given situation.
What Are Test Fixtures?
A fixture is static test data defined ahead of time and loaded before your tests run. The data is fixed — it doesn't change between runs. Fixtures typically live in JSON, YAML, or SQL files that your test framework loads into the database or passes to tests as constants.
The fixture approach assumes you know exactly what data your tests need and that data is stable enough to write down once and forget.
// fixtures/users.json
[
  {
    "id": 1,
    "name": "Alice Johnson",
    "email": "alice@example.com",
    "role": "admin",
    "created_at": "2024-01-15T10:30:00Z"
  },
  {
    "id": 2,
    "name": "Bob Smith",
    "email": "bob@example.com",
    "role": "user",
    "created_at": "2024-02-20T14:00:00Z"
  }
]

# fixtures/products.yaml
- id: 1
  name: Widget Pro
  price: 29.99
  stock: 100
  category: electronics
- id: 2
  name: Gadget Basic
  price: 9.99
  stock: 0
  category: electronics

What Are Factories?
A factory is code that generates test data dynamically when a test requests it. Instead of writing down "Alice Johnson" in a JSON file, you write code that produces a realistic user object — different values each time, unless you seed for reproducibility.
Factories are programmable. They can build complex object graphs, apply overrides, create related records, and generate sequences that guarantee uniqueness.
# factories.py
import random

import factory
from factory import Faker

from myapp.models import User, Product


class UserFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = User

    name = Faker("name")
    email = Faker("safe_email")
    role = "user"
    created_at = Faker("date_time_this_year")


class ProductFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = Product

    name = Faker("catch_phrase")
    # LazyFunction calls the function again for every object it builds
    price = factory.LazyFunction(lambda: round(random.uniform(1, 500), 2))
    stock = factory.LazyFunction(lambda: random.randint(0, 200))
    category = factory.LazyFunction(
        lambda: random.choice(["electronics", "books", "clothing"])
    )

Side-by-Side Code Comparison
The difference becomes concrete when you see the same test written both ways.
pytest: Static Fixtures vs factory_boy
With static JSON fixtures:
# conftest.py
import json

import pytest


@pytest.fixture
def users():
    with open("fixtures/users.json") as f:
        return json.load(f)


@pytest.fixture
def admin_user(users):
    return next(u for u in users if u["role"] == "admin")


@pytest.fixture
def regular_user(users):
    return next(u for u in users if u["role"] == "user")


# test_permissions.py
def test_admin_can_delete_post(client, admin_user, auth_headers_for):
    headers = auth_headers_for(admin_user)
    response = client.delete("/api/posts/1", headers=headers)
    assert response.status_code == 204


def test_user_cannot_delete_post(client, regular_user, auth_headers_for):
    headers = auth_headers_for(regular_user)
    response = client.delete("/api/posts/1", headers=headers)
    assert response.status_code == 403

With factory_boy:
# conftest.py
import pytest

from factories import UserFactory


@pytest.fixture
def admin_user(db):
    return UserFactory(role="admin")


@pytest.fixture
def regular_user(db):
    return UserFactory(role="user")


# test_permissions.py
def test_admin_can_delete_post(client, admin_user):
    client.force_login(admin_user)
    response = client.delete("/api/posts/1")
    assert response.status_code == 204


def test_user_cannot_delete_post(client, regular_user):
    client.force_login(regular_user)
    response = client.delete("/api/posts/1")
    assert response.status_code == 403

Notice what disappears in the factory version: the JSON file, the fixture loading, the filtering to find the right user, and the auth_headers_for helper (factories create real DB records that the auth system can use directly).
Jest: Static JSON vs @faker-js/faker
With static JSON:
// __fixtures__/users.json
[
  { "id": "user-1", "name": "Alice", "email": "alice@example.com", "tier": "premium" },
  { "id": "user-2", "name": "Bob", "email": "bob@example.com", "tier": "free" }
]

// user.test.js
import users from '../__fixtures__/users.json'

describe('UserService', () => {
  it('grants premium features to premium users', () => {
    const premiumUser = users.find(u => u.tier === 'premium')
    const features = UserService.getFeaturesFor(premiumUser)
    expect(features).toContain('advanced-analytics')
  })

  it('restricts premium features for free users', () => {
    const freeUser = users.find(u => u.tier === 'free')
    const features = UserService.getFeaturesFor(freeUser)
    expect(features).not.toContain('advanced-analytics')
  })
})

With @faker-js/faker:
import { faker } from '@faker-js/faker'

function createUser(overrides = {}) {
  return {
    id: faker.string.uuid(),
    name: faker.person.fullName(),
    email: faker.internet.email(),
    tier: faker.helpers.arrayElement(['free', 'premium', 'enterprise']),
    createdAt: faker.date.past().toISOString(),
    ...overrides,
  }
}

describe('UserService', () => {
  it('grants premium features to premium users', () => {
    const user = createUser({ tier: 'premium' })
    const features = UserService.getFeaturesFor(user)
    expect(features).toContain('advanced-analytics')
  })

  it('restricts premium features for free users', () => {
    const user = createUser({ tier: 'free' })
    const features = UserService.getFeaturesFor(user)
    expect(features).not.toContain('advanced-analytics')
  })

  it('handles users regardless of their name or email', () => {
    // The factory generates random data, so a pass here suggests the logic doesn't depend on specific strings
    const users = Array.from({ length: 10 }, () => createUser({ tier: 'premium' }))
    users.forEach(user => {
      expect(UserService.getFeaturesFor(user)).toContain('advanced-analytics')
    })
  })
})

The factory version does something the fixture version cannot: the last test exercises the behavior across ten randomly generated users. That is evidence, though not proof, that the logic doesn't depend on the two names you happened to write in your JSON file.
Rails: YAML Fixtures vs FactoryBot
With YAML fixtures:
# test/fixtures/users.yml
alice:
  name: Alice Johnson
  email: alice@example.com
  role: admin
  confirmed_at: <%= Time.now %>
bob:
  name: Bob Smith
  email: bob@example.com
  role: member
  confirmed_at: <%= Time.now %>

# test/models/user_test.rb
class UserTest < ActiveSupport::TestCase
  test "admin can manage other users" do
    assert users(:alice).can_manage?(users(:bob))
  end

  test "member cannot manage other users" do
    assert_not users(:bob).can_manage?(users(:alice))
  end
end

With FactoryBot:
# spec/factories/users.rb
FactoryBot.define do
  factory :user do
    name { Faker::Name.name }
    email { Faker::Internet.email }
    role { :member }
    confirmed_at { Time.current }

    trait :admin do
      role { :admin }
    end

    trait :unconfirmed do
      confirmed_at { nil }
    end
  end
end

# spec/models/user_spec.rb
RSpec.describe User do
  describe '#can_manage?' do
    it 'allows admins to manage other users' do
      admin = create(:user, :admin)
      member = create(:user)
      expect(admin.can_manage?(member)).to be true
    end

    it 'prevents members from managing other users' do
      member = create(:user)
      other_member = create(:user)
      expect(member.can_manage?(other_member)).to be false
    end

    it 'prevents unconfirmed users from managing anyone' do
      unconfirmed_admin = create(:user, :admin, :unconfirmed)
      member = create(:user)
      expect(unconfirmed_admin.can_manage?(member)).to be false
    end
  end
end

The FactoryBot version uses traits to compose states (admin, unconfirmed) without maintaining separate fixture records for every possible combination.
The Tradeoffs Table
| Dimension | Fixtures | Factories |
|---|---|---|
| Predictability | High — same data every run | Lower without seeding |
| Setup speed | Fast — load once | Slower — creates records per test |
| Coupling risk | High — tests depend on specific IDs/values | Low — tests declare what they need |
| Cross-team visibility | Easy — non-engineers can read JSON/YAML | Lower — requires reading factory code |
| Unique data guarantees | Manual — you must ensure uniqueness in files | Built-in via sequences |
| Complex relationships | Hard — must manually maintain FK consistency | Easy — SubFactory handles it |
| Partial data variants | Hard — copy-paste whole records | Easy — pass overrides |
| CI database seeding | Native — many frameworks load fixtures automatically | Requires explicit setup |
| Debugging failures | Easy — known data, easy to reason about | Harder — what was the random value? |
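The "Predictability" and "Debugging failures" rows are where seeding helps. As a minimal, dependency-free sketch (the `make_user` and `make_users` helpers, the name pool, and the field names are all hypothetical), a factory that draws from a seeded `random.Random` produces the same records on every run, so logging the seed is enough to replay a failure. factory_boy exposes a comparable hook, `factory.random.reseed_random`, for its own generators.

```python
import random

NAMES = ["Alice", "Bob", "Carol", "Dave"]  # hypothetical sample pool


def make_user(rng, **overrides):
    """Build a user dict using the supplied random generator."""
    user = {
        "name": rng.choice(NAMES),
        "age": rng.randint(18, 90),
        "tier": rng.choice(["free", "premium"]),
    }
    user.update(overrides)  # explicit values always win over random ones
    return user


def make_users(seed, count):
    """Create `count` users from a fresh generator seeded with `seed`."""
    rng = random.Random(seed)
    return [make_user(rng) for _ in range(count)]


# Two runs with the same seed yield identical data.
first_run = make_users(seed=1234, count=5)
second_run = make_users(seed=1234, count=5)
```

Printing the seed at the start of a test session is a cheap way to answer "what was the random value?" after the fact.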
When to Prefer Fixtures
You have a small, stable dataset. If your test data is a handful of records that rarely change, fixtures are the simpler choice. No code, no dependencies, just data.
You need known IDs. Some tests must reference specific IDs — integration tests that call external systems, tests that verify URL generation, tests that check rendered HTML. Fixtures give you id: 42 reliably. Factories give you whatever the database assigns.
Cross-team visibility matters. Product managers, QA engineers, and business analysts can read a YAML fixture file. They cannot read a factory definition that chains Faker providers and LazyFunctions. If non-engineers need to understand or modify test data, fixtures win.
You're testing read-only behavior. If your test only reads data and asserts on it (no writes, no state changes), fixture data loaded once for the whole suite is dramatically faster than factory-created records per test.
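CI performance is critical. Rails fixtures are fast because they are bulk-inserted into the database once per run, and each test then executes inside a transaction that is rolled back afterward. With thousands of records, that usually beats creating rows through a factory in every test.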
When to Prefer Factories
You have many variants of the same entity. If you need an admin user, a suspended user, a user with no payment method, a user with an expired trial, and a user who was invited but never confirmed — that's five fixture records with mostly duplicated fields. That's one factory with five traits.
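The "one factory, five traits" idea can be sketched without any library. This is a hand-rolled illustration, not the factory_boy or FactoryBot API: `BASE_USER`, `TRAITS`, `build_user`, and every field name are hypothetical.

```python
import itertools

_counter = itertools.count(1)  # gives each user a distinct email

BASE_USER = {
    "role": "user",
    "status": "active",
    "payment_method": "card",
    "trial_expired": False,
    "confirmed": True,
}

# Each trait lists only the fields that differ from the base record.
TRAITS = {
    "admin": {"role": "admin"},
    "suspended": {"status": "suspended"},
    "no_payment_method": {"payment_method": None},
    "expired_trial": {"trial_expired": True},
    "unconfirmed": {"confirmed": False},
}


def build_user(*traits, **overrides):
    """Start from the base user, apply traits in order, then overrides."""
    n = next(_counter)
    user = dict(BASE_USER, email=f"user{n}@example.com")
    for trait in traits:
        user.update(TRAITS[trait])  # raises KeyError for an unknown trait
    user.update(overrides)
    return user


suspended_admin = build_user("admin", "suspended")
invited = build_user("unconfirmed", email="invitee@example.com")
```

Traits combine freely, so a sixth variant costs one new dictionary entry rather than another copied record.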
Your domain model has complex relationships. Order → OrderItems → Products → Inventory → Warehouse. Building this graph manually in fixture files is error-prone and brittle. A factory with SubFactory relationships handles it in a few lines.
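To make the relationship point concrete, here is a library-free sketch of a shortened version of that chain (Order, OrderItem, Product, Warehouse). The dataclasses and `make_*` helpers are hypothetical stand-ins; in factory_boy the same role is played by `factory.SubFactory`. Each helper builds its parent only when the caller does not pass one.

```python
import itertools
from dataclasses import dataclass, field

_ids = itertools.count(1)  # shared counter standing in for database IDs


@dataclass
class Warehouse:
    id: int
    city: str


@dataclass
class Product:
    id: int
    name: str
    warehouse: Warehouse


@dataclass
class OrderItem:
    product: Product
    quantity: int


@dataclass
class Order:
    id: int
    items: list = field(default_factory=list)


def make_warehouse(**overrides):
    values = {"id": next(_ids), "city": "Rotterdam"}
    values.update(overrides)
    return Warehouse(**values)


def make_product(**overrides):
    values = {"id": next(_ids), "name": "Widget"}
    values.update(overrides)
    if "warehouse" not in values:
        # No parent supplied, so build one on demand.
        values["warehouse"] = make_warehouse()
    return Product(**values)


def make_order(item_count=2, **product_overrides):
    """Build an order whose items each get a freshly made product."""
    order = Order(id=next(_ids))
    for _ in range(item_count):
        product = make_product(**product_overrides)
        order.items.append(OrderItem(product=product, quantity=1))
    return order


# Three items whose products all share one explicitly chosen warehouse.
shared = make_warehouse(city="Lyon")
order = make_order(item_count=3, warehouse=shared)
```

The test states only what it cares about (three items, one shared warehouse), and the helpers fill in the rest of the graph with consistent references.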
You want to prove behavior is general, not specific. A test that passes with hardcoded data might fail with real user data. Factories reveal these assumptions. If your discount calculation only works when user.name == "Alice", randomly generated names will almost certainly expose it. A fixture that happens to contain Alice won't.
You're adding tests to an existing codebase. Fixtures require loading the entire fixture set (or carefully scoping it). Factories let you create exactly what one test needs without touching shared state.
You need unique data guarantees. Email uniqueness constraints, slug uniqueness, SKU uniqueness — factories handle this with sequences:
class UserFactory(factory.django.DjangoModelFactory):
    email = factory.Sequence(lambda n: f"user{n}@example.com")
    username = factory.Sequence(lambda n: f"user_{n:04d}")

Each factory call gets a unique n, guaranteeing no collisions.
Factory Sequences for Unique Data
Sequences are one of factory_boy's most important features. They replace the problem of "how do I generate unique emails" with a clean, predictable solution:
import factory
from factory import Faker, Sequence

from myapp.models import User


class UserFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = User

    # Each user gets a unique sequential email
    email = Sequence(lambda n: f"user{n}@testdomain.com")
    # Sequence with a formatted number
    username = Sequence(lambda n: f"testuser_{n:05d}")
    # Faker supplies a realistic name alongside the sequence fields (names may repeat)
    name = Faker("name")

# The counter keeps increasing for the life of the process; it is not reset
# automatically. Call UserFactory.reset_sequence() when a test needs a known start.

In JavaScript, you can implement sequences manually:
import { faker } from '@faker-js/faker'

let userCounter = 0

function createUser(overrides = {}) {
  const n = ++userCounter
  return {
    id: `user-${n}`,
    email: `user${n}@testdomain.com`,
    username: `testuser_${String(n).padStart(5, '0')}`,
    name: faker.person.fullName(),
    ...overrides,
  }
}

beforeEach(() => {
  userCounter = 0 // Reset between tests
})

The Hybrid Approach: Minimal Fixtures + Domain Factories
The most practical real-world pattern isn't "fixtures vs factories" — it's a deliberate combination.
Use fixtures for:
- Authentication states (known users with known roles that your auth middleware expects)
- Reference data (countries, currencies, plan tiers — data that rarely changes)
- Data that must have specific IDs for external system integration tests
Use factories for:
- All domain object creation in unit and integration tests
- Any data that needs variants or overrides
- Data that must be unique
# conftest.py: minimal fixture layer for auth
import json

import pytest

# Assumed to live alongside UserFactory in the project's factories module
from factories import OrderFactory, OrderItemFactory, UserFactory


@pytest.fixture(scope="session")
def admin_credentials():
    """Known admin user: fixed credentials, not generated."""
    return {"email": "admin@testdomain.com", "password": "test-admin-pass-123"}


@pytest.fixture(scope="session")
def reference_data(django_db_setup, django_db_blocker):
    """Load stable reference data once for the whole session."""
    with open("fixtures/reference_data.json") as f:
        data = json.load(f)
    # pytest-django's function-scoped `db` fixture cannot be requested from a
    # session-scoped fixture, so unblock database access explicitly instead.
    with django_db_blocker.unblock():
        load_reference_data(data)  # project-specific loader, not shown
    return data


# All domain-specific fixtures use factories
@pytest.fixture
def customer(db):
    return UserFactory(role="customer")


@pytest.fixture
def premium_customer(db):
    return UserFactory(role="customer", tier="premium")


@pytest.fixture
def order_with_items(db, customer):
    order = OrderFactory(user=customer)
    OrderItemFactory.create_batch(3, order=order)
    return order

This hybrid gives you fast, stable auth fixtures that don't create coupling in your domain tests, and flexible factories that let your domain tests declare exactly what they need.
Fixture Seeding in CI
Fixtures shine in CI for seeding reference data that must exist before any test runs. Most frameworks support this natively:
Django:
# Seed reference data into the CI database (for example, before end-to-end runs)
python manage.py loaddata fixtures/reference_data.json
# Note: `manage.py test` builds its own test database, so tests that need this
# data should also declare it, e.g. fixtures = ["reference_data.json"] on the TestCase
python manage.py test

Rails:
# Fixtures load automatically in Rails test environment
rails test
# Or for RSpec with DatabaseCleaner:
# Set DatabaseCleaner strategy to :transaction for specs using factories
# Set to :truncation + reload fixtures for specs using fixtures

Node.js (Jest with a custom global setup):
// jest.globalSetup.js
const { loadFixtures } = require('./test/helpers')

module.exports = async () => {
  await loadFixtures([
    'fixtures/countries.json',
    'fixtures/currencies.json',
    'fixtures/plans.json',
  ])
}

The key insight: fixtures in CI are for reference data that the application needs to function. Domain data for tests is created by factories, scoped to each test, and cleaned up after. Mix the two and you get CI pipelines that are both fast and correct.
Making the Decision
Ask these questions in order:
- Does this data need to exist before any test runs? → Fixture (auth users, reference data)
- Does this test need a specific ID? → Fixture
- Does this data have more than 3 variants in my tests? → Factory
- Does this entity have relationships to other entities? → Factory
- Will non-engineers need to read or modify this data? → Fixture
- Everything else? → Factory
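Because the questions are asked in order, the first one that applies decides the outcome. One way to see the precedence is to write the list as a function; this is only an illustrative encoding, and the `DataNeed` fields and `choose_approach` name are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class DataNeed:
    must_exist_before_tests: bool = False
    needs_specific_id: bool = False
    variant_count: int = 1
    has_relationships: bool = False
    edited_by_non_engineers: bool = False


def choose_approach(need: DataNeed) -> str:
    """Apply the questions top to bottom; the first match wins."""
    if need.must_exist_before_tests:
        return "fixture"
    if need.needs_specific_id:
        return "fixture"
    if need.variant_count > 3:
        return "factory"
    if need.has_relationships:
        return "factory"
    if need.edited_by_non_engineers:
        return "fixture"
    return "factory"  # the default for everything else
```

For example, data with five variants that non-engineers also edit resolves to a factory, because the variant question comes earlier in the list.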
The majority of your test data will fall into the factory category. Fixtures serve a specific, narrow purpose: stable reference data and auth state. Factories serve everything else.
The teams that write the best test suites don't pick a side. They use fixtures as a foundation for the handful of things that need to be stable and known, then use factories for everything their tests actually exercise. The fixture layer is small and changes rarely. The factory layer is expressive and grows with the domain.
Start with factories. Add fixtures when you hit a genuine need for known, stable data. You'll find that the fixture layer stays small, the factory layer stays flexible, and your test suite stays honest about what it's actually testing.