The Test Data Problem

As test suites grow, managing test data becomes one of the biggest maintenance challenges. Consider a suite of 500 tests, each constructing a User object by hand. If the User model gains a new required field, all 500 tests must be updated. The problem compounds in complex data models with relationships between entities.

Hardcoded test data creates three problems: duplication (the same user data appears in hundreds of tests), brittleness (model changes break many tests), and opacity (tests are cluttered with data that is irrelevant to what they are verifying).

Test data factories and fixtures solve these problems by centralizing data creation and lifecycle management.

Test Data Factories

A test data factory is a centralized module that creates test objects with sensible defaults. Each test only specifies the fields that matter for its specific scenario, and the factory fills in everything else.
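As a rough illustration of the idea, here is a minimal sketch in plain Python. The User dataclass and the make_user and make_admin helpers are hypothetical names invented for this example; a real project would build its own model class instead.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class User:
    # Hypothetical model used only for this sketch.
    name: str
    email: str
    role: str = "user"
    active: bool = True
    permissions: tuple = ("read",)


_sequence = count(1)  # gives every generated user a unique suffix


def make_user(**overrides) -> User:
    """Return a valid User; callers override only the fields they care about."""
    n = next(_sequence)
    defaults = {"name": f"Test User {n}", "email": f"user{n}@test.com"}
    return User(**{**defaults, **overrides})


def make_admin(**overrides) -> User:
    """Persona helper layered on top of the default factory."""
    admin_defaults = {"role": "admin", "permissions": ("read", "write", "delete")}
    return make_user(**{**admin_defaults, **overrides})


# Each call states only what matters for the scenario.
inactive = make_user(active=False)
admin = make_admin(name="Alice")
```

If User later gains a new required field, only the defaults inside make_user need to change; the calling tests stay as they are.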

The Factory Pattern

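// UserFactory.java
// Assumes User exposes a Lombok-style builder declared with @Builder(toBuilder = true).
import java.time.Instant;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class UserFactory {
    // AtomicInteger keeps generated names and emails unique when tests run in parallel
    private static final AtomicInteger counter = new AtomicInteger();

    // A valid user with unique name and email; the base for all other personas
    public static User createDefault() {
        int n = counter.incrementAndGet();
        return User.builder()
            .name("Test User " + n)
            .email("user" + n + "@test.com")
            .role("user")
            .active(true)
            .createdAt(Instant.now())
            .build();
    }

    // Same as the default user, but with the admin role and permissions
    public static User createAdmin() {
        return createDefault().toBuilder()
            .role("admin")
            .permissions(List.of("read", "write", "delete", "manage_users"))
            .build();
    }

    // Same as the default user, but deactivated as of now
    public static User createInactive() {
        return createDefault().toBuilder()
            .active(false)
            .deactivatedAt(Instant.now())
            .build();
    }
}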

// Usage in tests
@Test
void adminCanDeleteUsers() {
    User admin = UserFactory.createAdmin();
    User target = UserFactory.createDefault();
    // Test only cares that admin has correct role — factory handles the rest
    assertTrue(userService.canDelete(admin, target));
}

The Builder Pattern

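import java.util.List;

public class UserBuilder {
    // Defaults produce a valid user; tests override only what they need
    private String name = "Default User";
    private String email = "default@test.com";
    private String role = "user";
    private boolean active = true;
    private List<String> permissions = List.of("read");

    public static UserBuilder aUser() {
        return new UserBuilder();
    }

    public UserBuilder withName(String name) {
        this.name = name;
        return this;
    }

    // Needed when a test builds several users and emails must be unique
    public UserBuilder withEmail(String email) {
        this.email = email;
        return this;
    }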

    public UserBuilder withRole(String role) {
        this.role = role;
        return this;
    }

    public UserBuilder withPermissions(String... perms) {
        this.permissions = List.of(perms);
        return this;
    }

    public UserBuilder inactive() {
        this.active = false;
        return this;
    }

    public User build() {
        return new User(name, email, role, active, permissions);
    }
}

// Usage — reads like a specification
User admin = UserBuilder.aUser()
    .withName("Alice")
    .withRole("admin")
    .withPermissions("read", "write", "delete")
    .build();

Python Example (Factory Boy)

import factory
from models import User, Order

class UserFactory(factory.Factory):
    class Meta:
        model = User

    name = factory.Faker('name')
    email = factory.LazyAttribute(lambda obj: f"{obj.name.lower().replace(' ', '.')}@test.com")
    role = "user"
    active = True

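class AdminFactory(UserFactory):
    role = "admin"
    # LazyFunction builds a fresh list per instance, so one test cannot mutate another's permissions
    permissions = factory.LazyFunction(lambda: ["read", "write", "delete", "manage_users"])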

class OrderFactory(factory.Factory):
    class Meta:
        model = Order

    user = factory.SubFactory(UserFactory)
    total = factory.Faker('pydecimal', left_digits=3, right_digits=2, positive=True)
    status = "pending"

# Usage
user = UserFactory()           # Default user with random name
admin = AdminFactory()         # Admin with all permissions
order = OrderFactory(status="shipped")  # Override only what matters

Test Fixtures

Fixtures manage the test lifecycle — setting up preconditions before tests and cleaning up afterward.
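To make the lifecycle concrete, the sketch below imitates it with the Python standard library only. It is not how any particular test framework is implemented; the database and saved_user fixtures and the run_test helper are hypothetical names for this illustration. It shows setup running in dependency order, the test body running in the middle, and teardown running in reverse order.

```python
from contextlib import ExitStack, contextmanager

events = []  # records the order in which setup and teardown run


@contextmanager
def database():
    events.append("open db")
    try:
        yield {"users": []}        # stand-in for a real connection
    finally:
        events.append("close db")  # runs even if the test body raises


@contextmanager
def saved_user(db):
    user = {"name": "Test User"}
    db["users"].append(user)
    events.append("insert user")
    try:
        yield user
    finally:
        db["users"].remove(user)
        events.append("delete user")


def run_test(test_body):
    """Set up fixtures in dependency order, run the test, tear down in reverse."""
    with ExitStack() as stack:
        db = stack.enter_context(database())
        user = stack.enter_context(saved_user(db))
        test_body(db, user)


def check_user_is_saved(db, user):
    assert user in db["users"]


run_test(check_user_is_saved)
print(events)  # ['open db', 'insert user', 'delete user', 'close db']
```

The framework-specific fixtures in the following sections express the same setup and teardown pairing with less ceremony.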

JUnit 5 Fixtures

@TestInstance(TestInstance.Lifecycle.PER_CLASS)
class OrderServiceTest {
    private DatabaseConnection db;
    private OrderService orderService;
    private User testUser;

    @BeforeAll
    void setupOnce() {
        db = DatabaseConnection.create("test-db");
        orderService = new OrderService(db);
    }

    @BeforeEach
    void setupEach() {
        db.beginTransaction();
        testUser = UserFactory.createDefault();
        db.insert(testUser);
    }

    @AfterEach
    void teardownEach() {
        db.rollbackTransaction();  // Clean slate for each test
    }

    @AfterAll
    void teardownOnce() {
        db.close();
    }

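    @Test
    void shouldCreateOrder() {
        // item1 and item2 stand for order items created elsewhere (for example
        // by a product factory); they are not defined in this snippet.
        Order order = orderService.create(testUser, List.of(item1, item2));
        assertNotNull(order.getId());
    }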
}

Pytest Fixtures

import pytest

@pytest.fixture
def db():
    connection = create_test_database()
    yield connection
    connection.cleanup()

@pytest.fixture
def user(db):
    user = UserFactory.create()
    db.save(user)
    return user

@pytest.fixture
def admin_user(db):
    admin = AdminFactory.create()
    db.save(admin)
    return admin

def test_admin_can_delete_user(admin_user, user):
    assert user_service.can_delete(admin_user, user) is True

def test_regular_user_cannot_delete(user):
    other = UserFactory.create()
    assert user_service.can_delete(user, other) is False

Playwright Fixtures

// fixtures.js
const { test as base } = require('@playwright/test');

exports.test = base.extend({
    authenticatedPage: async ({ page }, use) => {
        await page.goto('/login');
        await page.fill('#email', 'admin@test.com');
        await page.fill('#password', 'password');
        await page.click('#submit');
        await page.waitForURL('/dashboard');
        await use(page);  // Test runs here
        // No explicit cleanup needed: Playwright disposes of the page and its browser context after each test
    },

    testUser: async ({ request }, use) => {
        const response = await request.post('/api/users', {
            data: { name: 'Test User', email: 'test@example.com' }
        });
        const user = await response.json();
        await use(user);
        await request.delete(`/api/users/${user.id}`);  // Cleanup
    }
});

Test Data Strategies

Strategy 1: Create and Destroy per Test

Each test creates the data it needs and deletes it afterward. This is the most isolated and reliable strategy, but also the slowest, because every test pays the full setup and cleanup cost.
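One possible shape for this strategy, sketched with unittest and an in-memory SQLite database standing in for a long-lived test database. The users table and the create_user and delete_user helpers are assumptions made for the example.

```python
import sqlite3
import unittest

# Shared database that outlives individual tests (in-memory for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
conn.commit()


class UserTests(unittest.TestCase):
    def create_user(self, email):
        """Insert a row and register its deletion so cleanup runs even on failure."""
        cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.commit()
        user_id = cur.lastrowid
        self.addCleanup(self.delete_user, user_id)
        return user_id

    def delete_user(self, user_id):
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
        conn.commit()

    def test_lookup_by_email(self):
        user_id = self.create_user("a@test.com")
        rows = conn.execute(
            "SELECT id FROM users WHERE email = ?", ("a@test.com",)
        ).fetchall()
        self.assertEqual(rows[0][0], user_id)

    def test_same_email_can_be_reused(self):
        # Would violate the UNIQUE constraint if the other test leaked its row.
        self.create_user("a@test.com")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(UserTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
remaining = conn.execute("SELECT COUNT(*) FROM users").fetchall()[0][0]
```

Registering the deletion at creation time keeps the two steps together, so a failing assertion cannot leave rows behind for the next test.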

Strategy 2: Transaction Rollback

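Wrap each test in a database transaction and roll it back afterward. Cleanup is fast, but the approach only works for database-backed tests in which the test and the code under test share one connection and the code never commits on its own. It does not cover end-to-end tests that reach the database through a separate process.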
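A minimal sketch of the rollback approach using Python's built-in sqlite3 module with its default transaction handling. The orders table and the rolled_back helper are hypothetical; the point is only that uncommitted writes disappear when the test finishes.

```python
import sqlite3
from contextlib import contextmanager

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders (status) VALUES ('pending')")  # baseline data
conn.commit()


@contextmanager
def rolled_back(connection):
    """Run a test body, then discard every uncommitted change it made."""
    try:
        yield connection
    finally:
        connection.rollback()


def count_orders(connection):
    return connection.execute("SELECT COUNT(*) FROM orders").fetchall()[0][0]


with rolled_back(conn) as db:
    db.execute("INSERT INTO orders (status) VALUES ('shipped')")
    inside = count_orders(db)  # the test sees its own write: 2 rows

after = count_orders(conn)     # the write is gone once the block exits: 1 row
```

The sketch relies on the test body never calling commit; code under test that commits by itself would escape the rollback.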

Strategy 3: Shared Reference Data

Read-only reference data (countries, currencies, roles) is shared across tests and loaded once. Only mutable test data is created per test.
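The following sketch shows one way to load reference data once and keep it read-only, using only the standard library. The reference_data and make_order functions and the sample values are assumptions for illustration; a real suite would likely load the data from a seed file or database.

```python
from functools import lru_cache
from types import MappingProxyType

load_calls = 0  # counts how often the expensive load actually runs


@lru_cache(maxsize=None)
def reference_data():
    """Load read-only lookup tables once and share them across all tests."""
    global load_calls
    load_calls += 1
    # Hard-coded here; a real suite would read a seed file or database (assumption).
    return MappingProxyType({
        "currencies": ("USD", "EUR", "JPY"),
        "roles": ("user", "admin"),
    })


def make_order(currency="USD", total=10):
    """Mutable test data is still created fresh for each test."""
    if currency not in reference_data()["currencies"]:
        raise ValueError(f"unknown currency: {currency}")
    return {"currency": currency, "total": total, "status": "pending"}


first = make_order()
second = make_order(currency="EUR")
```

Wrapping the mapping in MappingProxyType and storing the values as tuples makes accidental writes fail loudly instead of silently leaking between tests.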

Strategy 4: Database Snapshots

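Take a database snapshot before the test suite runs and restore it between tests. Restoring is fast even for large datasets, but the snapshot must be regenerated whenever the schema or seed data changes, which makes this strategy complex to maintain.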
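As a small-scale sketch of the snapshot idea, the example below uses the Connection.backup method of Python's sqlite3 module to copy an in-memory database and later restore it. The users table and the restore helper are hypothetical, and production databases would use their own snapshot or dump tooling instead.

```python
import sqlite3


def count_users(connection):
    return connection.execute("SELECT COUNT(*) FROM users").fetchall()[0][0]


# Seed the live test database once, before the suite starts.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
live.executemany("INSERT INTO users (name) VALUES (?)", [("Alice",), ("Bob",)])
live.commit()

# Take the snapshot: copy the whole database into a second connection.
snapshot = sqlite3.connect(":memory:")
live.backup(snapshot)


def restore(target, source):
    """Overwrite the target database with the snapshot's contents."""
    target.rollback()      # drop any transaction a test left open
    source.backup(target)


# A test mutates the data and even commits its changes.
live.execute("DELETE FROM users")
live.commit()
after_test = count_users(live)

restore(live, snapshot)
after_restore = count_users(live)
```

Unlike transaction rollback, restoring a snapshot also undoes changes that the code under test committed.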

Exercises

Exercise 1: Build a Factory Library

  1. Create factories for User, Product, and Order entities with sensible defaults
  2. Implement the Builder pattern for User with a fluent API
  3. Create factory methods for common personas: admin, premium user, suspended user
  4. Write 5 tests using your factories, each overriding only relevant fields

Exercise 2: Fixture Hierarchy

  1. Implement fixtures for database setup and teardown with transaction rollback
  2. Create a fixture that provides an authenticated API client
  3. Build a fixture composition where higher-level fixtures depend on lower-level ones
  4. Verify isolation by running tests in random order

Exercise 3: Data Strategy Comparison

  1. Implement the same 3 tests twice: once with the Create and Destroy strategy and once with the Transaction Rollback strategy
  2. Measure execution time for each approach with 50 tests
  3. Introduce a test that corrupts shared data and verify isolation in each strategy
  4. Recommend a strategy based on your test suite’s characteristics