What Is Integration Testing?
Integration testing verifies that individual software components work correctly when combined. While unit tests prove that each function works in isolation, integration tests prove that those functions work together — that data flows correctly across module boundaries, that API contracts are honored, and that combined components produce the expected behavior.
Consider an e-commerce system where the Order Service calls the Inventory Service to check stock, then calls the Payment Service to charge the customer. Each service might pass all its unit tests individually. But what happens when the Order Service sends a request to the Inventory Service? Does the data format match? Does the Inventory Service return the response the Order Service expects? Does the error handling work when the Payment Service is down?
These questions are what integration testing answers.
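To make this concrete, here is a minimal sketch of such an integration test. The three service classes are hypothetical in-process stand-ins (the real system would make network calls), but the test exercises the same thing: the actual call chain across component boundaries, not any one class in isolation.

```python
# Hypothetical in-process versions of the three services, wired together
# so the test can exercise the real call chain end to end.

class InventoryService:
    def __init__(self, stock):
        self.stock = stock  # e.g. {"sku-1": 3}

    def check_stock(self, sku, qty):
        return self.stock.get(sku, 0) >= qty


class PaymentService:
    def __init__(self):
        self.charges = []

    def charge(self, customer_id, amount):
        self.charges.append((customer_id, amount))
        return {"status": "ok", "transaction_id": len(self.charges)}


class OrderService:
    def __init__(self, inventory, payment):
        self.inventory = inventory
        self.payment = payment

    def place_order(self, customer_id, sku, qty, unit_price):
        if not self.inventory.check_stock(sku, qty):
            return {"status": "rejected", "reason": "out_of_stock"}
        receipt = self.payment.charge(customer_id, qty * unit_price)
        return {"status": "confirmed", "transaction_id": receipt["transaction_id"]}


def test_order_flow():
    # The integration test wires real components together and checks the
    # data that crosses each boundary, not the internals of any one class.
    inventory = InventoryService(stock={"sku-1": 5})
    payment = PaymentService()
    orders = OrderService(inventory, payment)

    result = orders.place_order("cust-9", "sku-1", qty=2, unit_price=10.0)
    assert result["status"] == "confirmed"
    assert payment.charges == [("cust-9", 20.0)]  # amount crossed the boundary intact

    result = orders.place_order("cust-9", "sku-1", qty=99, unit_price=10.0)
    assert result["status"] == "rejected"
```

Each individual class here would pass its unit tests; the value of `test_order_flow` is that it checks the hand-offs between them.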
Why Unit Tests Are Not Enough
A classic analogy: imagine two teams building a bridge from opposite sides of a river. Each team builds a perfect half. But when they meet in the middle, the halves do not align — different heights, different widths, different bolt patterns.
Each half was built correctly (unit tests pass), but they do not integrate (integration tests would have caught the mismatch).
In software, integration failures happen because:
- Data format mismatches: Service A sends dates as “MM/DD/YYYY” but Service B expects “YYYY-MM-DD”
- API contract violations: The endpoint signature changed but the caller was not updated
- Timing issues: Module A assumes Module B responds in under 100ms, but Module B takes 500ms
- State management: Module A expects Module B to maintain session state, but Module B is stateless
- Error handling gaps: Module A does not handle the error codes Module B actually returns
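The first failure mode in the list is easy to demonstrate. In this sketch, `service_a_payload` and `service_b_parse` are hypothetical stand-ins for the two sides of a boundary; each would pass its own unit tests, and only a test at the boundary surfaces the mismatch:

```python
from datetime import datetime

# Service A (hypothetical) emits US-style dates; Service B expects ISO dates.
def service_a_payload():
    return {"shipped": "03/25/2024"}  # MM/DD/YYYY

def service_b_parse(payload):
    # Service B's parser only accepts "YYYY-MM-DD"; anything else raises.
    return datetime.strptime(payload["shipped"], "%Y-%m-%d").date()

def test_date_format_contract():
    # A boundary test feeds A's real output into B's real parser.
    try:
        service_b_parse(service_a_payload())
        compatible = True
    except ValueError:
        compatible = False
    assert not compatible  # contract violated: one side must change its format
```

In practice you would fix one side and flip the assertion, but the point stands: neither service's unit tests can see this bug.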
Integration Approaches
There are four main strategies for integrating and testing components. Each has distinct advantages and tradeoffs.
Diagram: three integration orders. Top-Down — Module A is tested first, with Stub B and Stub C standing in below it (high → low). Bottom-Up — Modules B and C are tested first, driven by a driver for A (low → high). Sandwich — a top layer (Module A) and a bottom layer (Module C) converge on the target middle layer (Module B) from both directions.
Big Bang Integration
How it works: All components are developed independently, then combined and tested together at once.
Advantages:
- Simple — no stubs or drivers needed
- Convenient when the system is small
Disadvantages:
- Defect isolation is extremely difficult (any component could be the cause)
- Integration issues are discovered late
- Not practical for large systems
Best for: Small projects with few components and tight deadlines.
Top-Down Integration
How it works: Start with the highest-level module and integrate downward. Lower-level modules that are not ready are replaced with stubs.
Process:
- Test the main module with stubs replacing its dependencies
- Replace stubs one at a time with real modules
- Test after each replacement
- Continue until all modules are integrated
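A stub in this process is just a minimal stand-in that returns canned data. Here is a sketch of the first step, with hypothetical module names: the top-level `ReportModule` is tested while its dependency is still a stub.

```python
# Top-down: test the top-level module first, with its not-yet-integrated
# dependency (a hypothetical DataModule) replaced by a stub.

class DataModuleStub:
    def fetch_rows(self):
        return [{"amount": 10}, {"amount": 32}]  # canned response

class ReportModule:
    def __init__(self, data_module):
        self.data_module = data_module

    def total(self):
        return sum(row["amount"] for row in self.data_module.fetch_rows())

def test_report_with_stub():
    report = ReportModule(DataModuleStub())
    assert report.total() == 42
```

When the real `DataModule` is ready, it replaces the stub and the same test runs again — step 2 of the process above.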
Advantages:
- Critical control flow is tested early
- Architecturally significant defects are found early
- You can demonstrate a working (partial) system early
Disadvantages:
- Requires writing many stubs
- Lower-level functionality is tested late
- Stubs may not accurately simulate real module behavior
Best for: Systems where the top-level architecture and control flow are most critical.
Bottom-Up Integration
How it works: Start with the lowest-level modules and integrate upward. Higher-level modules that are not ready are replaced with drivers (test programs that call the module being tested).
Process:
- Test the lowest-level modules using drivers
- Combine tested modules into larger clusters
- Replace drivers with actual higher-level modules
- Continue until the full system is integrated
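A driver is the mirror image of a stub: a throwaway harness that calls the module under test the way the future higher-level code will. A sketch with a hypothetical low-level module:

```python
# Bottom-up: the low-level TaxCalculator is tested first, via a driver
# standing in for the not-yet-written checkout module above it.

class TaxCalculator:
    RATE = 0.08

    def tax_for(self, subtotal):
        return round(subtotal * self.RATE, 2)

def driver():
    """Driver: exercises the module the way checkout code eventually will."""
    calc = TaxCalculator()
    return [calc.tax_for(amount) for amount in (10.00, 99.99, 0.0)]

assert driver() == [0.8, 8.0, 0.0]
```

Once the real checkout module exists, it replaces the driver — step 3 of the process above.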
Advantages:
- No stubs needed — low-level modules are tested with real functionality
- Defects in foundational components are found early
- Easier to observe test results (low-level output is concrete)
Disadvantages:
- The overall system is not visible until the final stages
- Requires writing drivers
- High-level design issues are discovered late
Best for: Systems where the foundation (data access, utilities, core logic) is most critical.
Sandwich (Hybrid) Integration
How it works: Combine Top-Down and Bottom-Up approaches. The system is divided into three layers: top, middle (target), and bottom. Top layers are integrated downward, bottom layers upward, and they meet at the target layer.
Advantages:
- Combines the benefits of both approaches
- Large systems can be tested in parallel teams
- Both high-level and low-level defects are found relatively early
Disadvantages:
- More complex to plan and coordinate
- The target (middle) layer may not be thoroughly tested
Best for: Large systems with clearly defined layers where parallel teams work on different components.
Component Integration vs System Integration
There are two distinct scopes of integration testing:
Component Integration Testing verifies interactions between components within the same system. Example: testing that the UserService correctly calls the UserRepository to save a user record. This is typically done by developers.
System Integration Testing verifies interactions between different systems or applications. Example: testing that your application correctly communicates with a third-party payment gateway. This is typically done by QA or a dedicated integration team.
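The component-integration example above can be sketched as follows, using SQLite so the repository talks to a real database. The schema and method names are assumptions for illustration:

```python
import sqlite3

# Component integration: UserService talking to a real UserRepository,
# backed here by an in-memory SQLite database.

class UserRepository:
    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)"
        )

    def save(self, email):
        cur = self.conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return cur.lastrowid

    def find(self, user_id):
        row = self.conn.execute(
            "SELECT email FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


class UserService:
    def __init__(self, repo):
        self.repo = repo

    def register(self, email):
        return self.repo.save(email.lower().strip())


def test_user_service_persists_via_repository():
    conn = sqlite3.connect(":memory:")
    repo = UserRepository(conn)
    service = UserService(repo)
    user_id = service.register("  Alice@Example.com ")
    # The normalized email survived the trip through the repository and DB:
    assert repo.find(user_id) == "alice@example.com"
```

Note the test checks what crossed the boundary (the normalized email round-tripped through the database), not the internals of either component.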
| Aspect | Component Integration | System Integration |
|---|---|---|
| Scope | Within one application | Between applications/systems |
| Who | Developers | QA / Integration team |
| Environment | Development/CI | Staging / Integration environment |
| Dependencies | Internal modules | External services, APIs, databases |
| Speed | Fast (seconds) | Slower (network, external systems) |
What to Test at the Integration Level
Focus integration tests on the boundaries — the points where components communicate:
- API contracts: Does the request format match what the receiver expects?
- Data transformations: Is data correctly converted when passing between components?
- Error propagation: When Service B fails, does Service A handle it gracefully?
- Authentication/Authorization: Do security tokens flow correctly between services?
- Database interactions: Do queries return expected data? Do transactions commit correctly?
- Message queues: Are messages published and consumed in the right format?
- File system operations: Are files written and read correctly across components?
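Error propagation is the boundary concern most often skipped, so here is a short sketch. The service names are hypothetical; the pattern is what matters — force the downstream component to fail and assert the upstream one degrades gracefully:

```python
# Error propagation at a boundary: when the downstream service raises,
# the upstream service must degrade gracefully rather than crash.

class PricingUnavailable(Exception):
    pass

class FlakyPricingService:
    def price(self, sku):
        raise PricingUnavailable(sku)  # simulate the downstream outage

class QuoteService:
    def __init__(self, pricing):
        self.pricing = pricing

    def quote(self, sku):
        try:
            return {"sku": sku, "price": self.pricing.price(sku)}
        except PricingUnavailable:
            return {"sku": sku, "error": "pricing_unavailable"}

def test_pricing_failure_is_handled():
    quotes = QuoteService(FlakyPricingService())
    assert quotes.quote("sku-7") == {"sku": "sku-7", "error": "pricing_unavailable"}
```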
Exercise: Design Integration Tests for a Microservices System
Consider this microservices architecture for a food delivery platform:
User App → API Gateway → Order Service
Order Service → Restaurant Service
Order Service → Payment Service → Stripe API
Order Service → Delivery Service → Maps API
Order Service → Notification Service → Email/SMS Provider
The Order Service:
- Receives orders from the API Gateway
- Queries Restaurant Service for menu availability
- Calls Payment Service to charge the user
- Notifies Delivery Service to assign a driver
- Triggers Notification Service to send confirmation
Design integration tests for the following interactions. For each, specify: what you test, expected behavior, and error scenarios.
- Order Service ↔ Restaurant Service
- Order Service ↔ Payment Service
- Order Service ↔ Delivery Service
- API Gateway ↔ Order Service
Hint
For each interaction, think about three categories: happy path (everything works), error handling (what happens when the other service fails), and data validation (are requests and responses in the correct format).
Solution
1. Order Service ↔ Restaurant Service:
- Happy path: Order Service requests menu item #42 → Restaurant Service confirms available with price $12.99 → Order Service uses correct price
- Item unavailable: Order Service requests unavailable item → Restaurant Service returns 404 → Order Service returns “Item unavailable” to user
- Restaurant closed: Order Service sends order at 3 AM → Restaurant Service returns “closed” status → Order Service prevents order
- Data validation: Verify menu item IDs, prices, and availability flags are correctly serialized/deserialized
- Timeout: Restaurant Service takes >5 seconds → Order Service returns timeout error, does not create order
2. Order Service ↔ Payment Service:
- Happy path: Order total $25.99 → Payment Service charges Stripe $25.99 → returns transaction ID → Order Service stores transaction ID
- Payment declined: Card declined → Payment Service returns decline reason → Order Service shows “Payment failed” and does not create order
- Amount mismatch test: Verify the amount sent to Payment Service exactly matches the order total (including tax, delivery fee)
- Idempotency: Same order submitted twice → Payment Service charges only once (idempotency key)
- Refund flow: Order cancelled → Order Service calls Payment refund endpoint → verify refund amount matches
3. Order Service ↔ Delivery Service:
- Happy path: Order confirmed → Delivery Service receives restaurant address + user address → assigns nearest driver → returns estimated time
- No drivers available: Delivery Service returns “no drivers” → Order Service notifies user of delay
- Address validation: Invalid delivery address → Delivery Service returns address error → Order Service asks user to correct address
- Data format: Verify GPS coordinates, addresses, and ETAs are in expected formats
4. API Gateway ↔ Order Service:
- Authentication: Request without valid JWT token → API Gateway returns 401, never reaches Order Service
- Rate limiting: 100+ requests/minute from same user → API Gateway throttles, returns 429
- Request routing: POST /orders reaches Order Service, GET /orders/{id} reaches Order Service, invalid routes return 404
- Request/Response transformation: Verify API Gateway correctly forwards headers, body, and query parameters
Integration Testing Best Practices
Use Contract Testing for Microservices
In a microservices architecture, services are often developed by different teams. Contract testing (using tools like Pact) ensures that the provider and consumer agree on the API contract without needing both services running simultaneously.
The consumer writes a contract: “I will send this request and expect this response.” The provider verifies it can fulfill that contract. If either side changes, the contract test fails.
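Real projects would use a tool like Pact for this; as a minimal hand-rolled sketch of the idea (all names hypothetical), the consumer records its expectation as data and the provider's handler is verified against it, with no network involved:

```python
# Consumer side: "I will POST this body and expect these keys back."
CONTRACT = {
    "request": {"method": "POST", "path": "/orders", "body": {"sku": "sku-1", "qty": 2}},
    "response_keys": {"order_id", "status"},
}

# Provider side: the (hypothetical) real handler for POST /orders.
def create_order_handler(body):
    return {"order_id": 1001, "status": "confirmed"}

def verify_contract(contract, handler):
    # Provider verification: run the real handler against the consumer's
    # recorded request and check the response satisfies the expectation.
    response = handler(contract["request"]["body"])
    return contract["response_keys"] <= set(response)

assert verify_contract(CONTRACT, create_order_handler)
```

If the provider renames `order_id`, `verify_contract` fails on the provider's build — before the consumer ever sees a broken response in production.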
Test Database Interactions Properly
Integration tests that involve databases should:
- Use a real database (not an in-memory fake) for realistic behavior
- Run each test in a transaction that rolls back after the test completes
- Use test-specific database schemas or containers (Docker) to avoid conflicts
- Seed necessary reference data before tests
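The transaction-rollback pattern can be sketched like this with SQLite from the standard library (in a pytest suite this would typically be a fixture):

```python
import sqlite3

# Run each test inside a transaction that always rolls back, so every
# test starts from a clean database.

def run_in_rolled_back_transaction(conn, test_fn):
    conn.execute("BEGIN")
    try:
        test_fn(conn)
    finally:
        conn.rollback()  # undo everything the test wrote

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")

def a_test(c):
    c.execute("INSERT INTO orders (total) VALUES (25.99)")
    assert c.execute("SELECT COUNT(*) FROM orders").fetchone()[0] == 1

run_in_rolled_back_transaction(conn, a_test)
# After rollback the table is empty again, ready for the next test:
assert conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0] == 0
```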
Handle External Service Dependencies
When your system depends on external APIs (Stripe, SendGrid, Google Maps), you have three options:
- Test against sandbox/staging environments — most realistic but slow and sometimes unreliable
- Use contract tests — verify your code conforms to the API spec without calling it
- Use WireMock or similar tools — mock the external API with recorded responses
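Option 3 can be sketched with the standard library alone. Here a hypothetical payment client's HTTP helper (`_post`) is replaced with a recorded response using `unittest.mock`, so the test never touches the real gateway:

```python
from unittest import mock

class StripeClient:
    # Hypothetical wrapper around the payment gateway's HTTP API.
    def charge(self, amount_cents, token):
        return self._post("/v1/charges", {"amount": amount_cents, "source": token})

    def _post(self, path, data):
        raise RuntimeError("real network call; never wanted in tests")

def test_charge_uses_recorded_response():
    recorded = {"id": "ch_123", "status": "succeeded"}  # previously recorded reply
    client = StripeClient()
    with mock.patch.object(client, "_post", return_value=recorded) as post:
        result = client.charge(2599, "tok_visa")
    assert result["status"] == "succeeded"
    # Also verify the request our code would have sent over the wire:
    post.assert_called_once_with("/v1/charges", {"amount": 2599, "source": "tok_visa"})
```

Tools like WireMock do the same thing one level down, at the HTTP socket, which also exercises your serialization code.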
Pro Tips
Tip 1: Integration tests should be deterministic. If a test passes and fails randomly, it erodes trust. Use fixed test data, control timing, and isolate from other tests.
Tip 2: Name integration tests by the interaction, not the component. Instead of testOrderService, use test_order_service_creates_payment_when_order_confirmed. The name should tell you what interaction is being verified.
Tip 3: Keep integration test environments clean. Use database transactions, container isolation, or cleanup scripts. Leftover data from previous test runs is the number one cause of flaky integration tests.
Key Takeaways
- Integration testing verifies that components work correctly when combined
- Four approaches: Big Bang (all at once), Top-Down (high to low), Bottom-Up (low to high), Sandwich (both)
- Component integration tests interactions within one app; system integration tests between apps
- Focus tests on boundaries: API contracts, data transformations, error handling
- Contract testing is essential for microservices architectures
- Deterministic test data and clean environments prevent flaky tests