Integration Tests vs Unit Tests: The Right Mix

Intermediate 11 min read

What you'll learn

✓ The honest difference between unit and integration tests
✓ Why the testing pyramid needs a healthier middle layer
✓ How to manage fixtures and shared setup across tests
✓ When to use a real database instead of a stub
✓ How to structure integration tests in pytest and vitest

Prerequisites

• A grasp of [what testing is](/blog/what-is-testing)
• Comfort with [pytest basics](/blog/pytest-basics) or [vitest basics](/blog/vitest-basics)
• Awareness of [mocks and stubs](/blog/testing-mocks-and-stubs)

The testing pyramid as it was originally drawn says you should have many unit tests, fewer integration tests, and very few end-to-end tests. That advice is broadly right, but the most common failure mode in real projects is not too few unit tests; it is too many unit tests that pass against heavily mocked dependencies while the integrated system breaks. This article argues for a thicker middle layer and shows how to build it cleanly in two ecosystems.

What actually separates the two

A unit test exercises a single function, class, or module in isolation. Anything outside that unit, including the database, the network, the clock, and the file system, is replaced by a stand-in. Unit tests are extremely fast and pinpoint failures precisely, but they only verify that the unit honours the contract your stand-ins encode, and that contract may not match what the real dependency does.

An integration test exercises two or more real components together. The combinations vary. A service plus its repository plus a real database is an integration test. A function plus the real HTTP client talking to a fake server is an integration test. The common thread is that the assertions pass only if the real components agree on the contract.
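To make the distinction concrete, here is a minimal sketch. The names StubUserRepository, SqlUserRepository, and register are hypothetical, and the standard library's sqlite3 module stands in for a real database only so the example stays self-contained.

```python
import sqlite3


class StubUserRepository:
    """Hypothetical stand-in: accepts anything, enforces nothing."""

    def __init__(self):
        self.emails = []

    def add(self, email):
        self.emails.append(email)


class SqlUserRepository:
    """Hypothetical real repository backed by a table with a UNIQUE constraint."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE users (email TEXT NOT NULL UNIQUE)")

    def add(self, email):
        self.conn.execute("INSERT INTO users (email) VALUES (?)", (email,))


def register(repo, email):
    """Unit under test: normalises the address, then stores it."""
    repo.add(email.strip().lower())


def unit_test_with_stub():
    # Passes quietly: the stub never objects to a duplicate address.
    repo = StubUserRepository()
    register(repo, "A@example.com")
    register(repo, "a@example.com")
    return repo.emails


def integration_test_with_database():
    # The real table rejects the duplicate, exposing a rule the stub hid.
    repo = SqlUserRepository(sqlite3.connect(":memory:"))
    register(repo, "A@example.com")
    try:
        register(repo, "a@example.com")
    except sqlite3.IntegrityError:
        return "duplicate rejected"
    return "duplicate accepted"
```

The stub-backed check succeeds with two identical rows, while the database-backed one surfaces the uniqueness rule that only the real component knows about.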

The pyramid, revisited

The strict pyramid encourages teams to mock anything that crosses a process boundary. The unintended consequence is that the most error-prone seams in the application, the ones between modules and external systems, are exactly the ones that have no real test coverage. Teams discover the mismatch only in staging or production.

A healthier model is sometimes called the testing trophy: a base of static analysis and types, a generous belt of unit tests for pure logic, a hefty layer of integration tests for everything that touches a boundary, and a small cap of end-to-end tests for the most critical user journeys.

Fixtures and shared setup

Integration tests are heavier than unit tests, so investing in fixture design pays off quickly.

In pytest, fixtures are functions decorated with @pytest.fixture. Their scope controls how often they run.

import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from app.db import Base
from app.repositories import UserRepository

@pytest.fixture(scope="session")
def engine():
    eng = create_engine("postgresql://test@localhost/testdb")
    Base.metadata.create_all(eng)
    yield eng
    Base.metadata.drop_all(eng)

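@pytest.fixture()
def db_session(engine):
    # Run each test inside an outer transaction that is never committed.
    connection = engine.connect()
    transaction = connection.begin()
    # Assumes SQLAlchemy 2.0+: the session joins that transaction through a
    # SAVEPOINT, so a commit inside the code under test cannot escape it.
    Session = sessionmaker(bind=connection, join_transaction_mode="create_savepoint")
    session = Session()
    yield session
    session.close()
    transaction.rollback()
    connection.close()

def test_create_and_fetch_user(db_session):
    repo = UserRepository(db_session)
    user = repo.create(email="a@example.com")
    fetched = repo.get(user.id)
    assert fetched.email == "a@example.com"

The session-scoped engine fixture creates the schema once per test run. The function-scoped db_session runs each test inside an outer transaction that is rolled back at the end, with the session's own work confined to savepoints inside it, so every test sees a clean state without paying for a full schema rebuild. Binding the session to an explicit connection matters: if the session owned the transaction itself, a commit inside the repository would persist data and leak it into later tests.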

In vitest, the equivalents are beforeAll and beforeEach along with module-level helpers.

import { afterAll, beforeAll, beforeEach, describe, expect, it } from "vitest";
import { Pool } from "pg";
import { UserRepository } from "../src/repositories/userRepository";

const pool = new Pool({ connectionString: process.env.TEST_DB_URL });

beforeAll(async () => {
  await pool.query(`CREATE TABLE IF NOT EXISTS users (
    id serial PRIMARY KEY,
    email text NOT NULL UNIQUE
  )`);
});

afterAll(async () => {
  await pool.end();
});

beforeEach(async () => {
  await pool.query("TRUNCATE users RESTART IDENTITY");
});

describe("UserRepository", () => {
  it("creates and fetches a user", async () => {
    const repo = new UserRepository(pool);
    const created = await repo.create("a@example.com");
    const fetched = await repo.findById(created.id);
    expect(fetched?.email).toBe("a@example.com");
  });
});

Real databases versus stubs

The hardest decision in integration testing is whether to use a real database or a stand-in. Each choice has real trade-offs.

A real database, ideally the same engine you run in production, gives the highest signal. The queries you ship are the queries that pass. Migrations, indexes, constraints, transactions, and SQL syntax all get exercised. The cost is slower tests and a more involved local setup.

The cheap alternative is a lightweight embedded engine such as SQLite, which can run entirely in memory. It is fast to spin up, but its behaviour diverges from Postgres in ways that matter, including type coercion, transaction isolation, and full-text search features. Use SQLite only when you can guarantee your application sticks to the common subset.
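The type coercion difference is easy to see with the standard library's sqlite3 module. In this small sketch the orders table and the function name are hypothetical: an ordinary SQLite table stores a non-numeric string in an INTEGER column, whereas Postgres would reject the same insert with a type error.

```python
import sqlite3


def sqlite_accepts_text_in_integer_column():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (quantity INTEGER)")
    # SQLite's type affinity keeps a non-numeric string as TEXT instead of failing.
    conn.execute("INSERT INTO orders (quantity) VALUES (?)", ("three",))
    value, stored_type = conn.execute(
        "SELECT quantity, typeof(quantity) FROM orders"
    ).fetchone()
    conn.close()
    return value, stored_type
```

A test suite running on SQLite would therefore pass code that writes bad data, and the bug would only appear against the production engine.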

The middle ground is containerised databases via Docker Compose or testcontainers. You get the production engine with reasonable startup times and clean isolation between runs.

Mocks are appropriate for unit tests of the layer above the repository, but they should not replace integration tests of the repository itself.
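As a rough illustration of where a mock does fit, the sketch below unit tests a hypothetical SignupService that sits above the repository. The service and its validation rule are assumptions made for the example; only unittest.mock from the standard library is used.

```python
from unittest.mock import Mock


class SignupService:
    """Hypothetical service layer that sits above the repository."""

    def __init__(self, repo):
        self.repo = repo

    def signup(self, email):
        if "@" not in email:
            raise ValueError("invalid email")
        return self.repo.create(email=email.lower())


def test_signup_lowercases_email():
    repo = Mock()
    repo.create.return_value = {"id": 1, "email": "a@example.com"}
    service = SignupService(repo)

    result = service.signup("A@Example.com")

    # Only the service's own logic is verified; the repository is never exercised.
    repo.create.assert_called_once_with(email="a@example.com")
    assert result["id"] == 1


def test_signup_rejects_bad_email():
    repo = Mock()
    service = SignupService(repo)
    try:
        service.signup("not-an-email")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    # Validation failed early, so the repository must not have been touched.
    repo.create.assert_not_called()
```

These tests say nothing about whether the repository's SQL is correct, which is why the repository still needs its own integration tests.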

Choosing what to integration test

Aim integration coverage at the seams of your system. The places that earn the most attention are:

• Boundary code that translates between your domain and an external schema, including HTTP handlers, gRPC services, and database repositories.
• Anywhere a contract is enforced by something other than the type system, such as SQL constraints, authorisation policies, and serialisation formats.
• Cross-module workflows where two components must agree on an event payload or a queue message format.
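The last kind of seam can be sketched in a few lines. The producer, consumer, and event shape below are hypothetical, and an in-process queue.Queue stands in for a real message broker; the point is that the test runs both real sides against each other instead of asserting on a hand-written payload.

```python
import json
import queue


def publish_user_created(q, user_id, email):
    """Hypothetical producer: serialises the event and puts it on the queue."""
    q.put(json.dumps({"type": "user.created", "user_id": user_id, "email": email}))


def handle_next_event(q, welcome_list):
    """Hypothetical consumer: parses the payload and records who to welcome."""
    event = json.loads(q.get_nowait())
    if event["type"] == "user.created":
        welcome_list.append((event["user_id"], event["email"]))


def test_producer_and_consumer_agree_on_payload():
    q = queue.Queue()  # in-process stand-in for a real message broker
    welcomed = []
    publish_user_created(q, 42, "a@example.com")
    handle_next_event(q, welcomed)
    # Fails if either side renames a field or changes its type.
    assert welcomed == [(42, "a@example.com")]
```

If the producer later renames user_id, this test breaks immediately, whereas two separately mocked unit tests would both keep passing.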

For pure functions, stick with unit tests. Integration coverage for a string formatter is wasted compute.

Keeping integration tests fast enough

Slow tests get skipped. A few habits keep an integration suite usable.

Parallelise. Both pytest with pytest-xdist and vitest’s default runner support parallel workers; design your fixtures so that any shared resource, such as a database or schema, belongs to a single worker and is never shared between workers.
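One way to get per-worker isolation in pytest is to derive the database name from the worker_id fixture that pytest-xdist provides. The sketch below is an assumption-laden example: the naming scheme and the database_url fixture are hypothetical, and it presumes each per-worker database already exists or is created elsewhere.

```python
import pytest


def database_name_for(worker_id, base="testdb"):
    """Derive a per-worker database name (hypothetical naming scheme)."""
    # pytest-xdist reports "master" when tests are not being distributed.
    if worker_id == "master":
        return base
    return f"{base}_{worker_id}"


@pytest.fixture(scope="session")
def database_url(worker_id):
    # worker_id comes from pytest-xdist ("gw0", "gw1", ...); the fixture is
    # unavailable if the plugin is not installed.
    return f"postgresql://test@localhost/{database_name_for(worker_id)}"
```

Because every worker is a separate process, a session-scoped fixture like this runs once per worker, so each worker ends up with its own connection string.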

Roll back rather than reset. Wrapping each test in a transaction and rolling it back is typically much faster than truncating tables or dropping and rebuilding the schema.

Tag the slow ones. In pytest, mark expensive tests with @pytest.mark.slow, register the marker in your configuration, and deselect them locally with -m "not slow" so that they run only in CI. In vitest, place them under a *.slow.test.ts glob and gate the run by file pattern.
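For pytest, one possible way to wire up the slow marker is a small conftest.py. This sketch follows the skipping pattern described in the pytest documentation; the --runslow flag name is an assumption, and CI would need to pass it explicitly.

```python
# conftest.py
import pytest


def pytest_addoption(parser):
    # Hypothetical flag name; CI passes it, local runs normally do not.
    parser.addoption("--runslow", action="store_true", default=False,
                     help="also run tests marked slow")


def pytest_configure(config):
    # Registering the marker avoids unknown-marker warnings.
    config.addinivalue_line("markers", "slow: expensive integration test")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--runslow"):
        return  # flag given, so nothing is skipped
    skip_slow = pytest.mark.skip(reason="needs --runslow")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```

With this in place a plain pytest run reports slow tests as skipped, which keeps them visible instead of silently dropping them from the run.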

Mock the genuinely external. Real HTTP calls to third-party APIs do not belong in your integration suite. Use a recorded fake or a contract test instead.
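A simple fake can be built with nothing but the standard library. In the sketch below, the rates endpoint, the canned response, and the fetch_rate function are all hypothetical; the real HTTP client still makes a real request, but only to a local server the test controls.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeRatesHandler(BaseHTTPRequestHandler):
    """Hypothetical fake of a third-party API: always returns a canned response."""

    def do_GET(self):
        body = json.dumps({"base": "EUR", "rate": 1.25}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def fetch_rate(base_url):
    """Code under test: uses the real HTTP client against whatever URL it is given."""
    with urllib.request.urlopen(f"{base_url}/rates/EUR", timeout=5) as resp:
        return json.load(resp)["rate"]


def test_fetch_rate_against_fake_server():
    server = HTTPServer(("127.0.0.1", 0), FakeRatesHandler)  # port 0 picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        assert fetch_rate(f"http://127.0.0.1:{port}") == 1.25
    finally:
        server.shutdown()
        server.server_close()
```

The canned payload should be captured from the real API and refreshed occasionally; otherwise the fake can drift from the provider's actual contract.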

Wrap up

Unit tests verify your logic in isolation. Integration tests verify that the pieces actually fit together. A modern test suite leans more heavily on the integration layer than the original pyramid suggests, because that is where the real seams of the system live. Build careful fixtures, use a real database when you can, and reserve mocks and stubs for the units that genuinely benefit from them.