Mocks, Stubs, and Fakes in Tests

Intermediate 12 min read

What you'll learn

✓ The difference between mocks, stubs, fakes, and spies
✓ When mocking helps, and when it ruins your test suite
✓ The seam principle: mock at boundaries, not internals
✓ How to mock in Python with unittest.mock
✓ How to mock in JavaScript with vi.mock (Vitest) or jest.mock

Prerequisites

• pytest basics or any unit-testing framework you already use
• Comfort writing functions and importing modules

The word “mock” has been stretched until it covers anything that pretends to be something else in a test. That blurriness causes real problems: people mock when they should stub, stub when they should fake, and over-mock until their tests fail every time the code changes shape.

This post sorts out the vocabulary, then gets practical: where to actually use these things, where not to, and what it looks like in Python and JavaScript.

The vocabulary

Gerard Meszaros's book xUnit Test Patterns gave us the canonical names. They all describe test doubles: stand-in objects used in place of the real thing.

Dummy. An object passed around but never used. Fills a parameter list.
Stub. Returns canned values. “If you call get_user(1), return this fake user.”
Fake. A working but simplified implementation. An in-memory database that stores rows in a dict.
Spy. A real (or stubbed) object that also records how it was called.
Mock. A stand-in pre-programmed with expectations about how it will be called. Fails the test if the calls don’t match.

In casual conversation people say “mock” for all of them. In code, the distinction that matters most often is stub vs mock:

A stub lets the test pass canned data into the code under test. The test asserts on the output.
A mock asserts on interactions — “was this method called, with these arguments, this many times?”

Stubs support tests of outputs. Mocks test interactions, meaning the side effects your code triggers. Choosing the wrong one is a common source of brittle tests.
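To make the vocabulary concrete, here is a minimal hand-written sketch of a stub, a fake, and a spy, with no mock library involved. Every class and function name in it (StubUserStore, FakeUserStore, SpyMailer, greeting_for) is hypothetical and exists only for illustration.

```python
class StubUserStore:
    """Stub: returns a canned value and does nothing else."""

    def get_user(self, user_id):
        return {"id": user_id, "name": "Ada"}


class FakeUserStore:
    """Fake: a working but simplified implementation backed by a dict."""

    def __init__(self):
        self._rows = {}

    def save_user(self, user):
        self._rows[user["id"]] = user

    def get_user(self, user_id):
        return self._rows.get(user_id)


class SpyMailer:
    """Spy: records every call so the test can inspect it afterwards."""

    def __init__(self):
        self.calls = []

    def send(self, to, body):
        self.calls.append((to, body))


def greeting_for(user_id, store):
    """Hypothetical code under test: reads a user and builds a greeting."""
    user = store.get_user(user_id)
    return f"Hello, {user['name']}!"


# Stub-style check: feed canned data in, assert on the output.
stub_greeting = greeting_for(1, StubUserStore())

# Fake-style check: state written earlier is visible to later reads.
fake_store = FakeUserStore()
fake_store.save_user({"id": 2, "name": "Grace"})
fake_greeting = greeting_for(2, fake_store)

# Spy-style check: inspect the recorded calls after the fact.
spy = SpyMailer()
spy.send("ada@example.com", "Welcome!")
```

A mock in the strict sense would go one step further than the spy and fail the test by itself when the recorded calls do not match its expectations.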

When to mock

The rule that matters: mock at the boundary.

Boundaries are anywhere your code talks to something you don’t own or don’t want to invoke during tests:

HTTP APIs (third-party services)
Databases (sometimes — fakes are often better)
Filesystem
Time (datetime.now())
Randomness
Email, SMS, payment providers

Inside a single module, with pure functions calling pure functions, there’s nothing to mock. Just call them. Tests that mock internal collaborators are coupled to your code’s current shape — they break on refactors that don’t change behaviour.

When not to mock

A non-exhaustive list of “you probably shouldn’t mock this”:

Your own pure functions. Just call them. They’re fast.
Standard library data structures. Don’t mock a list.
The thing you’re actually testing. Surprisingly common — people mock the function under test and wonder why the test passes regardless of behaviour.
Every collaborator. If every line of your test is when(x).thenReturn(...), the test isn’t testing logic — it’s restating the implementation.

A test suite where every test is 90% setup and 10% assertion is a sign of over-mocking. The fix is usually structural: pull side effects to the edges and make the core logic pure (the functional core, imperative shell pattern).
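As a rough sketch of the functional core, imperative shell idea, the example below keeps the decision logic in a pure function and pushes the clock and the notification to a thin outer function. The trial-expiry scenario and the names is_trial_expired, check_trial, notify, and clock are assumptions made up for this illustration.

```python
from datetime import datetime, timedelta, timezone


def is_trial_expired(started_at, now, trial_days=14):
    """Functional core: pure logic, no clock access, no I/O."""
    return now - started_at > timedelta(days=trial_days)


def check_trial(account, notify, clock=lambda: datetime.now(timezone.utc)):
    """Imperative shell: reads the clock, asks the core, triggers the side effect."""
    if is_trial_expired(account["started_at"], clock()):
        notify(account["email"], "Your trial has ended.")
        return True
    return False


# The core needs no test doubles at all: plain values in, plain value out.
start = datetime(2026, 1, 1, tzinfo=timezone.utc)
expired_after_15_days = is_trial_expired(start, start + timedelta(days=15))
expired_after_3_days = is_trial_expired(start, start + timedelta(days=3))

# The shell needs only two tiny stand-ins, both at real boundaries.
sent = []
shell_result = check_trial(
    {"started_at": start, "email": "ada@example.com"},
    notify=lambda to, body: sent.append((to, body)),
    clock=lambda: start + timedelta(days=20),
)
```

Most of the test effort can then go into the pure function, where setup is close to zero.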

The seam principle

A seam is a place where you can change behaviour without editing the code on either side. Good seams make mocking easy. Bad seams force you to mock things you shouldn’t have to.

The simplest seam is a function parameter:

# Hard to test — sends a real email
def confirm_signup(user):
    send_email(user.email, "Welcome!")

# Easy to test — pass the sender in
def confirm_signup(user, send=send_email):
    send(user.email, "Welcome!")

Now the test can pass a fake send and assert it was called correctly. No patching, no monkey-patching, no mock library required.
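A test using that seam might look like the sketch below. The send_email stand-in and the SimpleNamespace user are assumptions added so the snippet runs on its own; in a real project they would be your actual sender and user model.

```python
from types import SimpleNamespace


def send_email(to, body):
    """Stand-in for the real sender (assumed); tests should never reach it."""
    raise RuntimeError("real email sender called from a test")


def confirm_signup(user, send=send_email):
    send(user.email, "Welcome!")


def test_confirm_signup_sends_welcome():
    sent = []

    def fake_send(to, body):
        # Record the call instead of sending anything.
        sent.append((to, body))

    user = SimpleNamespace(email="ada@example.com")
    confirm_signup(user, send=fake_send)

    assert sent == [("ada@example.com", "Welcome!")]
```

Because the default argument still points at the real sender, production callers do not have to change.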

Dependency injection — passing collaborators in instead of importing them — is the single biggest leverage point for testable code. It also makes the seams obvious.

Mocking in Python

The standard tool is unittest.mock, included with the Python standard library.

A stub returning a canned value:

from unittest.mock import MagicMock

fake_db = MagicMock()
fake_db.get_user.return_value = {"id": 1, "name": "Ada"}

# In code under test:
user = fake_db.get_user(1)
assert user["name"] == "Ada"

MagicMock will happily return mock objects for any attribute or method you call on it. That’s convenient and a footgun — typos silently “work.”
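One way to blunt that footgun is the spec argument, which restricts a mock to the attributes of a real class. The sketch below assumes a hypothetical UserStore class standing in for whatever interface your code depends on.

```python
from unittest.mock import MagicMock


class UserStore:
    """Hypothetical interface the code under test depends on."""

    def get_user(self, user_id):
        raise NotImplementedError


# Without a spec, a misspelled method name is accepted silently.
loose = MagicMock()
loose_result = loose.get_usr(1)  # typo, but no error: returns another MagicMock

# With a spec, only attributes that exist on UserStore are allowed.
strict = MagicMock(spec=UserStore)
strict.get_user.return_value = {"id": 1, "name": "Ada"}

try:
    strict.get_usr(1)  # same typo
except AttributeError:
    typo_caught = True
else:
    typo_caught = False
```

unittest.mock also offers create_autospec, which additionally checks call signatures; whether the extra strictness is worth it depends on the codebase.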

A mock asserting interactions:

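from types import SimpleNamespace

user = SimpleNamespace(email="ada@example.com")  # any object with .email works
fake_email = MagicMock()
confirm_signup(user, send=fake_email)  # the injectable version from above

fake_email.assert_called_once_with("ada@example.com", "Welcome!")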

assert_called_once_with fails the test unless the mock was called exactly once with exactly those arguments.
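The "exactly once" part is easy to overlook. The short sketch below, using a bare MagicMock as a stand-in sender, shows the same assertion passing after one call and raising AssertionError after a second, identical call.

```python
from unittest.mock import MagicMock

send = MagicMock()

send("ada@example.com", "Welcome!")
# One matching call: the assertion passes silently.
send.assert_called_once_with("ada@example.com", "Welcome!")

send("ada@example.com", "Welcome!")
# Two calls, even with identical arguments: the assertion now fails.
try:
    send.assert_called_once_with("ada@example.com", "Welcome!")
except AssertionError:
    duplicate_detected = True
else:
    duplicate_detected = False
```

If duplicates are acceptable, assert_called_with checks only the most recent call, and assert_any_call checks whether any call matched.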

Patching imports

When you can’t (or don’t want to) inject a dependency, patch replaces an imported name inside a module:

from unittest.mock import patch

with patch("myapp.signup.send_email") as fake:
    confirm_signup(user)
    fake.assert_called_once()

Crucial detail: patch the name where it’s used, not where it’s defined. If myapp/signup.py does from myapp.emailer import send_email, you patch myapp.signup.send_email, not myapp.emailer.send_email, because signup holds its own reference to the function once the import has run. This trips up everyone at least once.
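The difference is easier to believe when you see both targets side by side. The sketch below builds two throwaway modules in memory, demo_emailer and demo_signup (both hypothetical names, created with exec only so the example fits in one file), and patches each target in turn.

```python
import sys
import types
from unittest.mock import patch

# Module that defines the function.
emailer = types.ModuleType("demo_emailer")
exec("def send_email(to, body):\n    return 'real'\n", emailer.__dict__)
sys.modules["demo_emailer"] = emailer

# Module that imports the function by name and uses it.
signup = types.ModuleType("demo_signup")
sys.modules["demo_signup"] = signup
exec(
    "from demo_emailer import send_email\n"
    "def confirm_signup(email):\n"
    "    return send_email(email, 'Welcome!')\n",
    signup.__dict__,
)

# Wrong target: demo_signup keeps its own reference, so the real function runs.
with patch("demo_emailer.send_email", return_value="patched"):
    wrong_target = signup.confirm_signup("ada@example.com")

# Right target: replace the name inside the module that looks it up.
with patch("demo_signup.send_email", return_value="patched"):
    right_target = signup.confirm_signup("ada@example.com")
```

Here wrong_target ends up as 'real' and right_target as 'patched'. Had demo_signup used import demo_emailer and called demo_emailer.send_email(...), the first target would have worked, because the lookup would go through the defining module each time.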

Patching with pytest

Combined with pytest, monkeypatch is the idiomatic version:

def test_signup_sends_email(monkeypatch):
    calls = []
    monkeypatch.setattr(
        "myapp.signup.send_email",
        lambda to, body: calls.append((to, body)),
    )

    confirm_signup(user)

    assert calls == [("ada@example.com", "Welcome!")]

This is a hand-rolled spy. Often clearer than reaching for MagicMock.

Mocking in JavaScript

Vitest, whose API closely mirrors Jest’s, is a common modern choice. vi.mock replaces a module; vi.fn creates a mock function.

A stub:

import { vi, test, expect } from 'vitest';
import { confirmSignup } from './signup';

vi.mock('./db', () => ({
  getUser: vi.fn().mockReturnValue({ id: 1, name: 'Ada' }),
}));

test('confirms a known user', () => {
  const result = confirmSignup(1);
  expect(result.name).toBe('Ada');
});

A mock asserting interactions:

import { vi, test, expect } from 'vitest';
import { confirmSignup } from './signup';

const sendEmail = vi.fn();

test('sends welcome email', () => {
  confirmSignup({ email: 'ada@example.com' }, sendEmail);
  expect(sendEmail).toHaveBeenCalledTimes(1);
  expect(sendEmail).toHaveBeenCalledWith('ada@example.com', 'Welcome!');
});

Same pattern as Python: a stub feeds data in; a mock checks that a call happened.

Mocking time and randomness

Time and randomness are two boundaries every codebase eventually needs to mock. In JavaScript, Vitest’s fake timers take control of the clock:

import { vi, beforeEach, afterEach, test, expect } from 'vitest';

beforeEach(() => vi.useFakeTimers());
afterEach(() => vi.useRealTimers());

test('expires after 24h', () => {
  vi.setSystemTime(new Date('2026-01-01T00:00:00Z'));
  const token = makeToken();
  vi.advanceTimersByTime(25 * 60 * 60 * 1000);
  expect(isExpired(token)).toBe(true);
});

In Python, the third-party freezegun package plays the same role:

from freezegun import freeze_time

@freeze_time("2026-01-01")
def test_expiry():
    ...

If you find yourself sprinkling time.sleep(1) or “flaky” markers, mock time instead.

Try it yourself. Take a function you’ve written that calls datetime.now() or Math.random() directly. Refactor it so the time/random source is a parameter with a default. Write a test that passes a fixed value. You’ve just made an untestable function testable, without touching any mock library. That’s the seam principle in action.
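One possible shape for that exercise in Python is sketched below. The token scenario and the names make_token, is_expired, now, and rng are assumptions for illustration; the point is only that the clock and the random source arrive as parameters with sensible defaults.

```python
import random
from datetime import datetime, timedelta, timezone


def make_token(now=None, rng=None):
    """Builds a token; the clock and random source can be injected."""
    if now is None:
        now = datetime.now(timezone.utc)
    if rng is None:
        rng = random.Random()
    return {
        "value": f"{rng.getrandbits(32):08x}",
        "expires_at": now + timedelta(hours=24),
    }


def is_expired(token, now=None):
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= token["expires_at"]


# In a test, pass fixed values instead of patching anything.
fixed_now = datetime(2026, 1, 1, tzinfo=timezone.utc)
token = make_token(now=fixed_now, rng=random.Random(42))
same_seed_token = make_token(now=fixed_now, rng=random.Random(42))

still_valid = is_expired(token, now=fixed_now + timedelta(hours=23))
expired = is_expired(token, now=fixed_now + timedelta(hours=25))
```

A seeded random.Random makes the "random" value repeatable, so two tokens built from the same seed carry the same value.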

Fakes vs mocks for databases

A common dilemma. Should tests use a mocked database client, or a fake in-memory implementation?

Mocks are fast and isolated, but every test re-encodes assumptions about the database’s API. A schema change can break dozens of tests that didn’t actually exercise the broken behaviour.
Fakes (an in-memory dict that pretends to be your DAL, or SQLite in place of Postgres) behave more like the real thing, so tests survive refactors better. The downside is that you need to write or pick the fake.
The real database in a Docker container is the gold standard for integration tests. Slower, but truest. Many teams keep unit tests fast with fakes and run a smaller integration suite against a real DB.

The rule of thumb: if you only need to verify “did my code call the DB with the right query?”, a mock works. If you need to verify “does my code’s logic actually produce the right result?”, a fake or real DB is better.
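The two styles can be compared on the same function. In the sketch below, InMemoryUserRepo and register_user are hypothetical: the first is a small dict-backed fake, the second is the code under test.

```python
from unittest.mock import MagicMock


class InMemoryUserRepo:
    """Fake repository: stores rows in a dict instead of a database."""

    def __init__(self):
        self._rows = {}
        self._next_id = 1

    def add(self, name):
        user_id = self._next_id
        self._rows[user_id] = {"id": user_id, "name": name}
        self._next_id += 1
        return user_id

    def get(self, user_id):
        return self._rows.get(user_id)


def register_user(repo, name):
    """Code under test: normalises the name, then stores it."""
    return repo.add(name.strip().title())


# Mock style: verifies the interaction with the repository.
mock_repo = MagicMock()
register_user(mock_repo, "  ada lovelace ")
mock_repo.add.assert_called_once_with("Ada Lovelace")

# Fake style: verifies the resulting state through the public interface.
fake_repo = InMemoryUserRepo()
new_id = register_user(fake_repo, "  ada lovelace ")
stored = fake_repo.get(new_id)
```

If register_user later switched from add to some other repository method, the mock-style test would need rewriting, while the fake-style test would only need the fake to support the new method.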

Over-mocking: the smell

Signs your test suite has gone too far:

Tests fail when you refactor without changing behaviour.
The mock setup is longer than the production function.
Tests pass even when you break the production code (because the mocks just confirm calls they were told to expect).
You catch bugs in production that no test caught, despite high “coverage.”

The underlying problem is usually that tests are validating implementation details (exact calls in exact orders) rather than behaviour. The fix is to push tests up a level: test through a public interface, mock only the outermost boundaries, and let the internal collaborators run for real.
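The third symptom, tests that pass while the code is broken, is worth seeing once. The sketch below uses a hypothetical total_with_tax function with a deliberate bug and a mocked tax_service; both names are made up for this illustration.

```python
from unittest.mock import MagicMock


def total_with_tax(price, tax_service):
    # Deliberate bug for the demonstration: tax is subtracted, not added.
    return price - tax_service.tax_for(price)


tax_service = MagicMock()
tax_service.tax_for.return_value = 20

result = total_with_tax(100, tax_service)

# Interaction-only check: passes, even though the result is wrong.
tax_service.tax_for.assert_called_once_with(100)

# Behaviour check: comparing the output with the expected total exposes the bug.
bug_exposed = result != 120
```

The interaction assertion is not wrong, it just answers a narrower question than "is the total correct?".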

A short checklist

Before writing a mock, ask:

Is this a real boundary? (Network, disk, time, randomness, external service.) If yes, mock or fake it.
Could I inject this instead of patching? Injection is almost always cleaner.
Am I testing output or interaction? Output → stub. Interaction → mock.
Would a fake serve me better than a mock? Often yes for stateful collaborators like databases or caches.
Will this test survive a reasonable refactor? If not, rethink it.

Recap

You now know:

Stubs return canned data; mocks assert on interactions; fakes are working simplified implementations; spies record calls
Mock at boundaries (network, disk, time, external services), not at internal collaborators
Inject dependencies to create seams, so tests rarely need to patch imports
unittest.mock and monkeypatch in Python; vi.mock / jest.mock in JS
Over-mocking produces tests that pass while the system breaks; aim to test behaviour, not implementation

Next steps

Pair this with a habit of keeping your business logic pure and your I/O at the edges. Tests get faster, simpler, and more durable.

Questions or feedback? Email codeloomdevv@gmail.com.