Mutation Testing Tutorial
What mutation testing is, why coverage metrics lie, and how to use tools like Stryker to measure how well your tests actually catch bugs.
What you'll learn
- Why line coverage is a weak signal
- How mutation testing works
- How to read a mutation score
- How to run Stryker on a JavaScript project
- When mutation testing is worth the cost
Prerequisites
- Some unit testing experience
Line coverage tells you which lines of code were executed by your tests. It does not tell you whether your tests would catch a bug in those lines. Mutation testing fills that gap by deliberately breaking your code and checking that your tests notice. This tutorial walks through the idea and a working setup.
What and Why
Mutation testing works by making small, automated changes to your source code, running the tests against the modified version, and recording whether any test fails. Each change is called a mutant. If a test fails, the mutant is killed, which is good. If all tests pass on the broken code, the mutant survives, which means your tests did not catch that change.
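To make the terms concrete, here is a minimal sketch of one mutant and the test that kills it. The function `isAdult`, its mutant, and the two suite helpers are hypothetical names invented for illustration; a real tool generates the mutant for you and runs your actual test runner.

```typescript
// Original code under test (hypothetical example).
function isAdult(age: number): boolean {
  return age >= 18;
}

// A mutant: the comparison `>=` has been changed to `>`.
function isAdultMutant(age: number): boolean {
  return age > 18;
}

type Impl = (age: number) => boolean;

// Stand-in for a test suite: returns true when every check passes.
function weakSuite(impl: Impl): boolean {
  return impl(30) === true && impl(5) === false;
}

// The same suite plus a check on the boundary value.
function strongSuite(impl: Impl): boolean {
  return weakSuite(impl) && impl(18) === true;
}

console.log(weakSuite(isAdultMutant));   // true  -> the mutant survives
console.log(strongSuite(isAdultMutant)); // false -> the mutant is killed
```

Both suites pass against the original `isAdult`. Only the one that checks the boundary value notices the change.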
The reason this matters is that coverage metrics are easy to game. A test that calls a function but does not assert anything contributes to line coverage and catches nothing. Mutation testing rewards meaningful assertions and exposes shallow tests for what they are.
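A small sketch of the assertion-free test described above, assuming a hypothetical `applyDiscount` function. The two "tests" are plain functions that return true when they pass, so the example runs without a test framework.

```typescript
// Hypothetical pricing function used only for illustration.
function applyDiscount(price: number, percent: number): number {
  return price - (price * percent) / 100;
}

// Mutant: the subtraction has been replaced with an addition.
function applyDiscountMutant(price: number, percent: number): number {
  return price + (price * percent) / 100;
}

type Discount = (price: number, percent: number) => number;

// Executes every line of the function, so line coverage is complete,
// but it never inspects the result. It only fails if the call throws.
function shallowTest(fn: Discount): boolean {
  try {
    fn(200, 10);
    return true;
  } catch {
    return false;
  }
}

// Checks the actual value, so a wrong result makes it fail.
function assertingTest(fn: Discount): boolean {
  return fn(200, 10) === 180;
}

console.log(shallowTest(applyDiscountMutant));   // true  -> bug goes unnoticed
console.log(assertingTest(applyDiscountMutant)); // false -> bug is caught
```

Both tests produce the same line coverage for `applyDiscount`. Only the second one kills the mutant.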
Mental Model
Think of your test suite as an alarm system and your code as a building. Coverage tells you which rooms have sensors. Mutation testing sends a controlled intruder into each room and checks whether the alarm actually goes off. A sensor that does not respond to the intruder is no better than no sensor at all.
The output is a mutation score, the percentage of mutants that were killed. A score of 100 means every mutant triggered at least one test failure. A score of 60 means 40 percent of your mutants slipped through, which is a far more honest measure of test quality than line coverage.
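The arithmetic behind the score can be sketched as follows. This is a simplification that assumes only two outcomes, killed and survived. Real tools, Stryker included, report further states such as timeouts and mutants with no test coverage. The names `MutationCounts` and `mutationScore` are made up for this example.

```typescript
// Simplified result counts; an assumption, not a real tool's report format.
interface MutationCounts {
  killed: number;
  survived: number;
}

// Percentage of mutants that caused at least one test to fail.
function mutationScore({ killed, survived }: MutationCounts): number {
  const total = killed + survived;
  if (total === 0) {
    return NaN; // no mutants were generated, so there is nothing to score
  }
  return (killed / total) * 100;
}

console.log(mutationScore({ killed: 60, survived: 40 })); // 60
```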
Hands-on Example
For JavaScript and TypeScript, Stryker is the standard tool. Install and configure it for a project that already has unit tests.
npm install -D @stryker-mutator/core @stryker-mutator/vitest-runner
npx stryker init
A minimal stryker.conf.json:
{
  "testRunner": "vitest",
  "mutate": ["src/**/*.ts"],
  "reporters": ["progress", "html"],
  "thresholds": { "high": 80, "low": 60, "break": 50 }
}
Run with npx stryker run. Stryker copies your project to a sandbox, generates mutants by applying operators (changing + to -, flipping conditions, removing statements), and runs your tests against each one. The HTML report shows surviving mutants line by line.
source code + tests
        |
        v
[mutator] -> generate mutant 1, mutant 2, ... mutant N
        |
        v
for each mutant:
    apply change -> run tests
    fails?  -> killed (good)
    passes? -> survived (bad)
        |
        v
mutation score = killed / total
A surviving mutant is an invitation: write or strengthen a test until the mutant dies.
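The loop in the diagram can be sketched in a few lines. This is a toy model, not how Stryker is implemented: the mutants are written by hand rather than generated, and the "tests" are plain functions. `clamp`, `Mutant`, and `runMutationTesting` are hypothetical names.

```typescript
type Clamp = (value: number, min: number, max: number) => number;

// Original implementation (hypothetical code under test).
const clamp: Clamp = (v, lo, hi) => (v < lo ? lo : v > hi ? hi : v);

interface Mutant {
  description: string;
  impl: Clamp;
}

// Hand-written mutants standing in for what a mutator would generate.
const mutants: Mutant[] = [
  { description: "upper check `>` flipped to `<`", impl: (v, lo, hi) => (v < lo ? lo : v < hi ? hi : v) },
  { description: "lower branch returns max", impl: (v, lo, hi) => (v < lo ? hi : v > hi ? hi : v) },
  { description: "upper branch returns min", impl: (v, lo, hi) => (v < lo ? lo : v > hi ? lo : v) },
];

// The "test suite": each check returns true when it passes.
// Note that no check uses a value below the minimum.
const tests: Array<(impl: Clamp) => boolean> = [
  (impl) => impl(5, 0, 10) === 5,
  (impl) => impl(42, 0, 10) === 10,
];

function runMutationTesting(candidates: Mutant[]) {
  const survivors: string[] = [];
  let killed = 0;
  for (const mutant of candidates) {
    // A mutant is killed as soon as one test fails against it.
    const allPass = tests.every((test) => test(mutant.impl));
    if (allPass) survivors.push(mutant.description);
    else killed += 1;
  }
  return { killed, survivors, score: (killed / candidates.length) * 100 };
}

const report = runMutationTesting(mutants);
console.log(report.survivors); // ["lower branch returns max"]
```

The survivor points directly at the missing test: adding a check such as `impl(-3, 0, 10) === 0` would kill it.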
Common Pitfalls
The first pitfall is running mutation testing on a huge codebase from a cold start. The run can take hours and the report is overwhelming. Start with a single critical module, fix what surfaces, then expand.
The second is chasing 100 percent. Some mutants are equivalent, meaning the changed code behaves identically to the original for every input. No test can kill them, so trying to is a waste of time. Aim for a high score, not a perfect one.
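As an illustration of an equivalent mutant, consider the hypothetical pair below. The loop condition has been changed from `<` to `!==`, but because `i` starts at 0 and increases by one up to a non-negative integer length, both conditions become false at exactly the same iteration. Whether a given tool would generate this particular mutation is not the point; the example only shows why such mutants are unkillable.

```typescript
// Original: sum the items with a conventional index loop.
function sum(items: number[]): number {
  let total = 0;
  for (let i = 0; i < items.length; i++) {
    total += items[i];
  }
  return total;
}

// Hypothetical mutant: `<` replaced with `!==`.
// The loop still visits exactly the same indices, so the result never differs.
function sumMutant(items: number[]): number {
  let total = 0;
  for (let i = 0; i !== items.length; i++) {
    total += items[i];
  }
  return total;
}

// Every input produces the same output from both versions.
const samples: number[][] = [[], [7], [1, 2, 3], [-4, 4, 10, 0.5]];
console.log(samples.every((s) => sum(s) === sumMutant(s))); // true
```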
The third is letting mutation runs block CI on every commit. They are too slow for that. Run them nightly, or only when a pull request carries a specific label, not on every push.
Practical Tips
Mutate the code that matters: pricing logic, authorization checks, data transformations. Mutating logging utilities or string formatting is rarely informative.
Use the surviving-mutant report as a code review prompt. If reviewers see a freshly added module with surviving mutants, they have something concrete to push back on.
Pair mutation testing with property-based tests for hot paths. The two techniques attack the suite from different angles and tend to find different gaps.
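A rough sketch of how a property check can kill a mutant that a single example might miss. It is hand-rolled so that it runs without dependencies; in a real project you would more likely use a property-based testing library. `applyDiscount`, `makeRng`, and `discountNeverRaisesPrice` are hypothetical names.

```typescript
// Hypothetical pricing function and a mutant with `-` replaced by `+`.
function applyDiscount(price: number, percent: number): number {
  return price - (price * percent) / 100;
}
function applyDiscountMutant(price: number, percent: number): number {
  return price + (price * percent) / 100;
}

// Small deterministic pseudo-random generator so that runs are repeatable.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 2 ** 32; // value in [0, 1)
  };
}

// Property: for a non-negative price and a percentage between 0 and 100,
// the discounted price is never higher than the original price.
function discountNeverRaisesPrice(
  fn: (price: number, percent: number) => number,
  runs = 200,
): boolean {
  const rng = makeRng(42);
  for (let i = 0; i < runs; i++) {
    const price = Math.floor(rng() * 1000);  // integer in 0..999
    const percent = Math.floor(rng() * 101); // integer in 0..100
    if (fn(price, percent) > price) {
      return false; // counterexample found
    }
  }
  return true;
}

console.log(discountNeverRaisesPrice(applyDiscount)); // true
```

Running the same property against `applyDiscountMutant` should fail as soon as a generated input has a positive price and a positive percentage, which is how the property kills that mutant without anyone choosing the input by hand.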
Wrap-up
Mutation testing is not a replacement for coverage. It is the second layer that asks whether your tests do their job. Run it on the parts of your code where bugs would hurt the most, take the surviving mutants seriously, and your suite gets sharper with every iteration.