Load Testing with k6
A practical introduction to k6 for load testing HTTP services. Covers scripting, stages, thresholds, and how to read the results without fooling yourself.
What you'll learn
- What load testing actually measures
- How k6 scripts are structured
- How to model realistic traffic with stages
- What thresholds give you
- How to avoid common measurement mistakes
Prerequisites
- Basic JavaScript
- Familiarity with HTTP APIs
Load testing answers the question of how your service behaves under realistic and unrealistic traffic. k6 is a modern, scriptable tool that makes it cheap to ask that question repeatedly. This post walks through the basics and the gotchas that catch teams the first time.
What and Why
Load testing is the practice of sending controlled traffic to a service and measuring how it responds. The goals are usually some mix of finding the capacity ceiling, confirming that a service stays within its SLO under expected load, and exposing memory leaks or saturation points before they show up in production.
k6 is a load testing tool written in Go with a JavaScript scripting layer. You write a small script that defines virtual users and their behaviour, then run it locally or in cloud workers. Results come back as time series of request rates, response times, and custom metrics you define.
Mental Model
Picture a control panel with two knobs. One knob is the number of virtual users, which is how many parallel clients are hitting your service. The other is the duration of each phase. You ramp up, hold, ramp down, and watch the dials. Some dials are about the load generator (requests sent, errors received). Some are about the system under test (latency percentiles, error rate). Thresholds are pass/fail rules you attach to those dials.
The trap is to confuse virtual users with real users. A virtual user is just a script loop. Whether that maps to a real user depends on how realistic your script is.
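One way to see the gap between virtual users and actual traffic is a back-of-envelope estimate. The sketch below assumes a closed loop in which each virtual user sends one request, waits for the response, then sleeps. The function name and the latency figures are illustrative and not part of k6.

```javascript
// Rough steady-state request rate for a closed-loop script:
// each VU completes one iteration every (response time + sleep) seconds.
function estimateRps(vus, responseSeconds, sleepSeconds) {
  const iterationSeconds = responseSeconds + sleepSeconds;
  if (iterationSeconds <= 0) {
    throw new RangeError('iteration time must be positive');
  }
  return vus / iterationSeconds;
}

// 20 VUs, 250 ms responses (assumed), 1 s of think time.
console.log(estimateRps(20, 0.25, 1)); // 16
// If responses slow to 1 s, the same 20 VUs send less traffic.
console.log(estimateRps(20, 1, 1)); // 10
```

The second call shows why a VU count is not a throughput target: in a loop like this, a slower service receives less load from the same number of virtual users.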
Hands-on Example
A starter script that hits an API and checks the response.
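import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  // Ramp to 20 VUs, hold, then ramp back down.
  stages: [
    { duration: '30s', target: 20 },
    { duration: '1m', target: 20 },
    { duration: '30s', target: 0 },
  ],
  // Pass/fail rules evaluated over the whole run.
  thresholds: {
    http_req_failed: ['rate<0.01'],   // under 1% failed requests
    http_req_duration: ['p(95)<300'], // p95 latency under 300 ms
  },
};

// Each virtual user runs this function in a loop.
export default function () {
  const res = http.get('https://api.example.com/products');
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1); // think time between iterations
}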
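This ramps to 20 virtual users over 30 seconds, holds for a minute, and ramps back down to zero over another 30 seconds. The run fails if 1 percent or more of requests fail or if the 95th percentile latency reaches 300 milliseconds. A failed check is only recorded as a metric; it does not fail the run unless a threshold is attached to it.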
stages: ramp up -> hold -> ramp down

[k6 VUs] --requests--> [your service] --responses--> [k6]
                                                      |
                                                      v
                                          metrics: rps, p95, errors
                                                      |
                                                      v
                                           thresholds: pass / fail

Run it with k6 run script.js, and stream results to a time-series database such as InfluxDB if you want to inspect them later.
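To make the shape of the ramp concrete, the following sketch computes the planned VU count at a given moment for a stages array like the one in the starter script, assuming a linear ramp within each stage. The helper names are hypothetical, the starting VU count is a parameter because it depends on how the test is configured, and real VU counts are whole numbers, so treat the output as an approximation of what k6 schedules.

```javascript
// Convert a duration such as '30s' or '1m' to seconds.
// Only single-unit s/m/h strings are handled in this sketch.
function toSeconds(duration) {
  const match = /^(\d+)(s|m|h)$/.exec(duration);
  if (!match) throw new Error(`unsupported duration: ${duration}`);
  const unit = { s: 1, m: 60, h: 3600 }[match[2]];
  return Number(match[1]) * unit;
}

// Planned VU count at time t (seconds, t >= 0), ramping linearly within
// each stage from the previous target. startVus is an assumed parameter.
function plannedVus(stages, t, startVus = 0) {
  let start = 0;       // time at which the current stage begins
  let from = startVus; // VU count at the start of the current stage
  for (const stage of stages) {
    const length = toSeconds(stage.duration);
    if (t <= start + length) {
      const progress = length === 0 ? 1 : (t - start) / length;
      return from + (stage.target - from) * progress;
    }
    start += length;
    from = stage.target;
  }
  return from; // past the end of the last stage
}

const stages = [
  { duration: '30s', target: 20 },
  { duration: '1m', target: 20 },
  { duration: '30s', target: 0 },
];

console.log(plannedVus(stages, 15));  // 10, halfway up the ramp
console.log(plannedVus(stages, 60));  // 20, holding
console.log(plannedVus(stages, 105)); // 10, halfway down
```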
Common Pitfalls
The first pitfall is testing from a single machine over the public internet. The bottleneck can become your laptop’s CPU or the bandwidth between you and the service, so the numbers describe your setup rather than the service itself. Run k6 close to the service or use a distributed runner.
The second is reusing a single token or single resource ID across all virtual users. Caches and database hot rows make the service look faster than it is. Generate varied inputs.
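A minimal sketch of varied inputs, assuming a pool of product IDs and a per-product route such as /products/{id}; both are hypothetical and not taken from the starter script. The selection is written as a plain function so it can be checked outside k6. Inside a k6 script the two numbers could come from the built-in __VU and __ITER globals.

```javascript
// Hypothetical pool of product IDs; in practice this might be loaded from a file.
const productIds = ['p-1001', 'p-1002', 'p-1003', 'p-1004', 'p-1005'];

// Pick an ID that changes across virtual users and across iterations,
// so that no single row or cache entry takes all the traffic.
function pickId(ids, vu, iter) {
  if (ids.length === 0) throw new Error('id pool is empty');
  return ids[(vu + iter) % ids.length];
}

// In a k6 default function this might be used as:
//   const id = pickId(productIds, __VU, __ITER);
//   http.get(`https://api.example.com/products/${id}`);

console.log(pickId(productIds, 1, 0)); // p-1002
console.log(pickId(productIds, 2, 0)); // p-1003
console.log(pickId(productIds, 1, 4)); // p-1001
```

A pool this small would still fit in any cache. The point is the pattern; the pool should be large enough to resemble the spread of real traffic.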
The third is reading averages instead of percentiles. A mean of 80 milliseconds with a p99 of 6 seconds is a system on fire. Always look at p95 and p99 alongside the mean.
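The effect is easy to reproduce with a few lines of arithmetic. The sample below is made up, and the percentile function uses the simple nearest-rank method, which may differ slightly from how k6 computes its own percentiles.

```javascript
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Nearest-rank percentile: the smallest sample with at least p percent
// of all samples at or below it.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Made-up sample: 98 fast requests and 2 very slow ones, in milliseconds.
const durations = [...Array(98).fill(20), 3000, 3000];

console.log(mean(durations));           // 79.6
console.log(percentile(durations, 50)); // 20
console.log(percentile(durations, 95)); // 20
console.log(percentile(durations, 99)); // 3000
```

The mean looks healthy and so does p95 in this sample; only p99 shows that some requests waited three seconds.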
Practical Tips
Treat the load test as code. Commit scripts, review them, and run them in CI against a staging environment on every release candidate.
Tag requests with names so the report breaks down by endpoint instead of treating every URL as one bucket. Useful when a single script exercises several routes.
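As a rough sketch of what naming involves: k6 accepts a params object with a tags field on its HTTP calls, and thresholds can be scoped to a tag value with the metric{tag:value} syntax. The endpoint names, the per-product URL, and the limits below are placeholders, and the named helper is hypothetical. The k6 calls are shown as comments so the data structures can be inspected in plain JavaScript.

```javascript
// Build the params object passed as the last argument to k6's HTTP calls.
// The `name` tag groups requests in the results.
function named(name) {
  return { tags: { name } };
}

// Thresholds scoped to one tag value each. Names and limits are placeholders.
const thresholds = {
  'http_req_duration{name:ListProducts}': ['p(95)<300'],
  'http_req_duration{name:GetProduct}': ['p(95)<200'],
};

// Inside a k6 script these would be used roughly like this:
//   export const options = { thresholds };
//   http.get('https://api.example.com/products', named('ListProducts'));
//   http.get(`https://api.example.com/products/${id}`, named('GetProduct'));

console.log(JSON.stringify(named('GetProduct'))); // {"tags":{"name":"GetProduct"}}
```

Naming matters most for URLs that contain IDs, since otherwise each distinct URL tends to show up as its own entry in the results.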
Warm up. A cold cache and a cold JIT make the first thirty seconds of every run misleading. Either discard that window or ramp up slowly.
Wrap-up
k6 lowers the cost of asking how your service behaves under load. The hard part is not the tool. It is being honest about whether your script and environment mirror reality well enough for the numbers to mean anything.