Node Streams and Backpressure Explained
How Node.js streams really work, why backpressure matters, and how to compose readable, writable, and transform streams without blowing up memory.
What you'll learn
- What a stream really is in Node
- How push and pull data flow differ
- Why backpressure prevents memory blowups
- How to use pipeline and async iteration
- Common stream pitfalls and fixes
Prerequisites
- Comfortable with JavaScript and async I/O
If you have ever loaded a 5 GB file with fs.readFile and watched your process die, you already understand why streams exist. Streams let Node move data in chunks so memory stays flat regardless of input size. The catch is that streams only work well when producers and consumers stay in sync, and that synchronization is called backpressure.
What a stream actually is
A stream is an EventEmitter with a buffer and a contract. Readable streams produce data, writable streams consume it, duplex streams do both, and transform streams are duplex streams that modify chunks as they pass through. The buffer has a highWaterMark, which is a soft threshold rather than a hard cap: for byte streams the default is 16 KiB in Node releases before v22 and 64 KiB from v22 on, and for object mode it is 16 objects.
When the buffer fills, the stream tells you to slow down. When it drains, it tells you to resume. That signal is backpressure. Without it, a fast producer can flood a slow consumer and Node will happily queue gigabytes in memory until the process crashes.
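To make the threshold concrete, here is a minimal sketch of a writable with a deliberately tiny highWaterMark. The makeSlowSink and demo helpers, the 4-byte limit, and the 10 ms delay are all assumptions chosen for illustration; real streams use much larger buffers.

```javascript
const { Writable } = require('node:stream');

// Hypothetical slow consumer: signals backpressure once 4 bytes are queued.
function makeSlowSink() {
  return new Writable({
    highWaterMark: 4,
    write(chunk, _enc, cb) {
      setTimeout(cb, 10); // pretend each write takes 10 ms
    },
  });
}

function demo() {
  const sink = makeSlowSink();
  const first = sink.write('ab');    // 2 bytes queued, still under the limit
  const second = sink.write('cdef'); // 6 bytes queued, limit crossed
  return { first, second, sink };
}

const result = demo();
console.log(result.first, result.second); // true false
result.sink.end();
```

Note that the second write is still accepted and buffered. The false return value is only a request to stop; nothing enforces it, which is why ignoring it leads to unbounded memory growth.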
Mental model
Producer --chunk--> [buffer | highWaterMark] --chunk--> Consumer
    ^                          |
    |                          |
write() returns false     drain event
    |                          |
    +---- pause production ----+

The flow is: writable returns false when the buffer crosses the threshold, the producer pauses, and the writable emits drain when it is ready for more.
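The loop in that mental model can be written by hand. The following is a rough sketch, assuming the producer is a plain iterable of chunks; writeAll is a hypothetical helper name, not a Node API, and the slow sink exists only to trigger backpressure.

```javascript
const { once } = require('node:events');
const { Writable } = require('node:stream');
const { finished } = require('node:stream/promises');

// Write every chunk, pausing whenever write() reports a full buffer.
// Returns how many times the producer had to wait for 'drain'.
async function writeAll(chunks, dst) {
  let pauses = 0;
  for (const chunk of chunks) {
    if (!dst.write(chunk)) {
      pauses += 1;
      await once(dst, 'drain'); // resume only after the buffer has emptied
    }
  }
  dst.end();
  await finished(dst); // resolves once everything has been flushed
  return pauses;
}

// Usage: a deliberately slow sink with an 8-byte buffer (illustrative values).
const slowSink = new Writable({
  highWaterMark: 8,
  write(chunk, _enc, cb) {
    setImmediate(cb);
  },
});

writeAll(['aaaa', 'bbbb', 'cccc', 'dddd'], slowSink)
  .then((pauses) => console.log(`paused ${pauses} time(s)`))
  .catch(console.error);
```

Because events.once also rejects if the stream emits error, a failing destination surfaces as a rejected promise instead of a loop that waits forever.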
Hands-on: copy a file with proper backpressure
The naive version looks fine but ignores write()’s return value.
// BAD: no backpressure
const fs = require('node:fs');
const src = fs.createReadStream('big.log');
const dst = fs.createWriteStream('copy.log');
src.on('data', (chunk) => {
  dst.write(chunk); // ignores return value
});
src.on('end', () => dst.end());
If reads outpace writes (very common when writing across disks or to the network), the writable buffer grows without bound. The fix is pipeline, which handles backpressure, errors, and cleanup for you.
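const { pipeline } = require('node:stream/promises');
const fs = require('node:fs');

// CommonJS has no top-level await, so wrap the call in an async function.
async function copy() {
  await pipeline(
    fs.createReadStream('big.log'),
    fs.createWriteStream('copy.log'),
  );
}

copy().catch(console.error);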
For transforms, drop one in the middle. Here is a line-counting transform.
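const fs = require('node:fs');
const { Transform } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Pass-through transform that counts newline characters as chunks go by.
const countLines = () => {
  let count = 0;
  return new Transform({
    transform(chunk, _enc, cb) {
      count += chunk.toString().split('\n').length - 1;
      cb(null, chunk); // forward the chunk unchanged
    },
    flush(cb) {
      // Runs once after the last chunk; report the total on stderr.
      console.error(`lines: ${count}`);
      cb();
    },
  });
};

// CommonJS has no top-level await, so wrap the call in an async function.
async function main() {
  await pipeline(
    fs.createReadStream('big.log'),
    countLines(),
    fs.createWriteStream('copy.log'),
  );
}

main().catch(console.error);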
Async iteration: the modern way
Readable streams are async iterables. This often replaces hand-written event handlers and gets backpressure for free, because the loop body awaits each chunk.
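import { createReadStream } from 'node:fs';

// handleChunk stands in for your own async work (naming it `process` would clash with Node's global).
const handleChunk = async (chunk) => console.log(chunk.length);

for await (const chunk of createReadStream('big.log', { encoding: 'utf8' })) {
  await handleChunk(chunk); // the next chunk is not read until this settles
}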
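Inside that for await, the stream is not read again until your handler's promise settles, so there is no manual pause/resume. Keep in mind that leaving the loop early, whether by break, return, or a thrown error, destroys the stream, so wrap the loop in try/catch if a failed chunk needs handling.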
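The early-exit behavior is easy to observe. This is a small sketch, assuming an in-memory source built with Readable.from; takeChunks is a hypothetical helper written for this example.

```javascript
const { Readable } = require('node:stream');

// Collect chunks until `limit` have been seen, then leave the loop early.
async function takeChunks(stream, limit) {
  const seen = [];
  for await (const chunk of stream) {
    seen.push(chunk);
    if (seen.length === limit) break; // leaving the loop destroys the stream
  }
  return seen;
}

async function main() {
  const source = Readable.from(['a', 'b', 'c', 'd']);
  const seen = await takeChunks(source, 2);
  // The remaining chunks are discarded along with the stream.
  console.log(seen, source.destroyed);
}

main().catch(console.error);
```

If the stream needs to stay usable after the loop, one option is stream.iterator({ destroyOnReturn: false }), available in recent Node versions, at the cost of having to clean up the stream yourself.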
Common pitfalls
- Ignoring write()’s return value. If it returns false, stop writing until drain fires.
- Using pipe without error handlers. pipe does not forward errors to the destination or clean up the other stream; use pipeline instead.
- Mixing data listeners and pipe. Attaching a data listener switches the stream into flowing mode, so chunks that arrive before the pipe is attached never reach the destination.
- Forgetting object mode. If you push objects, set objectMode: true on both ends; otherwise Node throws ERR_INVALID_ARG_TYPE, because byte streams accept only strings and binary chunks such as Buffer or Uint8Array.
- Not calling cb() in transforms. The stream will silently stall.
- Setting highWaterMark to giant values to “fix” slowness. It only delays the symptom and hides real bottlenecks.
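As an illustration of the object mode pitfall, the sketch below wires three object-mode stages together. doubleAll and the { id, value } record shape are assumptions made up for this example.

```javascript
const { Readable, Transform, Writable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Double the `value` field of each record and collect the results.
async function doubleAll(records) {
  const out = [];
  await pipeline(
    Readable.from(records), // Readable.from defaults to object mode
    new Transform({
      objectMode: true, // required: chunks are plain objects, not bytes
      transform(record, _enc, cb) {
        cb(null, { ...record, value: record.value * 2 });
      },
    }),
    new Writable({
      objectMode: true,
      write(record, _enc, cb) {
        out.push(record);
        cb(); // forgetting this call would stall the pipeline
      },
    }),
  );
  return out;
}

doubleAll([{ id: 1, value: 2 }, { id: 2, value: 5 }])
  .then((rows) => console.log(rows))
  .catch(console.error);
```

Dropping objectMode: true from either the transform or the writable makes the pipeline reject as soon as the first record is written to that stage.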
Practical tips
- Default to pipeline (the promise version). It handles cleanup on error, including destroying earlier streams.
- For HTTP, you already have streams. req is readable, res is writable. Pipe through gzip with zlib.createGzip().
- Measure with process.memoryUsage(). If RSS climbs as input size grows, you are buffering somewhere you should not be.
- Prefer async iteration for one-off scripts and pipeline for production wiring.
- When wrapping a third-party producer, use Readable.from(asyncIterable) rather than rolling your own _read.
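Two of these tips combine naturally: wrap a producer with Readable.from and push it through gzip with pipeline. The sketch below assumes the third-party producer can be modelled as an async generator; fetchLines and gzipLines are hypothetical names, and the in-memory sink stands in for a file or HTTP response.

```javascript
const { Readable, Writable } = require('node:stream');
const { pipeline } = require('node:stream/promises');
const zlib = require('node:zlib');

// Hypothetical producer: yields `n` text lines, one at a time.
async function* fetchLines(n) {
  for (let i = 0; i < n; i += 1) {
    yield `line ${i}\n`;
  }
}

// Stream the lines through gzip and return the compressed bytes.
async function gzipLines(n) {
  const parts = [];
  await pipeline(
    Readable.from(fetchLines(n)), // no hand-written _read needed
    zlib.createGzip(),
    new Writable({
      write(chunk, _enc, cb) {
        parts.push(chunk);
        cb();
      },
    }),
  );
  return Buffer.concat(parts);
}

gzipLines(3)
  .then((gz) => console.log(zlib.gunzipSync(gz).toString()))
  .catch(console.error);
```

Collecting the output in an array is only for demonstration. In an HTTP handler the last stage would be res, so the compressed bytes never accumulate in memory.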
Wrap-up
Streams are not just an optimization. They are how Node stays small while moving large data. Once you internalize the backpressure contract, pipeline becomes the obvious default, transforms become Lego bricks, and memory profiles stay flat no matter how big the input grows. Reach for buffers only when the data legitimately fits and the simplicity is worth it.