Docker Overlay Filesystem Explained: Layers, lowerdir, upperdir

Intermediate 10 min read

What you'll learn

✓ What overlay2 actually is under the hood
✓ How lowerdir, upperdir, and merged combine
✓ How copy-on-write affects performance
✓ Why small writes to big files are slow
✓ How to inspect layers on a real host

Prerequisites

• Basic Linux filesystem knowledge

Every time you docker run an image, Docker mounts a stack of read-only layers and adds a thin writable layer on top. The magic that makes this feel like a normal filesystem is the kernel feature called overlayfs, exposed to Docker through the overlay2 storage driver.

What and Why

A Docker image is a list of tarball layers. Each layer adds, modifies, or deletes files relative to the layer below. When you start a container, Docker needs to present those layers as one unified filesystem the process can read and write to. It cannot literally extract every layer for every container — that would be slow and wasteful.

Overlayfs solves this by stacking directories. Reads fall through to whichever layer first contains the file. Writes happen only in a top “scratch” directory, leaving the image layers untouched and shareable across containers.

Mental Model

Three roles matter:

lowerdir: one or more read-only directories. These are your image layers, stacked bottom to top.
upperdir: a single read-write directory. This is the container’s private scratch space.
merged: the unified view the container sees as /.

Plus a fourth, workdir, which the kernel uses for atomic operations and you mostly ignore.

Reading a file: the kernel checks upperdir first, then walks down the lowerdirs until it finds the file. Writing to a file that exists in a lowerdir triggers copy-up — the entire file is copied into upperdir, then modified there.
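
The lookup order can be mimicked with ordinary directories. The sketch below is only a model of the rule, not how the kernel implements it: resolve_layer is a hypothetical helper, the directory names are made up, and whiteouts and opaque directories are deliberately ignored.

```shell
#!/bin/sh
# Model of overlayfs read resolution using plain directories (no mount, no root).
# resolve_layer is a hypothetical helper: pass the relative path first, then the
# layers in lookup order (upperdir, topmost lowerdir, ..., bottom lowerdir).
# Simplification: whiteouts and opaque directories are not handled.
resolve_layer() {
    rel=$1
    shift
    for layer in "$@"; do
        # -L also catches dangling symlinks, which -e alone would miss
        if [ -e "$layer/$rel" ] || [ -L "$layer/$rel" ]; then
            printf '%s\n' "$layer"
            return 0
        fi
    done
    return 1    # not present in any layer
}

# Toy stack: app.conf exists in both lower layers, the upper layer is still empty.
demo=$(mktemp -d)
mkdir -p "$demo/lower1/etc" "$demo/lower2/etc" "$demo/upper/etc"
echo base         > "$demo/lower1/etc/app.conf"
echo patched      > "$demo/lower2/etc/app.conf"
echo only-in-base > "$demo/lower1/etc/base.conf"

# Prints the directory that would serve each read.
resolve_layer etc/app.conf  "$demo/upper" "$demo/lower2" "$demo/lower1"   # lower2 shadows lower1
resolve_layer etc/base.conf "$demo/upper" "$demo/lower2" "$demo/lower1"   # falls through to lower1
```

Once a copy of etc/app.conf exists in the upper directory, the same call reports the upper directory instead, which is the state a copy-up leaves behind.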

Hands-on Example

Run a container and inspect what Docker actually mounted.

docker run -d --name demo nginx:1.27
docker inspect demo --format '{{ json .GraphDriver.Data }}'

You will see a LowerDir, UpperDir, MergedDir, and WorkDir, all under /var/lib/docker/overlay2/.
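
LowerDir comes back as a single colon-separated string, which is hard to read once an image has more than a few layers. A small sketch for splitting it follows; list_lowerdirs is a hypothetical helper name, and the usage lines assume the container is still called demo.

```shell
#!/bin/sh
# list_lowerdirs is a hypothetical helper: it splits the colon-separated
# LowerDir string into one path per line, keeping the original order
# (the first entry is the topmost lower layer).
list_lowerdirs() {
    printf '%s\n' "$1" | tr ':' '\n'
}

# Usage on a Docker host (assumes a running container named demo):
#   list_lowerdirs "$(docker inspect demo --format '{{ .GraphDriver.Data.LowerDir }}')"
#   docker inspect demo --format '{{ .GraphDriver.Data.UpperDir }}'
```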

/var/lib/docker/overlay2/
<hashA>/diff   <- image layer 1 (lower, bottom)
<hashB>/diff   <- image layer 2 (lower)
<hashC>/diff   <- image layer 3 (lower, top)
<rwLayerID>/diff    <- upperdir (writable)
<rwLayerID>/work    <- workdir
<rwLayerID>/merged  <- what container sees as /

# Schematic only. rwLayerID is the ID of the container's writable layer,
# which is not the same string as the container ID.
mount -t overlay overlay \
-o lowerdir=hashC/diff:hashB/diff:hashA/diff,\
upperdir=rwLayerID/diff,\
workdir=rwLayerID/work \
rwLayerID/merged

Overlay layout for one container
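
One detail of the mount options is easy to get backwards: in lowerdir the leftmost entry is the top of the stack, while image layers are usually described bottom to top. The sketch below builds the option string from layers given in bottom-to-top order. overlay_opts is a hypothetical helper and the /tmp/ovl paths in the usage comment are placeholders.

```shell
#!/bin/sh
# overlay_opts UPPER WORK LOWER_BOTTOM [LOWER_NEXT ...]   (hypothetical helper)
# Takes lower layers bottom to top and prints an overlayfs option string,
# reversing them because overlayfs expects the topmost lowerdir first.
# Assumes no path contains a colon or comma.
overlay_opts() {
    upper=$1
    work=$2
    shift 2
    lower=""
    for dir in "$@"; do
        # Prepend each newer layer so the last argument ends up leftmost.
        lower="$dir${lower:+:$lower}"
    done
    printf 'lowerdir=%s,upperdir=%s,workdir=%s\n' "$lower" "$upper" "$work"
}

# Usage (needs root; upper and work must be on the same filesystem and
# work must be an empty directory):
#   mount -t overlay overlay \
#     -o "$(overlay_opts /tmp/ovl/upper /tmp/ovl/work /tmp/ovl/l1 /tmp/ovl/l2)" \
#     /tmp/ovl/merged
```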

Touch a new file inside the container — docker exec demo touch /tmp/new — and it appears only in the container’s upperdir. Edit an existing image file and you will see a full copy land in upperdir, leaving the original intact in the image layer.
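
The docker diff command reports the same information without touching /var/lib/docker: each output line is a change type (A for added, C for changed, D for deleted) followed by a path. As a rough sketch, the hypothetical summarize_diff helper below tallies those lines so you can see at a glance how far a container has drifted from its image.

```shell
#!/bin/sh
# summarize_diff is a hypothetical helper. It reads `docker diff` output on
# stdin and prints how many paths were added (A), changed (C), and deleted (D).
summarize_diff() {
    awk '{ n[$1]++ }
         END { printf "added=%d changed=%d deleted=%d\n", n["A"], n["C"], n["D"] }'
}

# Usage on a Docker host (assumes a container named demo):
#   docker diff demo | summarize_diff
```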

Common Pitfalls

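Modifying large files inside containers and being surprised by slowness. The first write to any file that lives in a lower layer copies the whole file up, regardless of how small the change is. A SQLite database baked into the image is a classic trap: the first one-row update copies the entire database. Logs are a related trap: they are created in upperdir, so there is no copy-up, but they keep growing the writable layer for as long as the container exists. Use volumes for anything that gets written to repeatedly.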
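
To find out which files have already landed in a container's writable layer and grown large, you can scan its upperdir. This is a sketch assuming GNU find (the k size suffix is not POSIX); large_files is a hypothetical helper and the threshold is yours to pick.

```shell
#!/bin/sh
# large_files DIR SIZE_KIB   (hypothetical helper)
# Lists regular files under DIR that are larger than SIZE_KIB kibibytes,
# without descending into other mounted filesystems.
large_files() {
    find "$1" -xdev -type f -size +"$2"k
}

# Usage on a Docker host, run as root because /var/lib/docker is not
# world-readable (assumes a container named demo, threshold of 100 MiB):
#   upper=$(docker inspect demo --format '{{ .GraphDriver.Data.UpperDir }}')
#   large_files "$upper" 102400
```

Anything this turns up is a candidate for a volume or for log rotation.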

Running out of inodes on /var/lib/docker. Overlay creates many small files and directories. On a hot CI host you can hit the inode limit before disk fills. Watch with df -i.
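
For monitoring, the percentage is the part of df -i worth extracting. A minimal sketch, assuming GNU df, where -P keeps each filesystem on one line and IUse% is the fifth column; inode_use_pct is a hypothetical helper name.

```shell
#!/bin/sh
# inode_use_pct is a hypothetical helper. It reads `df -iP` output on stdin and
# prints the IUse% value of the first filesystem row as a bare number.
# Some filesystems report "-" here instead of a percentage.
inode_use_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage:
#   df -iP /var/lib/docker | inode_use_pct
```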

Whiteout confusion. Deleting a file from a lower layer creates a whiteout: a character device with device number 0/0 in the upperdir that hides the lower file. Whiteouts are invisible in the merged view, but if you cp -a a container’s upperdir (its diff directory) somewhere else, they come along as raw device nodes and can confuse the destination. Use docker export or docker commit for clean snapshots.
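
Because whiteouts are ordinary character devices with major and minor number 0, they can be listed with standard tools. The sketch below assumes GNU find and GNU stat (whose %t and %T print the device numbers in hex); list_whiteouts is a hypothetical helper, and it does not detect opaque directories, which are marked with an extended attribute instead.

```shell
#!/bin/sh
# list_whiteouts DIR   (hypothetical helper)
# Prints every character device under DIR whose device number is 0,0,
# which is how overlayfs represents a deleted lower file.
list_whiteouts() {
    find "$1" -type c -exec stat -c '%t,%T %n' {} + |
        awk '$1 == "0,0" { sub(/^[^ ]+ /, ""); print }'
}

# Usage on a Docker host, run as root (assumes a container named demo):
#   list_whiteouts "$(docker inspect demo --format '{{ .GraphDriver.Data.UpperDir }}')"
```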

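Assuming inode numbers are stable. On older kernels, or when the layers do not all sit on the same underlying filesystem, copy-up can change the inode number a file reports the moment you write to it. Programs that rely on inode identity (some lock managers) can misbehave.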

Practical Tips

Order layers from least to most volatile in your Dockerfile. Put COPY package.json before COPY . . so dependency layers cache well and rebuilds reuse the existing lower layers instead of creating new ones.

Use bind mounts or named volumes for any data path. Volumes bypass overlay entirely, so writes go straight to the host filesystem with no copy-up tax.

Keep an eye on docker system df -v to see how much space layers and the writable layer consume. Combine with docker image prune and docker container prune to keep /var/lib/docker healthy.

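If you maintain a self-built kernel, make sure overlayfs (CONFIG_OVERLAY_FS) is enabled, and check the metacopy feature, whose default is set by CONFIG_OVERLAY_FS_METACOPY. With metacopy on, the kernel copies up only metadata for chmod and chown operations and defers the data copy until the file is first opened for writing, avoiding pointless full-file copies. Whether a given Docker version actually mounts with metacopy is worth verifying on your host.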
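
Both settings can be checked on a running system. The sketch below reads /proc/filesystems and the overlay module's metacopy parameter; the two function names are hypothetical, and the optional file arguments exist only so the helpers can be pointed at other files. Note that overlay appears in /proc/filesystems only once the module is loaded or built in.

```shell
#!/bin/sh
# overlay_registered [FILE]   (hypothetical helper)
# Succeeds if "overlay" is listed in a /proc/filesystems-style file.
overlay_registered() {
    grep -qw overlay "${1:-/proc/filesystems}"
}

# metacopy_default [FILE]   (hypothetical helper)
# Prints the module-wide metacopy default (Y or N). Prints nothing and fails
# if the parameter file does not exist, for example when overlay is not loaded.
metacopy_default() {
    cat "${1:-/sys/module/overlay/parameters/metacopy}" 2>/dev/null
}

# Usage:
#   overlay_registered && echo "overlayfs available"
#   metacopy_default
```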

Wrap-up

Overlay2 is Docker’s quiet workhorse. It makes layered images possible, keeps disk usage reasonable, and gives every container the illusion of a private root filesystem. Once you internalize the lower-upper-merged model and the copy-on-write penalty, surprising performance behavior stops being surprising and you start writing Dockerfiles that work with the storage driver instead of against it.