Kubernetes Stateful vs Stateless Workloads

Intermediate 10 min read

What you'll learn

✓ What stateless really means
✓ Why StatefulSets exist
✓ Identity and ordering guarantees
✓ Storage with PVCs
✓ Operating databases on Kubernetes

Prerequisites

• Familiar with terminals and YAML

What and Why

A workload is stateless if any replica can serve any request and replacing a replica loses nothing. It is stateful if a replica has an identity, on-disk data, or a position in a cluster that must survive restarts. Kubernetes provides a different controller for each: a Deployment for stateless workloads and a StatefulSet for stateful ones.

The decision matters because it changes how you scale, deploy, back up, and recover. Treat a database like a stateless service and you will eventually lose data; treat a web tier like a StatefulSet and you give up the easy rollouts that make Kubernetes pleasant.

Mental Model

Deployment (stateless)            StatefulSet (stateful)
web-7d5-abc                        db-0
web-7d5-def                        db-1
web-7d5-xyz                        db-2

random pod names                  stable name + DNS
shared (or no) storage            one PVC per pod
parallel updates                  ordered, one at a time
any replica = any other           replica identity matters

Two shapes of workload

Key StatefulSet promises: each pod gets a predictable name (db-0, db-1), a stable DNS record via a headless service, and its own PersistentVolumeClaim that follows the pod across reschedules.
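
For contrast, the stateless column of the comparison above can be sketched as a Deployment. This is a minimal illustration, not a recommended configuration: the name web, the nginx image, and the surge settings are assumptions chosen for the example.

```yaml
# Hypothetical stateless service: pods are interchangeable and own no storage.
apiVersion: apps/v1
kind: Deployment
metadata: { name: web }
spec:
  replicas: 3
  selector: { matchLabels: { app: web } }
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # may start one extra pod before removing an old one
      maxUnavailable: 0    # never drop below the desired replica count
  template:
    metadata: { labels: { app: web } }
    spec:
      containers:
        - name: web
          image: nginx:1.27              # placeholder image
          ports: [{ containerPort: 80 }]
```

There is no volumeClaimTemplates section and no headless service here: a replaced pod gets a new random name and starts from a clean filesystem, which is exactly what makes surge rollouts safe for this kind of workload.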

Hands-on Example

A stateless API as a Deployment is the easy case. Here is the more interesting one: a 3-node Postgres-style cluster.

apiVersion: v1
kind: Service
metadata: { name: db }
spec:
  clusterIP: None         # headless: gives each pod a DNS record
  selector: { app: db }
  ports: [{ port: 5432 }]
---
apiVersion: apps/v1
kind: StatefulSet
metadata: { name: db }
spec:
  serviceName: db
  replicas: 3
  selector: { matchLabels: { app: db } }
  template:
    metadata: { labels: { app: db } }
    spec:
      containers:
        - name: pg
          image: postgres:16
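          env:
            # Assumes a Secret named pg with key pw already exists in this namespace
            - { name: POSTGRES_PASSWORD, valueFrom: { secretKeyRef: { name: pg, key: pw } } }
            # Use a subdirectory: a freshly formatted volume often contains lost+found,
            # which makes initdb refuse to initialize the mount root
            - { name: PGDATA, value: /var/lib/postgresql/data/pgdata }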
          volumeMounts:
            - { name: data, mountPath: /var/lib/postgresql/data }
  volumeClaimTemplates:
    - metadata: { name: data }
      spec:
        accessModes: [ReadWriteOnce]
        storageClassName: gp3   # assumes a StorageClass named gp3 exists in your cluster
        resources: { requests: { storage: 100Gi } }

After you apply the manifests, each pod has a stable identity:

kubectl get pods -l app=db
# db-0  db-1  db-2

# run from a pod inside the cluster; assumes the default namespace
nslookup db-0.db.default.svc.cluster.local
# resolves to db-0's pod IP

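The StatefulSet only provides the stable names and volumes; it does not set up replication. The database (or an operator managing it) must be configured to treat db-0.db as the primary and the other names as replicas. When a pod restarts or is rescheduled, it gets the same PVC reattached, so it comes back with its data.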

Common Pitfalls

Using a StatefulSet for a stateless service. You give up parallel rollouts and surge upgrades for no benefit.
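Assuming PVC deletion follows pod deletion. By default, deleting or scaling down a StatefulSet leaves its PVCs behind on purpose, so data is not lost by accident. Clean them up explicitly when retiring the workload, or set persistentVolumeClaimRetentionPolicy on clusters that support it.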
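Same storage class everywhere. A default general-purpose class is fine for logs but can be too slow for a database. For write-heavy databases, use a class backed by volumes with provisioned IOPS, for example io1/io2 or gp3 with provisioned IOPS on AWS EBS.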
Skipping pod disruption budgets. A drain can take down a quorum if you do not set a PDB allowing only one pod at a time.
Treating a single-replica Deployment as stateful. It is not: the replacement pod gets a new name and can land on any node. Use a StatefulSet if identity matters.
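
The disruption-budget pitfall above can be addressed with a short manifest. The following is a minimal sketch, assuming the pods carry the app: db label used in the hands-on example; the budget name is arbitrary.

```yaml
# Allows at most one db pod to be evicted at a time during voluntary
# disruptions such as kubectl drain.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata: { name: db }
spec:
  maxUnavailable: 1
  selector: { matchLabels: { app: db } }   # assumed to match the StatefulSet's pod labels
```

A PDB only limits voluntary evictions. It does not protect against a node crashing, so it complements spreading replicas across nodes rather than replacing it.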

Production Tips

For databases, prefer operators: CloudNativePG, Zalando Postgres operator, MongoDB Community Operator. They encode failover, backups, and version upgrades you would otherwise script.
Use PodDisruptionBudgets with maxUnavailable: 1 for quorum systems (Kafka, etcd, Postgres replicas).
Keep backups outside the cluster: Velero for cluster state, and the database’s native dump or WAL shipping to object storage such as S3 for the data.
Run stateful workloads on dedicated node pools with taints. Putting a noisy app on a database node hurts p99 latency.
Use podAntiAffinity (or topology spread constraints) so replicas land on different nodes and availability zones; otherwise a single node failure can take out several replicas at once.
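
The two scheduling tips above might look like the following fragment of a StatefulSet's spec.template.spec. It is a sketch under assumptions: the taint dedicated=database:NoSchedule and the matching node label are hypothetical and would need to exist on your database node pool, and app: db is the label from the hands-on example.

```yaml
# Fragment: goes under spec.template.spec of the StatefulSet.
affinity:
  podAntiAffinity:
    # Hard rule: never place two db pods on the same node.
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector: { matchLabels: { app: db } }
        topologyKey: kubernetes.io/hostname
    # Soft rule: prefer spreading db pods across availability zones.
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector: { matchLabels: { app: db } }
          topologyKey: topology.kubernetes.io/zone
# Hypothetical taint and label for a dedicated database node pool.
tolerations:
  - { key: dedicated, operator: Equal, value: database, effect: NoSchedule }
nodeSelector: { dedicated: database }
```

The toleration only permits the pods onto tainted nodes; the nodeSelector is what keeps them there. The zone rule is a preference rather than a requirement so that pods can still schedule when a cluster has fewer zones than replicas.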

Summary and Wrap-up

Stateless workloads are Kubernetes at its happiest: scale up, scale down, roll out, nobody notices. Stateful workloads need more thought - stable identity via StatefulSets, per-pod storage, ordered rollouts, and operators for the hard parts. Pick the right primitive deliberately, and the cluster will treat each workload with the care it deserves.