Kubernetes Pod Disruption Budgets Explained

Intermediate 9 min read

What you'll learn

✓ What voluntary vs involuntary disruption means
✓ How PDBs interact with kubectl drain and cluster autoscaler
✓ How to size minAvailable vs maxUnavailable
✓ When a PDB can deadlock a node drain

Prerequisites

• Familiarity with Deployments and replicas

What and Why

Nodes get drained for upgrades, autoscaling, and spot reclamation. Without a guardrail, the eviction API would happily kill every replica of your service at once if they all landed on the same node group. A PodDisruptionBudget (PDB) tells the eviction API how many of the pods matching its selector may be unavailable at the same time during a voluntary disruption.

PDBs only protect against voluntary disruptions: drains, autoscaler evictions, and any client of the eviction API. They do nothing for involuntary events like a kernel panic, an instance being yanked away, or a pod OOM-kill. For those you need replica counts and topology spread.

Mental Model

A PDB is a contract: at least minAvailable pods must stay Ready, or no more than maxUnavailable may be unavailable. The eviction API checks the budget on every eviction request. If granting the eviction would breach the budget, the request is rejected with HTTP 429 (Too Many Requests) and the client is expected to retry; kubectl drain keeps retrying until the eviction succeeds or its --timeout expires.

Pick exactly one of minAvailable or maxUnavailable. They can be absolute integers or percentages of the selected pods.
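The budget arithmetic can be sketched in a few lines of Python. This is an illustrative model, not the controller's code: the function names are hypothetical, and the expected pod count is passed in directly, whereas the real controller derives it from the pods and their owning workloads. It assumes the documented behavior that percentages round up for both fields.

```python
def resolve(value, total):
    """Turn an int or a percentage string such as "50%" into a pod count.

    Percentages are rounded up (assumption based on the documented
    Kubernetes behavior for minAvailable and maxUnavailable).
    """
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        return (percent * total + 99) // 100  # integer ceiling
    return int(value)


def disruptions_allowed(expected, healthy, min_available=None, max_unavailable=None):
    """Return how many more pods could be evicted right now."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        desired_healthy = resolve(min_available, expected)
    else:
        desired_healthy = expected - resolve(max_unavailable, expected)
    return max(0, healthy - desired_healthy)


print(disruptions_allowed(3, 3, min_available=2))        # 1
print(disruptions_allowed(7, 7, min_available="50%"))    # 3 (4 must stay up)
print(disruptions_allowed(7, 7, max_unavailable="50%"))  # 4 (rounded up from 3.5)
```

Note the last line: because maxUnavailable rounds up, a percentage can permit slightly more disruption than the number suggests.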

Hands-on Example

A web service with three replicas that must keep at least two serving traffic:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: web
          image: example/web:1.0
          ports: [{ containerPort: 8080 }]
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels: { app: web }

Now drain a node that hosts two of the three pods:

kubectl drain node-3 --ignore-daemonsets --delete-emptydir-data

Initial: web-a (node-1), web-b (node-3), web-c (node-3)
PDB: minAvailable=2  -> 1 pod may be unavailable

drain node-3:
evict web-b -> allowed (2 of 3 still up)
evict web-c -> DENIED (would leave only 1 up)
...
replacement for web-b schedules on node-2 and becomes Ready
evict web-c -> allowed
drain completes

Eviction blocked until rescheduling restores budget
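The trace above can be replayed with a toy model. The sketch below is a simplification under stated assumptions: simulate_drain is a hypothetical helper, evictions are tried one at a time, replacements always land on a spare node, and every pending replacement becomes Ready between a denied attempt and the next retry.

```python
def simulate_drain(pods, node, min_available, spare_node):
    """Replay a drain of `node` against a minAvailable budget.

    pods maps pod name -> node name; all pods start out Ready.
    Returns the list of eviction attempts as log lines.
    """
    ready = dict(pods)   # pods currently Ready
    starting = {}        # replacements created but not yet Ready
    queue = [name for name, host in pods.items() if host == node]
    log = []
    while queue:
        name = queue[0]
        if len(ready) - 1 >= min_available:
            # Budget holds after this eviction, so it is granted.
            ready.pop(name)
            queue.pop(0)
            starting[name + "-replacement"] = spare_node
            log.append(f"evict {name}: allowed")
        elif starting:
            log.append(f"evict {name}: denied (429)")
            # Time passes before the retry: replacements become Ready.
            ready.update(starting)
            starting.clear()
        else:
            # Nothing is on its way to restore the budget: a deadlock.
            log.append(f"evict {name}: denied (429), budget cannot recover")
            break
    return log


pods = {"web-a": "node-1", "web-b": "node-3", "web-c": "node-3"}
for line in simulate_drain(pods, "node-3", min_available=2, spare_node="node-2"):
    print(line)
```

Running it with a single pod and min_available=1 ends at the deadlock branch straight away, which is the failure mode to look out for before draining real nodes.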

Common Pitfalls

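minAvailable: 100% is a guaranteed deadlock at any replica count, and on a single-replica Deployment minAvailable: 1 has the same effect. The eviction API can never satisfy “all pods stay Ready” while evicting one of them, so the drain retries until it times out, or forever if no timeout is set. Run at least two replicas and leave at least one pod of headroom between the replica count and minAvailable.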

PDBs use the same selector vocabulary as Deployments, but they target pods directly. If two Deployments share a label and one PDB selects both, the budget applies to the union. Use selectors precise enough to avoid this.

A pod that is failing its readiness probe already counts as unavailable. If your fleet is partially unhealthy when a drain starts, the drain may stall immediately because the budget is already at its limit.

status.disruptionsAllowed on the PDB is the field to watch. When it is 0, no further evictions are possible until pods become Ready again.
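One way to watch this field before a drain is to scan the JSON form of your PDBs for budgets that are already exhausted. The sketch below assumes input shaped like the output of kubectl get pdb --all-namespaces -o json; blocked_pdbs and the sample data are hypothetical and only illustrate the shape.

```python
def blocked_pdbs(pdb_list):
    """Return "namespace/name" for every PDB that allows no evictions."""
    blocked = []
    for item in pdb_list.get("items", []):
        status = item.get("status", {})
        # A missing value is treated as 0: the PDB has no usable budget yet.
        if status.get("disruptionsAllowed", 0) == 0:
            meta = item["metadata"]
            blocked.append(f'{meta.get("namespace", "default")}/{meta["name"]}')
    return blocked


# Hypothetical sample: web-pdb has one unready pod and no budget left.
sample = {
    "items": [
        {
            "metadata": {"namespace": "default", "name": "web-pdb"},
            "status": {"currentHealthy": 2, "desiredHealthy": 2, "disruptionsAllowed": 0},
        },
        {
            "metadata": {"namespace": "prod", "name": "api-pdb"},
            "status": {"currentHealthy": 5, "desiredHealthy": 4, "disruptionsAllowed": 1},
        },
    ]
}

print(blocked_pdbs(sample))  # ['default/web-pdb']
```

Against a live cluster, the same function could be fed json.load(sys.stdin) with the kubectl output piped in. Any name it reports will stall a drain that touches its pods.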

Production Tips

For stateless web tiers, maxUnavailable: 1 scales naturally as you add replicas and avoids the trap of percentage math on small fleets. For quorum systems like etcd or ZooKeeper, set minAvailable to your quorum count (for a 3-member cluster, minAvailable: 2).
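To see why percentages bite on small fleets, compare the evictions each setting permits as the replica count grows. This is a rough sketch assuming all pods are healthy and that percentages round up; the function names and the 80% figure are illustrative choices, not recommendations.

```python
def allowed_with_max_unavailable(replicas, max_unavailable=1):
    """Evictions permitted by an integer maxUnavailable when all pods are healthy."""
    return min(replicas, max_unavailable)


def allowed_with_min_available_percent(replicas, percent):
    """Evictions permitted by a percentage minAvailable (rounded up)."""
    desired_healthy = (percent * replicas + 99) // 100  # integer ceiling
    return replicas - desired_healthy


def quorum(members):
    """Majority quorum for a cluster with the given member count."""
    return members // 2 + 1


for replicas in (2, 3, 5, 10, 20):
    print(
        replicas,
        allowed_with_max_unavailable(replicas),
        allowed_with_min_available_percent(replicas, 80),
    )

print(quorum(3), quorum(5))  # 2 3
```

With minAvailable: 80%, fleets of 2 and 3 replicas allow zero evictions, so a drain would stall, while maxUnavailable: 1 allows exactly one at every size.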

Pair PDBs with topologySpreadConstraints so replicas land on different nodes and zones to begin with. If all three replicas already sit on the node being drained, the PDB still holds the budget, but each eviction blocks until a replacement schedules and becomes Ready, which can be slow if no other nodes have capacity. If that node fails outright, the PDB offers no protection at all.

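For node-group upgrades, set a sensible terminationGracePeriodSeconds on pods so evicted replicas can finish in-flight work. Note that the controller.kubernetes.io/pod-deletion-cost annotation only influences which pods a ReplicaSet removes first when scaling down; it does not change the order in which a drain evicts pods.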

Test PDBs in staging by running kubectl drain on a real node. A PDB that has never been exercised is a PDB you should not trust.

Wrap-up

PDBs are the difference between a routine node upgrade and a five-minute outage. Define one for every Deployment that handles user traffic, keep at least two replicas, spread them across failure domains, and watch disruptionsAllowed in your dashboards. The eviction API quietly does the rest.