Canary Deployments with Flagger Tutorial

Learn how to ship canary releases on Kubernetes using Flagger. Covers the control loop, metric analysis, traffic shifting, and how to roll back automatically when a release misbehaves.

By Codeloom · Intermediate

What you'll learn

  • What a canary release actually is
  • How Flagger orchestrates traffic shifting
  • How to write a Canary resource end-to-end
  • How metric analysis triggers rollback
  • Pitfalls around metrics, thresholds, and traffic

Prerequisites

  • Kubernetes basics and a service mesh or ingress controller installed

What and Why

A canary deployment routes a small slice of real traffic to a new version, watches its behavior, and only continues if it stays healthy. If error rates climb or latency spikes, the rollout pauses or reverses. The point is to find problems with one percent of users instead of one hundred percent.

Flagger is a Kubernetes operator that automates this loop. You write one Canary resource describing the target Deployment, the traffic provider, and the success metrics. Flagger handles the rest: it creates a primary copy of the Deployment to serve stable traffic, shifts traffic to the new version in steps, queries Prometheus, and either promotes or rolls back.

Mental Model

Flagger turns a normal Deployment into a pair. It clones your Deployment into a stable primary (named after the target with a -primary suffix) and treats the original as the canary, which stays scaled to zero between releases. Your Service points at the primary. When you push a new image, Flagger detects the spec change, scales up the canary with the new version, and tells the mesh to send a small percentage of traffic there. Every interval, it runs a query against your metrics provider. Each passing check shifts more traffic to the canary. A failed check holds the weight where it is, and once the number of failed checks reaches the configured threshold, traffic snaps back to zero and the canary is scaled down.

The key insight is that your application code does not change. The traffic split happens at the mesh or ingress layer.

Hands-on Example

Assume Linkerd is installed, Prometheus is scraping it, and Flagger is configured to use Linkerd as its traffic provider. Define a Canary:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: checkout
  namespace: shop
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout
  service:
    port: 80
    targetPort: 8080
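  analysis:
    interval: 30s        # how often Flagger runs the checks
    threshold: 5         # failed checks tolerated before rollback
    maxWeight: 50        # highest canary traffic percentage before promotion
    stepWeight: 10       # traffic added after each passing check
    metrics:
      - name: request-success-rate
        thresholdRange: { min: 99 }   # percent of successful requests
        interval: 1m
      - name: request-duration
        thresholdRange: { max: 500 }  # P99 latency in milliseconds
        interval: 1m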

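When you run kubectl -n shop set image deployment/checkout checkout=checkout:v2, Flagger picks up the change. Every 30 seconds it queries Prometheus, and after every passing check it raises canary traffic by 10 percentage points until it reaches 50, then promotes. With no failed checks, the analysis takes at least 30s × (50 / 10) = 2.5 minutes. Five failed checks trigger a rollback.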

Push new image
 |
 v
Flagger detects spec change
 |
 v
Scale up canary pods (v2)
 |
 v
For step in [10, 20, 30, 40, 50]:
 shift traffic   ->   mesh splits primary/canary
 wait interval
 query Prometheus
 if check fails -> hold weight, count the failure
 if failed checks reach threshold (5) -> roll back to 0%
 else continue
 |
 v
Promote: copy v2 spec to primary, drain canary
Flagger canary control loop

Common Pitfalls

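Bad metrics are the most common failure. If your success-rate query covers all pods rather than just the canary pods, a broken canary looks healthy because the primary's traffic drowns out the signal. At a 10 percent canary weight, a canary failing 5 percent of its requests still yields a combined success rate of 99.5 percent, which passes a 99 percent threshold. Use the metric templates Flagger ships, or write PromQL that filters by the canary’s pod label.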
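
As a rough illustration of a canary-only query, the sketch below defines a MetricTemplate that computes success rate from Linkerd's response_total metric. The template name, the Prometheus address, and the choice of labels are assumptions to adapt to your cluster. Flagger substitutes {{ target }} with the name in targetRef, which is the canary Deployment, so traffic served by the -primary Deployment is excluded.

```yaml
# Sketch only: the template name and Prometheus address are assumptions.
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: canary-success-rate
  namespace: shop
spec:
  provider:
    type: prometheus
    # Assumed address; point this at the Prometheus that scrapes your mesh.
    address: http://prometheus.linkerd-viz:9090
  # Percentage of non-failed inbound responses on the canary Deployment only.
  query: |
    sum(
      rate(
        response_total{
          namespace="{{ namespace }}",
          deployment="{{ target }}",
          classification!="failure",
          direction="inbound"
        }[{{ interval }}]
      )
    )
    /
    sum(
      rate(
        response_total{
          namespace="{{ namespace }}",
          deployment="{{ target }}",
          direction="inbound"
        }[{{ interval }}]
      )
    )
    * 100
---
# Fragment of spec.analysis in the Canary, referencing the template above.
metrics:
  - name: canary-success-rate
    templateRef:
      name: canary-success-rate
      namespace: shop
    thresholdRange:
      min: 99
    interval: 1m
```

Before relying on a template like this, run the rendered query in Prometheus by hand and confirm that it returns data only while the canary is receiving traffic.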

Too-aggressive steps are the second pitfall. A stepWeight of 50 sends half your users to the new version in the first step, which defeats the point of a canary. Some failures only show up as traffic grows, so small steps spread over several intervals let the analysis catch them before most users are exposed.

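The third is no real traffic. A canary analyzed at three in the morning with almost no requests gives the analysis too little data to judge: a handful of lucky requests can pass every check and promote a broken version, while a query that returns no data at all can stall the rollout. Either run synthetic load during analysis or only deploy in business hours.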

Practical Tips

Start with conservative settings: a small stepWeight, a low threshold (the number of failed checks Flagger tolerates before rolling back), and a low maxWeight such as 30. Relax them as you build confidence.
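
A conservative starting point might look like the fragment below. The numbers are illustrative assumptions, not recommendations. With these values a clean rollout takes at least 1m × (30 / 5) = 6 minutes, and two failed checks are enough to roll back.

```yaml
# Fragment of the Canary spec; values are illustrative only.
analysis:
  interval: 1m      # run the checks once per minute
  threshold: 2      # roll back after 2 failed checks
  stepWeight: 5     # add 5 percentage points of traffic per passing check
  maxWeight: 30     # stop shifting at 30 percent, then promote
```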

Add webhooks for pre-rollout and load-testing steps. Flagger can call out to a load generator or a smoke-test pod and only continue if it returns success.
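
The fragment below sketches both kinds of webhook. It assumes Flagger's load tester is deployed as a service named flagger-loadtester in the shop namespace, and that the application answers on a /healthz path. Both are assumptions for illustration, as are the hey parameters. The checkout-canary host is the canary service Flagger generates for the target.

```yaml
# Fragment of the Canary spec; URLs, paths, and load parameters are assumptions.
analysis:
  webhooks:
    # Runs once before any traffic is shifted; a failure fails the check.
    - name: smoke-test
      type: pre-rollout
      url: http://flagger-loadtester.shop/
      timeout: 30s
      metadata:
        type: bash
        cmd: "curl -sf http://checkout-canary.shop/healthz"
    # Runs during each analysis interval to generate traffic for the metrics.
    - name: load-test
      type: rollout
      url: http://flagger-loadtester.shop/
      timeout: 5s
      metadata:
        type: cmd
        cmd: "hey -z 1m -q 10 -c 2 http://checkout-canary.shop/"
```

Keeping the load test duration close to the analysis interval means each metric check sees fresh requests, which also addresses the no-traffic pitfall described earlier.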

Treat request-success-rate and request-duration as the floor, not the ceiling. Add custom metrics that match your domain, like checkout completion rate or job throughput.
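
A domain metric can be wired in the same way as a built-in one. The sketch below assumes the application exports two hypothetical counters, checkout_started_total and checkout_completed_total, and that Prometheus is reachable at the address shown. None of these names come from Flagger itself. The pod regex is meant to match canary pods while skipping pods of the -primary Deployment.

```yaml
# Sketch only: metric names, template name, and Prometheus address are hypothetical.
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: checkout-completion-rate
  namespace: shop
spec:
  provider:
    type: prometheus
    address: http://prometheus.monitoring:9090   # assumed address
  # Percentage of started checkouts that complete, on canary pods only.
  query: |
    sum(rate(checkout_completed_total{
      namespace="{{ namespace }}",
      pod=~"{{ target }}-[0-9a-zA-Z]+(-[0-9a-zA-Z]+)"
    }[{{ interval }}]))
    /
    sum(rate(checkout_started_total{
      namespace="{{ namespace }}",
      pod=~"{{ target }}-[0-9a-zA-Z]+(-[0-9a-zA-Z]+)"
    }[{{ interval }}]))
    * 100
---
# Fragment of spec.analysis in the Canary; the 90 percent floor is illustrative.
metrics:
  - name: checkout-completion-rate
    templateRef:
      name: checkout-completion-rate
      namespace: shop
    thresholdRange:
      min: 90
    interval: 1m
```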

Monitor Flagger itself. Its events and metrics show how often canaries fail, which is a useful signal about your release quality over time.
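
One way to act on that signal is a Prometheus alert on Flagger's own metrics. The rule below is a sketch: the group and alert names are hypothetical, and it assumes flagger_canary_status reports a value above 1 for a failed canary, which is worth verifying against the Flagger version you run.

```yaml
# Prometheus rule file sketch; group and alert names are hypothetical.
groups:
  - name: flagger
    rules:
      - alert: CanaryFailed
        # Assumes values above 1 mean the canary analysis failed.
        expr: flagger_canary_status > 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Canary {{ $labels.name }} in {{ $labels.namespace }} failed"
```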

Wrap-up

Flagger gives you progressive delivery without writing your own control loop. Define a Canary, point it at a Deployment, and let it shift traffic while watching real metrics. The patterns that matter are honest metrics, small steps, and meaningful traffic during analysis. Get those right and rollbacks become boring instead of frightening.