Canary Deployments with Flagger Tutorial
Learn how to ship canary releases on Kubernetes using Flagger. Covers the control loop, metric analysis, traffic shifting, and how to roll back automatically when a release misbehaves.
What you'll learn
- What a canary release actually is
- How Flagger orchestrates traffic shifting
- How to write a Canary resource end-to-end
- How metric analysis triggers rollback
- Pitfalls around metrics, thresholds, and traffic
Prerequisites
- Kubernetes basics and a service mesh or ingress controller installed
What and Why
A canary deployment routes a small slice of real traffic to a new version, watches its behavior, and only continues if it stays healthy. If error rates climb or latency spikes, the rollout pauses or reverses. The point is to find problems with one percent of users instead of one hundred percent.
Flagger is a Kubernetes operator that automates this loop. You write one Canary resource describing the target Deployment, the traffic provider, and the success metrics. Flagger handles the rest: it creates a primary copy of the Deployment (named with a -primary suffix) to serve stable traffic, uses the original Deployment as the canary, shifts traffic in steps, queries Prometheus, and either promotes or rolls back.
Mental Model
Flagger turns a normal Deployment into a pair: the stable primary and the new canary. Your Service points at the primary. When you push a new image, Flagger detects the spec change, scales up the canary with the new version, and tells the mesh to send a small percentage of traffic there. Every interval, it runs a query against your metrics provider. Each interval in which the checks pass shifts traffic one step further. Each failed check counts against a threshold; once that threshold is reached, traffic returns entirely to the primary and the canary is scaled down.
The key insight is that your application code does not change. The traffic split happens at the mesh or ingress layer.
Hands-on Example
Assume Linkerd is installed and Prometheus is scraping it. Define a Canary:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: checkout
  namespace: shop
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout
  service:
    port: 80
    targetPort: 8080
  analysis:
    interval: 30s
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange: { min: 99 }
        interval: 1m
      - name: request-duration
        thresholdRange: { max: 500 }
        interval: 1m
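When you run kubectl set image deployment/checkout checkout=checkout:v2 -n shop, Flagger picks up the change. Every 30 seconds it queries Prometheus, and after every successful check it raises the canary's share of traffic by 10 percentage points until it reaches 50 percent, then promotes. If checks fail 5 times (the threshold), it rolls back instead.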
Push new image
      |
      v
Flagger detects spec change
      |
      v
Scale up canary pods (v2)
      |
      v
For step in [10, 20, 30, 40, 50]:
    shift traffic -> mesh splits primary/canary
    wait interval
    query Prometheus
    if failed checks reach threshold -> rollback to 0%
    else continue
      |
      v
Promote: copy v2 spec to primary, drain canary
Common Pitfalls
Bad metrics are the most common failure. If your success-rate query covers all pods rather than just the canary pods, a broken canary looks healthy because the primary's traffic drowns out the signal. Use the metric templates Flagger ships, or write PromQL that filters by the canary’s pod label.
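One way to scope the success-rate check to the canary is a custom MetricTemplate. The sketch below assumes Linkerd's response_total metric and a Prometheus reachable at http://prometheus.linkerd-viz:9090; both the address and the template name canary-success-rate are assumptions, so adjust them to your cluster. Flagger substitutes {{ namespace }}, {{ target }}, and {{ interval }} when it runs the query. Because the primary runs as a separate Deployment with a -primary suffix, matching the deployment label on the target name should select only canary pods.

```yaml
# Sketch only: the template name and Prometheus address are assumptions.
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: canary-success-rate
  namespace: shop
spec:
  provider:
    type: prometheus
    address: http://prometheus.linkerd-viz:9090  # assumed address
  # Percentage of non-failed inbound responses for the canary Deployment only.
  query: |
    sum(
      rate(
        response_total{
          namespace="{{ namespace }}",
          deployment="{{ target }}",
          classification!="failure",
          direction="inbound"
        }[{{ interval }}]
      )
    )
    /
    sum(
      rate(
        response_total{
          namespace="{{ namespace }}",
          deployment="{{ target }}",
          direction="inbound"
        }[{{ interval }}]
      )
    )
    * 100
---
# Fragment: an entry under spec.analysis.metrics of the Canary.
- name: canary-success-rate
  templateRef:
    name: canary-success-rate
    namespace: shop
  thresholdRange:
    min: 99
  interval: 1m
```

Before trusting the check, run the rendered query by hand in Prometheus during a rollout and confirm that it only returns series for canary pods.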
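Too-aggressive steps are the second pitfall. A stepWeight of 50 defeats the point of a canary, because half of your users hit the new version on the first step. Many failures only show up once the canary carries a noticeable share of traffic, so small steps spread over several intervals give the analysis time to see them before exposure grows.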
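The third is no real traffic. A canary at three in the morning with almost no requests gives the analysis too little data to judge. Depending on the query, checks may fail for lack of values and stall the rollout, or pass on a handful of requests and promote a broken version. Either run synthetic load during analysis or only deploy in business hours.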
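If you go the synthetic-load route, a Flagger webhook can drive traffic at the canary while the analysis runs. The fragment below is a sketch: it assumes the flagger-loadtester service is deployed in the shop namespace and that the hey load generator bundled with it is available. The URL, request rate, and duration are placeholders.

```yaml
# Fragment of the Canary spec; assumes flagger-loadtester runs in "shop".
analysis:
  webhooks:
    - name: load-test
      type: rollout            # called during each analysis iteration
      url: http://flagger-loadtester.shop/
      timeout: 5s
      metadata:
        # 2 workers at 10 req/s each for 1 minute against the canary Service
        cmd: "hey -z 1m -q 10 -c 2 http://checkout-canary.shop/"
```

Synthetic requests only exercise the paths you script, so treat them as a floor for traffic volume rather than a replacement for real users.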
Practical Tips
Start with conservative settings: a small stepWeight, a low threshold (the number of failed checks tolerated before rollback), and a low maxWeight like 30. Relax them as you build confidence.
Add webhooks for pre-rollout and load-testing steps. Flagger can call out to a load generator or a smoke-test pod and only continue if it returns success.
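A pre-rollout gate might look like the following sketch. It again assumes flagger-loadtester is running in the shop namespace, and the /healthz path is a hypothetical endpoint standing in for whatever smoke test your service supports.

```yaml
# Fragment of the Canary spec; the /healthz path is a hypothetical endpoint.
analysis:
  webhooks:
    - name: smoke-test
      type: pre-rollout        # must succeed before any traffic is shifted
      url: http://flagger-loadtester.shop/
      timeout: 30s
      metadata:
        type: bash
        # -f makes curl exit non-zero on HTTP errors, which fails the hook
        cmd: "curl -sf http://checkout-canary.shop/healthz"
```

Keeping the smoke test separate from load generation means a canary that cannot even answer a health request never receives user traffic.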
Treat request-success-rate and request-duration as the floor, not the ceiling. Add custom metrics that match your domain, like checkout completion rate or job throughput.
Monitor Flagger itself. Its events and metrics show how often canaries fail, which is a useful signal about your release quality over time.
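As a starting point, a Prometheus alerting rule can flag rolled-back canaries. This is a sketch in Prometheus rule-file format. It assumes Flagger's own metrics are being scraped and that flagger_canary_status reports a value above 1 for a failed canary, so verify the metric semantics against your Flagger version before relying on it. The group and alert names are illustrative.

```yaml
# Prometheus alerting rule (sketch); group and alert names are illustrative.
groups:
  - name: flagger
    rules:
      - alert: CanaryRolledBack
        # Assumed semantics: 0 = running, 1 = successful, 2 = failed
        expr: flagger_canary_status{namespace="shop", name="checkout"} > 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Canary checkout in namespace shop was rolled back"
```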
Wrap-up
Flagger gives you progressive delivery without writing your own control loop. Define a Canary, point it at a Deployment, and let it shift traffic while watching real metrics. The patterns that matter are honest metrics, small steps, and meaningful traffic during analysis. Get those right and rollbacks become boring instead of frightening.