AWS Route 53 Routing Policies Explained

Intermediate 9 min read

What you'll learn

✓ What each Route 53 routing policy does
✓ When to use weighted vs latency vs failover
✓ How health checks influence DNS answers
✓ How to combine policies with alias records
✓ How TTL affects user experience

Prerequisites

• Familiarity with the shell
• Basic DNS knowledge

What and Why

Amazon Route 53 is a managed DNS service, but unlike a vanilla DNS server, it can decide which answer to return based on rules you configure. Those rules are called routing policies. Picking the right one is the difference between users hitting the nearest healthy region or being sent to a dark backup that timed out an hour ago.

There are eight policies in total: Simple, Weighted, Latency, Failover, Geolocation, Geoproximity, IP-based, and Multivalue Answer. Each solves a specific problem, from blue/green rollouts to compliance-driven geographic routing.

Mental Model

Think of Route 53 as a programmable answering machine. When a resolver asks for api.example.com, Route 53 looks at the record set, evaluates the policy, optionally checks health, and replies. The answer can change per request, per region, or per caller IP.


  [Resolver Query]
         |
         v
 +-----------------+
 | Route 53 Record |
 +-----------------+
         |
 evaluate policy
         |
 +-------+-------+
 |               |
[Health Check]  [Geo/Latency Map]
 |               |
 +-------+-------+
         |
         v
   [Best Answer IP]

Route 53 decision flow for a DNS query

Hands-on Example

Let us configure a weighted policy for a canary release. Suppose you run app.example.com and want 10 percent of traffic to hit a new ALB.

aws route53 change-resource-record-sets \
  --hosted-zone-id Z123 \
  --change-batch file://canary.json

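canary.json contains two records for the same name, distinguished by SetIdentifier and Weight. The HostedZoneId inside AliasTarget is the load balancer's own hosted zone ID, not the ID of your zone (Z123 in the command above stands in for that):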

{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "A",
        "SetIdentifier": "stable",
        "Weight": 90,
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "stable-alb.elb.amazonaws.com",
          "EvaluateTargetHealth": true
        }
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "A",
        "SetIdentifier": "canary",
        "Weight": 10,
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "canary-alb.elb.amazonaws.com",
          "EvaluateTargetHealth": true
        }
      }
    }
  ]
}
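If you adjust the split often, generating the change batch is less error-prone than editing it by hand. The following is a minimal sketch that rebuilds the canary.json structure above; the helper names `weighted_alias_change` and `canary_batch` are hypothetical, and the DNS names and zone ID are the placeholders from the example.

```python
import json

# ELB hosted zone ID copied from the canary.json example above. It identifies
# the load balancer's zone, not your own hosted zone.
ELB_ZONE_ID = "Z35SXDOTRQ7X7K"


def weighted_alias_change(name, set_id, weight, alb_dns, elb_zone_id=ELB_ZONE_ID):
    """Return one UPSERT change for a weighted alias A record."""
    # Route 53 accepts integer weights from 0 to 255.
    if not 0 <= weight <= 255:
        raise ValueError("weight must be between 0 and 255")
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": set_id,
            "Weight": weight,
            "AliasTarget": {
                "HostedZoneId": elb_zone_id,
                "DNSName": alb_dns,
                "EvaluateTargetHealth": True,
            },
        },
    }


def canary_batch(name, stable_dns, canary_dns, canary_weight):
    """Build a two-record batch; the weights are chosen to add up to 100."""
    return {
        "Changes": [
            weighted_alias_change(name, "stable", 100 - canary_weight, stable_dns),
            weighted_alias_change(name, "canary", canary_weight, canary_dns),
        ]
    }


batch = canary_batch(
    "app.example.com",
    "stable-alb.elb.amazonaws.com",
    "canary-alb.elb.amazonaws.com",
    canary_weight=10,
)
print(json.dumps(batch, indent=2))
```

Redirect the output to canary.json and pass it to the change-resource-record-sets command shown earlier.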


 app.example.com
      |
 +----+----+
 | weights |
 +----+----+
      |
 +----+----+
 | 90 / 10 |
 +-+----+-+
   |    |
stable canary
 ALB    ALB

Weighted routing splits traffic by ratios
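Each record's share of answers is its weight divided by the sum of all weights for that name. The sketch below is a local simulation of that rule, not a call to Route 53; it assumes every query is answered independently and ignores resolver caching, which skews real traffic further.

```python
import random
from collections import Counter

# Same 90/10 split as the canary example.
weights = {"stable": 90, "canary": 10}


def expected_share(weights):
    """Long-run fraction of answers per record: weight / sum of weights."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def simulate(weights, queries, seed=0):
    """Draw one weighted answer per query and count the results."""
    rng = random.Random(seed)
    names = list(weights)
    picks = rng.choices(names, weights=[weights[n] for n in names], k=queries)
    return Counter(picks)


print("expected:", expected_share(weights))
for n in (20, 10_000):
    counts = simulate(weights, n)
    print(f"{n} queries -> canary share {counts['canary'] / n:.3f}")
```

With only a handful of queries the observed canary share can land well away from 10 percent; it settles near the expected value as the number of queries grows.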

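For a failover policy, you mark one record PRIMARY and another SECONDARY. As long as the primary health check is passing, Route 53 returns the primary record. Once the check has failed enough consecutive times to be marked unhealthy, Route 53 starts answering with the secondary; clients that still hold a cached answer switch only after its TTL expires.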
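The selection rule for a failover pair can be modelled in a few lines. This is a sketch of the decision only, with a hypothetical `FailoverRecord` type and placeholder targets; the fallback to the primary when both records are unhealthy reflects my reading of the Route 53 documentation, so verify it before relying on it.

```python
from dataclasses import dataclass


@dataclass
class FailoverRecord:
    role: str      # "PRIMARY" or "SECONDARY"
    target: str    # placeholder endpoint name
    healthy: bool  # latest health check verdict


def failover_answer(primary, secondary):
    """Return the target a failover pair would answer with."""
    if primary.healthy:
        return primary.target
    if secondary.healthy:
        return secondary.target
    # Assumption: with nothing healthy, the primary is returned rather than
    # no answer at all.
    return primary.target


primary = FailoverRecord("PRIMARY", "primary.example.com", healthy=True)
secondary = FailoverRecord("SECONDARY", "standby.example.com", healthy=True)

print(failover_answer(primary, secondary))
primary.healthy = False
print(failover_answer(primary, secondary))
```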

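Latency routing answers with the record for the AWS Region that, according to latency measurements AWS collects over time, offers the lowest latency for the source of the query. That source is the resolver IP, or the client subnet when the resolver forwards it via EDNS Client Subnet. Geolocation maps the same source to a continent, country, or US state and returns the most specific matching record, with a default record as fallback for locations that match nothing else.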
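Both lookups can be sketched as plain functions. Everything here is illustrative: the record tables, the latency figures, and the function names are made up, and the real location and latency data live inside Route 53.

```python
def geolocation_answer(records, continent, country, subdivision=None):
    """Return the most specific match: state, then country, continent, default."""
    # Keys are tagged tuples because some two-letter codes are both a
    # continent and a country (NA is North America and Namibia).
    candidates = []
    if subdivision:
        candidates.append(("subdivision", country, subdivision))
    candidates += [("country", country), ("continent", continent), ("default",)]
    for key in candidates:
        if key in records:
            return records[key]
    return None  # no match and no default record: the query gets no answer


def latency_answer(latency_ms, healthy):
    """Return the healthy region with the lowest measured latency, if any."""
    usable = {region: ms for region, ms in latency_ms.items() if healthy.get(region)}
    return min(usable, key=usable.get) if usable else None


geo_records = {
    ("subdivision", "US", "CA"): "us-west.example.com",
    ("country", "US"): "us-east.example.com",
    ("continent", "EU"): "eu.example.com",
    ("default",): "global.example.com",
}

print(geolocation_answer(geo_records, "NA", "US", "CA"))
print(geolocation_answer(geo_records, "EU", "DE"))
print(geolocation_answer(geo_records, "AS", "JP"))

# Hypothetical latencies in milliseconds for a single resolver.
print(latency_answer(
    {"us-east-1": 80, "eu-west-1": 25},
    {"us-east-1": True, "eu-west-1": True},
))
```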

Common Pitfalls

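Leaving EvaluateTargetHealth set to false on alias records. Route 53 then treats the alias target as healthy unless the record has its own health check, so failover may never trigger.
Setting TTLs too high. A 1-hour TTL means clients may keep using a dead IP for up to an hour after failover.
Assuming geolocation and latency records can share a name. Records with the same name and type must all use the same routing policy; to combine policies, chain them through alias records, and remember that the top-level policy is evaluated first.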
Treating Route 53 weights as exact percentages. They are probabilistic over many queries, not per-request guarantees.
Using a single health check endpoint that returns 200 even when the database is down. Health checks must reflect real user health.

Practical Tips

Always pair failover routing with a health check against a deep endpoint that exercises real dependencies, not just / returning HTML. For canary releases, combine weighted routing with CloudWatch alarms so you can automatically reset weights if error rates climb.

Keep TTLs short (30 to 60 seconds) for records that participate in failover. For static records like www pointing at CloudFront, longer TTLs are fine.
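A rough budget shows why the TTL matters. The sketch below adds the time Route 53 needs to declare an endpoint unhealthy (request interval times failure threshold) to the TTL of an answer cached just before the switch. The defaults of a 30-second interval and a threshold of 3 are the standard health check settings as I understand them; the model ignores resolvers that disregard TTLs.

```python
def detection_seconds(request_interval=30, failure_threshold=3):
    """Approximate time for a health check to flip to unhealthy."""
    return request_interval * failure_threshold


def worst_case_stale_seconds(ttl, request_interval=30, failure_threshold=3):
    """Detection time plus a full TTL for an answer cached just before failover."""
    return detection_seconds(request_interval, failure_threshold) + ttl


for ttl in (60, 3600):
    print(f"TTL {ttl}s -> up to {worst_case_stale_seconds(ttl)}s on the dead endpoint")

# Fast health checks (10-second interval) shrink the detection part only.
print(worst_case_stale_seconds(60, request_interval=10))
```

Under these assumptions a 60-second TTL keeps the worst case to a few minutes, while a 1-hour TTL dominates everything else in the budget.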

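If you have strict data residency requirements, geolocation routing is the safer choice over geoproximity because it maps queries to explicit continents, countries, or US states, whereas geoproximity routes by distance and bias, so its boundaries can cross borders. Both depend on mapping the query's source IP to a location, which can be wrong.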

Multivalue answer routing is a budget alternative to an ELB for small services. It returns up to eight healthy records and lets the client retry, giving you basic load distribution without a load balancer.
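The behaviour described above can be approximated locally. This is a sketch, not Route 53's implementation: `multivalue_answer` is a hypothetical name, the addresses come from the 192.0.2.0/24 documentation range, and the fallback when nothing is healthy is an assumption based on my reading of the documentation.

```python
import random

MAX_ANSWERS = 8  # the limit on records per response mentioned above


def multivalue_answer(records, rng=random):
    """Return up to eight healthy IPs, picked at random.

    `records` maps an IP address to its latest health status.
    """
    healthy = [ip for ip, ok in records.items() if ok]
    # Assumption: if every record is unhealthy, answer from the full set
    # instead of returning nothing.
    pool = healthy or list(records)
    return rng.sample(pool, min(len(pool), MAX_ANSWERS))


# Ten records, two of them failing their health checks.
records = {f"192.0.2.{i}": True for i in range(1, 11)}
records["192.0.2.3"] = False
records["192.0.2.7"] = False

answer = multivalue_answer(records, random.Random(42))
print(len(answer), answer)
```

The client receives several addresses and can move on to the next one if a connection fails, which is where the basic load distribution comes from.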

Wrap-up

Route 53 routing policies turn DNS from a static lookup into a dynamic traffic controller. Simple covers single endpoints, Weighted enables canaries, Latency optimizes performance, Failover handles outages, and Geolocation handles compliance. The key habits are aggressive health checks, low TTLs, and pairing alias records with EvaluateTargetHealth. Master these and you can build globally resilient systems with nothing but DNS configuration.