AWS VPC, Subnets, and Routing Explained

A grounded explanation of how AWS VPC components fit together: subnets, route tables, internet gateways, NAT, and the rules that determine where a packet actually goes.

By Codeloom · Intermediate

What you'll learn

  • What a VPC actually is at the packet level
  • Public vs private subnets and why the distinction matters
  • How route tables, IGWs, and NAT gateways work
  • Security groups vs NACLs
  • A reference Terraform layout you can copy

Prerequisites

  • Some prior AWS exposure and IP/CIDR familiarity

What and Why

A VPC (Virtual Private Cloud) is your own software-defined network inside AWS. Every EC2 instance, Lambda ENI, and RDS database lives in a VPC. Get the VPC wrong and you fight cryptic timeouts for months. Get it right and you stop thinking about networking entirely.

The good news: there are only about six concepts. The trick is understanding how they compose.

Mental Model

A VPC is a private IP range (a CIDR block) carved into subnets. Each subnet lives in one availability zone. Subnets are made public or private by their route table:

  • A subnet whose route table has 0.0.0.0/0 -> Internet Gateway is public.
  • A subnet whose route table has 0.0.0.0/0 -> NAT Gateway is private with outbound internet.
  • A subnet with no default route is fully isolated.
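
The third case is easy to overlook because it is defined by what is missing. A minimal sketch, assuming a VPC resource named aws_vpc.main like the one in the Hands-on Example; the data_a and isolated names and the 10.0.21.0/24 range are illustrative assumptions, not part of the reference layout:

```hcl
# Hypothetical isolated subnet: names and CIDR are assumptions for illustration.
resource "aws_subnet" "data_a" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.21.0/24"
  availability_zone = "us-east-1a"
}

# No route blocks: only the implicit "local" route for the VPC CIDR exists,
# so traffic can reach other subnets in the VPC but has no path to the internet.
resource "aws_route_table" "isolated" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route_table_association" "data_a" {
  subnet_id      = aws_subnet.data_a.id
  route_table_id = aws_route_table.isolated.id
}
```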

Security is enforced at two layers: security groups (stateful, attached to ENIs) and network ACLs (stateless, attached to subnets).
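
Because security groups are stateful, you only describe who may open a connection; return traffic is allowed automatically. A rough sketch of a load balancer and app pair, assuming aws_vpc.main from the Hands-on Example. The group names and port 8080 are hypothetical:

```hcl
# Hypothetical security groups; port 8080 for the app is an assumption.
resource "aws_security_group" "alb" {
  name   = "alb"
  vpc_id = aws_vpc.main.id

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "Any outbound"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = aws_vpc.main.id

  # Only members of the ALB's security group may connect; no CIDR is opened.
  ingress {
    description     = "App traffic from the ALB"
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    description = "Any outbound"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Referencing another security group instead of a CIDR keeps the rule correct when instances are replaced and their IPs change.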

             Internet
                 |
                 v
                IGW
                 |
 +---------------+----------------+
 | route: 0.0.0.0/0 -> IGW        |
 | Public subnet 10.0.1.0/24      |
 |   [ALB] [NAT GW]               |
 +---------------+----------------+
                 |
 +---------------v----------------+
 | route: 0.0.0.0/0 -> NAT        |
 | Private subnet 10.0.11.0/24    |
 |   [App EC2] [RDS]              |
 +--------------------------------+

VPC routing layout

Hands-on Example

A minimal Terraform layout:

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
}

resource "aws_subnet" "public_a" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "private_a" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.11.0/24"
  availability_zone = "us-east-1a"
}

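# Elastic IP that gives the NAT gateway its fixed public address.
resource "aws_eip" "nat" {
  domain = "vpc"
}

# The NAT gateway sits in the public subnet so its traffic can exit via the IGW.
resource "aws_nat_gateway" "nat_a" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public_a.id

  # Create the IGW first; the NAT gateway needs it to reach the internet.
  depends_on = [aws_internet_gateway.igw]
}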

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat_a.id
  }
}

resource "aws_route_table_association" "pub_a" {
  subnet_id      = aws_subnet.public_a.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "priv_a" {
  subnet_id      = aws_subnet.private_a.id
  route_table_id = aws_route_table.private.id
}

Run an Application Load Balancer in the public subnet and your EC2 / ECS tasks in the private subnet. Outbound calls (pulling from npm, calling third-party APIs) flow through the NAT.

In production, replicate this across at least two AZs (for example us-east-1a and us-east-1b) for fault tolerance, with a public and a private subnet in each.

Common Pitfalls

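Putting databases in public subnets. Even with a tight security group, a database in a public subnet can be given a public IP and is then routable from the internet at layer 3, so a single security group mistake exposes it. Use private subnets for everything that does not need direct inbound traffic from the internet.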

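One NAT gateway for everything. A NAT gateway lives in a single AZ, so one in us-east-1a is a single point of failure for outbound traffic from every AZ that routes through it. Run one per AZ and give each AZ's private subnets their own route table pointing at the local NAT. (Note: NAT gateways are not cheap, about $0.045/hr plus per-GB data processing, so single-AZ NAT is a common cost compromise for non-prod.)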
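
A sketch of what "one per AZ" could look like as a second-AZ copy of the hands-on layout. The _b resource names and the 10.0.2.0/24 and 10.0.12.0/24 ranges are assumptions chosen to mirror the us-east-1a resources:

```hcl
# Hypothetical second-AZ copy of the layout; names and CIDRs are assumptions.
resource "aws_subnet" "public_b" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.2.0/24"
  availability_zone       = "us-east-1b"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "private_b" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.12.0/24"
  availability_zone = "us-east-1b"
}

resource "aws_eip" "nat_b" {
  domain = "vpc"
}

resource "aws_nat_gateway" "nat_b" {
  allocation_id = aws_eip.nat_b.id
  subnet_id     = aws_subnet.public_b.id
}

# A separate private route table per AZ, so us-east-1b never depends
# on the NAT gateway in us-east-1a.
resource "aws_route_table" "private_b" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat_b.id
  }
}

# The public route table can be shared: the IGW is not tied to an AZ.
resource "aws_route_table_association" "pub_b" {
  subnet_id      = aws_subnet.public_b.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "priv_b" {
  subnet_id      = aws_subnet.private_b.id
  route_table_id = aws_route_table.private_b.id
}
```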

Overlapping CIDRs. Future you wants to VPC-peer with another account. If both use 10.0.0.0/16, peering is impossible. Pick a unique range up front.
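
One way to keep ranges unique is to allocate every VPC's block from a single place. The sketch below assumes a hypothetical organisation-wide 10.0.0.0/8 supernet and made-up environment names; cidrsubnet is a built-in Terraform function:

```hcl
locals {
  # Hypothetical organisation-wide supernet; each VPC gets its own /16 slice.
  org_cidr = "10.0.0.0/8"

  # cidrsubnet(prefix, newbits, netnum): 8 extra bits on a /8 yields /16 blocks.
  vpc_cidrs = {
    prod    = cidrsubnet(local.org_cidr, 8, 0) # "10.0.0.0/16"
    staging = cidrsubnet(local.org_cidr, 8, 1) # "10.1.0.0/16"
    dev     = cidrsubnet(local.org_cidr, 8, 2) # "10.2.0.0/16"
  }
}
```

Because each netnum is used once, no two VPCs built from this map can overlap.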

NACLs as firewalls. NACLs are stateless. If you allow inbound 443, you must also allow outbound ephemeral ports 1024-65535. Most teams should leave NACLs at default-allow and enforce policy through security groups.
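
To make the stateless pairing concrete, here is a sketch of a custom NACL with the inbound rule and its matching outbound rule. It assumes aws_vpc.main and aws_subnet.public_a from the hands-on layout; the NACL name and rule numbers are arbitrary:

```hcl
# Hypothetical custom NACL for the public subnet; rule numbers are arbitrary.
resource "aws_network_acl" "public" {
  vpc_id     = aws_vpc.main.id
  subnet_ids = [aws_subnet.public_a.id]
}

# Inbound HTTPS from anywhere.
resource "aws_network_acl_rule" "https_in" {
  network_acl_id = aws_network_acl.public.id
  rule_number    = 100
  egress         = false
  protocol       = "tcp"
  rule_action    = "allow"
  cidr_block     = "0.0.0.0/0"
  from_port      = 443
  to_port        = 443
}

# Stateless: replies to clients leave on ephemeral ports and need their own rule.
resource "aws_network_acl_rule" "ephemeral_out" {
  network_acl_id = aws_network_acl.public.id
  rule_number    = 100
  egress         = true
  protocol       = "tcp"
  rule_action    = "allow"
  cidr_block     = "0.0.0.0/0"
  from_port      = 1024
  to_port        = 65535
}
```

A real subnet needs more than this pair (outbound connections and their inbound replies, for instance); the sketch only shows why every allow rule tends to need a counterpart.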

Forgetting endpoints. Hitting S3 from a private subnet over the NAT incurs NAT data processing fees. A gateway VPC endpoint for S3 has no hourly or per-GB charge and keeps the traffic on the AWS network. (Interface endpoints, the PrivateLink kind, are billed.)
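
A minimal sketch of an S3 gateway endpoint attached to the private route table from the hands-on layout. The region in service_name is assumed to be us-east-1, matching the AZs used earlier:

```hcl
# Gateway endpoint: adds an S3 prefix-list route to the listed route tables,
# so S3 traffic from those subnets bypasses the NAT gateway.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.us-east-1.s3" # region is an assumption
  vpc_endpoint_type = "Gateway"
  route_table_ids   = [aws_route_table.private.id]
}
```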

Practical Tips

Tag every subnet with tier = public|private|data. It makes Terraform queries and audits easy.
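
As an illustration of such a query, a separate Terraform configuration could look up subnets by tag. This sketch assumes the subnets were created with tags = { tier = "private" } and that the VPC ID is passed in through a hypothetical vpc_id variable:

```hcl
# Hypothetical input: the ID of the VPC whose subnets we want to find.
variable "vpc_id" {
  type = string
}

# Look up every subnet in that VPC tagged tier = private.
data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [var.vpc_id]
  }

  tags = {
    tier = "private"
  }
}

output "private_subnet_ids" {
  value = data.aws_subnets.private.ids
}
```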

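Reserve a /16 per VPC and a /24 per subnet by default. A /24 has 256 addresses, of which 251 are usable because AWS reserves five in every subnet. That is enough for most services, and a /16 holds 256 such blocks, so you will hit the default quota of 200 subnets per VPC before you run out of address space.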
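
The /16 to /24 arithmetic can be left to Terraform's built-in cidrsubnet function rather than typed by hand. A small sketch; the local names are illustrative, and the netnum values 1 and 11 reproduce the ranges used in the hands-on layout:

```hcl
locals {
  vpc_cidr = "10.0.0.0/16"

  # cidrsubnet(prefix, newbits, netnum): adding 8 bits to a /16 yields /24 blocks.
  public_a_cidr  = cidrsubnet(local.vpc_cidr, 8, 1)  # "10.0.1.0/24"
  private_a_cidr = cidrsubnet(local.vpc_cidr, 8, 11) # "10.0.11.0/24"
}
```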

Enable VPC Flow Logs to S3 or CloudWatch. The first time you debug a “connection times out but security groups look right” issue, flow logs will pay for themselves.
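
A minimal sketch of flow logs delivered to S3, assuming aws_vpc.main from the hands-on layout. The bucket name is hypothetical and would need to be globally unique:

```hcl
# Hypothetical bucket name; S3 bucket names must be globally unique.
resource "aws_s3_bucket" "flow_logs" {
  bucket = "example-vpc-flow-logs"
}

resource "aws_flow_log" "vpc" {
  vpc_id               = aws_vpc.main.id
  traffic_type         = "ALL" # ACCEPT, REJECT, or ALL
  log_destination_type = "s3"
  log_destination      = aws_s3_bucket.flow_logs.arn
}
```

Setting traffic_type to "REJECT" instead keeps volume down if you only care about dropped connections.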

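For service-to-service traffic between VPCs, prefer PrivateLink or Transit Gateway over peering. Peering is not transitive, so fully connecting n VPCs takes n(n-1)/2 peerings, each with its own route table entries. Both alternatives scale much better than that mesh.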

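Run aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=<subnet-id> to see exactly which routes a subnet uses. If nothing comes back, the subnet has no explicit association and falls back to the VPC's main route table. The console UI hides the join; the CLI shows the truth.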

Wrap-up

A VPC is a CIDR plus subnets plus route tables plus gateways. Public subnets route to an IGW, private subnets route to a NAT, and security groups enforce policy at the ENI level. Replicate across AZs for resilience, plan CIDRs to avoid future overlap, and turn on flow logs early. Once you have built this layout with Terraform, every future AWS networking question reduces to “which route table and which security group?”, which is exactly the mental model AWS wants you to have.