Kubernetes Networking: Services, kube-proxy, and CNI Plugins

Intermediate 10 min read

What you'll learn

✓ The pod network model
✓ What a CNI plugin does
✓ Service types and kube-proxy modes
✓ DNS-based service discovery
✓ Network policy basics

Prerequisites

• Familiar with terminals and YAML

What and Why

Kubernetes networking is famously confusing. Pods get IPs. Services get other IPs. There is a DNS server. There is kube-proxy. There is a CNI plugin. Once you see how the pieces compose, the model is simple and elegant; before that, every connection issue feels like sorcery.

You need this map because every real outage eventually walks through the network. DNS resolves the wrong thing, a Service has no endpoints, a network policy quietly drops packets. Knowing where to look saves hours.

Mental Model

Kubernetes mandates four rules: every pod gets its own IP, all pods can reach each other without NAT, agents on a node (such as the kubelet) can reach the pods on that node, and a pod sees the same IP for itself that others see. A CNI plugin (Calico, Cilium, Flannel, AWS VPC CNI) is what makes those rules true in your specific environment.

Client -> Service VIP (10.96.0.42)
          |
 kube-proxy / kernel (iptables, ipvs, or eBPF)
          |
      +--------+--------+
      v        v        v
    Pod A    Pod B    Pod C    (endpoints)
      |        |        |
      +--- CNI plugin gives each a routable IP ---+

Packets from client to pod

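A Service is a stable virtual IP (the ClusterIP) plus a label selector. The control plane watches for ready pods that match the selector and keeps the Service's endpoint list (Endpoints and EndpointSlice objects) up to date. kube-proxy programs the kernel so packets to the VIP get DNAT'd to one of those endpoints; in iptables mode the endpoint is picked at random. The CNI plugin handles the underlying pod-to-pod connectivity.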

Hands-on Example

Deploy a simple workload and expose it.

apiVersion: apps/v1
kind: Deployment
metadata: { name: echo }
spec:
  replicas: 3
  selector: { matchLabels: { app: echo } }
  template:
    metadata: { labels: { app: echo } }
    spec:
      containers:
        - name: echo
          image: ealen/echo-server:0.7.0
          ports: [{ containerPort: 80 }]
---
apiVersion: v1
kind: Service
metadata: { name: echo }
spec:
  selector: { app: echo }
  ports: [{ port: 80, targetPort: 80 }]

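From any pod in the cluster, curl http://echo.default.svc.cluster.local works thanks to CoreDNS (this assumes the manifests were applied in the default namespace and the cluster uses the default cluster.local domain). Inspect what kube-proxy programmed: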

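# List the pod IPs currently backing the Service
kubectl get endpoints echo
# Run on a node, with kube-proxy in iptables mode: show the NAT rules for the Service
sudo iptables -t nat -L KUBE-SERVICES -n | grep echo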
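
For orientation, the endpoint list behind the Service is itself an API object. A rough sketch of an EndpointSlice for echo might look like the following. The object name, pod name, node name, and IP address are made up for illustration; a real cluster generates its own values.

```yaml
# Illustrative only: roughly the shape returned by
#   kubectl get endpointslices -l kubernetes.io/service-name=echo -o yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: echo-abc12                      # hypothetical generated name
  namespace: default
  labels:
    kubernetes.io/service-name: echo    # ties the slice to the Service
addressType: IPv4
ports:
  - protocol: TCP
    port: 80                            # the Service's targetPort
endpoints:
  - addresses: ["10.244.1.17"]          # hypothetical pod IP assigned by the CNI
    conditions: { ready: true }         # kube-proxy normally skips endpoints that are not ready
    nodeName: node-a                    # hypothetical node
    targetRef: { kind: Pod, name: echo-7d9c-x2k4p, namespace: default }  # hypothetical pod
```

If the endpoints list comes back empty, comparing the pod labels with the Service selector is usually the quickest next step.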

To expose externally on a cloud, change the type:

spec:
  type: LoadBalancer    # typical path: cloud LB -> NodePort -> kube-proxy DNAT -> pod
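
On clusters without a cloud load balancer, a NodePort Service is the usual alternative, and it is also the layer a LoadBalancer typically builds on. A minimal sketch, assuming the echo Deployment above; the Service name echo-nodeport and port 30080 are arbitrary choices for illustration (the default NodePort range is 30000-32767).

```yaml
apiVersion: v1
kind: Service
metadata: { name: echo-nodeport }   # hypothetical name
spec:
  type: NodePort
  selector: { app: echo }
  ports:
    - port: 80          # ClusterIP port inside the cluster
      targetPort: 80    # container port on the pods
      nodePort: 30080   # opened on every node; omit to let Kubernetes pick one
```

With this applied, a request to port 30080 on any node's IP should reach one of the echo pods, provided the node's firewall allows that port.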

Lock down traffic with a NetworkPolicy (enforcement requires a CNI that supports it, such as Calico or Cilium):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: echo-from-web }
spec:
  podSelector: { matchLabels: { app: echo } }
  policyTypes: [Ingress]
  ingress:
    - from: [{ podSelector: { matchLabels: { app: web } } }]
      ports: [{ protocol: TCP, port: 80 }]

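Now the only ingress allowed to the echo pods is TCP port 80 from pods labeled app=web in the same namespace; all other inbound traffic to them is dropped.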
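
A bare podSelector in a from clause only matches pods in the policy's own namespace. If the web pods live in a different namespace, one way to express that is to pair it with a namespaceSelector, as in this sketch. The namespace name frontend and the policy name are assumptions for illustration; the kubernetes.io/metadata.name label is set automatically on namespaces in current Kubernetes versions.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: echo-from-frontend-web }   # hypothetical name
spec:
  podSelector: { matchLabels: { app: echo } }
  policyTypes: [Ingress]
  ingress:
    - from:
        # One list element carrying both selectors means AND:
        # pods labeled app=web that live in the frontend namespace.
        - namespaceSelector:
            matchLabels: { kubernetes.io/metadata.name: frontend }
          podSelector:
            matchLabels: { app: web }
      ports: [{ protocol: TCP, port: 80 }]
```

Splitting the two selectors into separate list elements would change the meaning to OR (any pod in frontend, or any app=web pod in the local namespace), which is a common source of overly permissive policies.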

Common Pitfalls

Service with no endpoints. Labels on pods do not match the selector, or pods are not ready. kubectl get endpoints is the first check.
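DNS NXDOMAIN spam. With ndots:5 in /etc/resolv.conf (the default for pods), any name with fewer than five dots is tried against each search domain before being tried as-is, so a lookup for an external name like api.example.com produces several NXDOMAIN answers first. Add a trailing dot or set dnsConfig.options to lower ndots.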
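iptables scale cliff. kube-proxy in iptables mode evaluates Service rules sequentially, so it degrades as the Service count grows into the thousands (around 5,000 is a commonly cited figure). Switch to ipvs or the newer nftables mode, or use Cilium's eBPF kube-proxy replacement.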
Cloud CNI IP exhaustion. AWS VPC CNI assigns every pod a real VPC IP address from the node's ENIs. Subnets can run out fast. Plan CIDRs accordingly or use prefix delegation.
Default-allow networking. Without a default-deny NetworkPolicy, pods accept traffic from anywhere, so any compromised pod can reach anything. Apply a baseline deny policy per namespace.
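
Two of the fixes above can be sketched as manifests. This is a minimal example, not a recommended configuration: the Pod name dns-tuned and the ndots value of 2 are assumptions chosen for illustration, and the policy is shown for the default namespace only.

```yaml
# 1) Lower ndots for a workload that mostly calls external names.
apiVersion: v1
kind: Pod
metadata: { name: dns-tuned }        # hypothetical name
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"                   # must be a string; 2 is an example value
  containers:
    - name: echo
      image: ealen/echo-server:0.7.0
---
# 2) Baseline deny: selects every pod in the namespace and allows no ingress.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: default-deny-ingress, namespace: default }
spec:
  podSelector: {}                    # empty selector matches all pods in the namespace
  policyTypes: [Ingress]             # no ingress rules listed, so nothing is allowed in
```

More specific policies, such as echo-from-web above, then add back only the traffic you want. Adding Egress to policyTypes in the baseline also blocks DNS lookups unless a rule explicitly allows them.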

Production Tips

Pick a CNI deliberately: Cilium for observability and policy via eBPF, Calico for mature policy and BGP options, AWS VPC CNI for native AWS integration.
Use Topology Aware Routing so traffic prefers same-zone endpoints. Saves cross-AZ bandwidth.
Run CoreDNS with autoscaling and a node-local DNS cache. DNS is on the critical path for every service call.
Adopt a Gateway API based ingress for advanced L7 routing, TLS, and traffic shifting. Plain Ingress still works, but its API is frozen and new features land in Gateway API.
Use kubectl trace or hubble observe (the CLI for Cilium's Hubble) to debug drops. Eyeballing iptables rules works once; tooling scales.
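
As a rough illustration of two of the tips above, the following sketch opts the echo Service into Topology Aware Routing and splits traffic with a Gateway API HTTPRoute. Several things here are assumptions: the annotation name shown applies to Kubernetes 1.27 and later (newer releases also offer a spec.trafficDistribution field), the Gateway API CRDs and a controller must already be installed, and example-gateway, echo.example.com, and echo-canary are hypothetical names.

```yaml
# Topology Aware Routing: opt the Service in via annotation.
apiVersion: v1
kind: Service
metadata:
  name: echo
  annotations:
    service.kubernetes.io/topology-mode: Auto   # prefer same-zone endpoints when safe to do so
spec:
  selector: { app: echo }
  ports: [{ port: 80, targetPort: 80 }]
---
# Gateway API traffic shifting: send about 10% of requests to a canary Service.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata: { name: echo-route }                  # hypothetical name
spec:
  parentRefs:
    - name: example-gateway                     # hypothetical Gateway in the same namespace
  hostnames: ["echo.example.com"]               # hypothetical hostname
  rules:
    - backendRefs:
        - { name: echo, port: 80, weight: 90 }
        - { name: echo-canary, port: 80, weight: 10 }   # hypothetical canary Service
```

The weights are relative, so shifting more traffic to the canary only means editing the two numbers and re-applying the route.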

Wrap-up

Kubernetes networking has a small set of moving parts: the CNI plugin owns pod connectivity, the Service plus kube-proxy owns load-balanced virtual IPs, and CoreDNS owns discovery. Once you can name those layers, every network issue becomes “which layer dropped my packet?” Add NetworkPolicy and a thoughtful CNI choice and the platform behaves like a real network instead of a black box.