System Design: Design a News Feed (Twitter/Facebook)

Intermediate 12 min read

What you'll learn

✓ Pick fan-out on write vs fan-out on read
✓ Handle celebrity users without melting the cluster
✓ Store and serve timelines from cache
✓ Rank posts beyond reverse chronological order
✓ Defend choices in a follow-up question round

Prerequisites

• Familiarity with key-value stores and Redis.
• Comfort with HTTP APIs. See [What is REST](/blog/what-is-rest).

A feed is the home screen for Twitter, Facebook, LinkedIn, and Instagram. Each user sees a personalized stream assembled from the people and topics they follow. The whole product lives or dies on how fast that stream loads, so feed design is mostly an exercise in pre-computation.

Functional Requirements

Post text, images, or video.
Follow and unfollow other users.
Fetch a personalized timeline, paginated, newest or ranked first.
Like, comment, repost.
Notifications when followed users post.

Non-Functional Requirements

500M daily active users, 1B posts per day, 50B feed reads per day.
Read QPS: about 600k average, 2M peak.
Write QPS: about 12k average.
Timeline read latency: p99 under 200 ms.
Eventual consistency is acceptable — a few seconds of delay before a follower sees a post is fine.
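
The QPS figures above follow from simple division of the daily volumes. A quick back-of-envelope check, using only the numbers stated in the requirements (the variable names are illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_users = 500_000_000      # 500M daily active users
daily_posts = 1_000_000_000    # 1B posts per day
daily_reads = 50_000_000_000   # 50B feed reads per day

avg_read_qps = daily_reads / SECONDS_PER_DAY
avg_write_qps = daily_posts / SECONDS_PER_DAY
reads_per_user = daily_reads / daily_users
read_write_ratio = daily_reads / daily_posts

print(f"average read QPS:  {avg_read_qps:,.0f}")
print(f"average write QPS: {avg_write_qps:,.0f}")
print(f"feed reads per user per day: {reads_per_user:.0f}")
print(f"read-to-write ratio: {read_write_ratio:.0f}:1")
```

This gives roughly 579k reads and 11.6k writes per second, which round to the 600k and 12k quoted above, and a 50:1 read-to-write ratio before any fan-out amplification on the write side.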

High-Level Architecture

Post service writes new posts to a posts store and emits an event.
Fan-out service consumes the event and pushes the new post into each follower’s timeline cache.
Timeline service serves the user’s pre-computed timeline from cache, hydrates post content from the posts store, and applies ranking.
Follower graph service answers “who follows X” and “who does X follow.”
Media service handles images and video uploads, stores them in object storage, serves via CDN.
Ranking service scores candidate posts using engagement signals.

Data Model

CREATE TABLE posts (
  post_id     BIGINT PRIMARY KEY,
  author_id   BIGINT NOT NULL,
  body        TEXT,
  media_url   TEXT,
  created_at  TIMESTAMP NOT NULL
);

CREATE TABLE follows (
  follower_id BIGINT,
  followee_id BIGINT,
  created_at  TIMESTAMP,
  PRIMARY KEY (follower_id, followee_id)
);

Timeline in Redis as a sorted set per user:

key:   timeline:{user_id}
score: created_at (or rank score)
member: post_id
cap:   most recent ~800 entries
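
To make the capped sorted set concrete, here is a minimal in-memory sketch in Python. It is a stand-in for Redis, not a client for it: the `Timeline` class and its method names are hypothetical, and the comments name the sorted-set commands (`ZADD`, `ZREMRANGEBYRANK`, `ZREVRANGE`) that a real implementation would most likely use.

```python
import bisect

TIMELINE_CAP = 800  # cap from the layout above


class Timeline:
    """In-memory stand-in for one user's capped sorted set (hypothetical helper)."""

    def __init__(self, cap=TIMELINE_CAP):
        self.cap = cap
        self._entries = []  # (score, post_id) tuples in ascending score order

    def add(self, post_id, score):
        # Redis equivalent: ZADD timeline:{user_id} <score> <post_id>
        if any(pid == post_id for _, pid in self._entries):
            return  # simplification: ignore repeats (ZADD would update the score)
        bisect.insort(self._entries, (score, post_id))
        # Redis equivalent: ZREMRANGEBYRANK timeline:{user_id} 0 -(cap + 1)
        overflow = len(self._entries) - self.cap
        if overflow > 0:
            del self._entries[:overflow]  # drop the oldest entries

    def newest(self, limit=20):
        # Redis equivalent: ZREVRANGE timeline:{user_id} 0 <limit - 1>
        if limit <= 0:
            return []
        return [pid for _, pid in reversed(self._entries[-limit:])]


tl = Timeline(cap=3)
for post_id, created_at in [(101, 1.0), (102, 2.0), (103, 3.0), (104, 4.0)]:
    tl.add(post_id, created_at)
print(tl.newest(limit=2))  # [104, 103]; post 101 was trimmed by the cap
```

Trimming on every write keeps memory per user bounded, at the cost of readers who scroll past the cap needing a fallback path to older posts.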

The posts table is sharded by post_id (or author_id). The follows table is sharded by follower_id for “who do I follow” reads.

Key APIs

POST /api/v1/posts
  body: text, media_id (optional)
  returns: post_id

GET /api/v1/feed?cursor=<opaque>&limit=20
  returns: posts[], next_cursor

POST /api/v1/follow
  body: target_user_id

GET /api/v1/users/:id/posts
  returns: author_timeline

Use cursor pagination, not offset — see REST API Design Best Practices for why offset breaks under writes.
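
As an illustration of the cursor approach, the sketch below encodes the last item a client saw as an opaque token and resumes strictly after it. The cursor format (base64-encoded JSON holding a score and a post id) and the function names are assumptions for this example, not something the API above prescribes.

```python
import base64
import json


def encode_cursor(score, post_id):
    """Pack the last-seen (score, post_id) into an opaque, URL-safe token."""
    raw = json.dumps({"s": score, "p": post_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor):
    data = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return data["s"], data["p"]


def get_feed_page(timeline, cursor=None, limit=20):
    """timeline: list of (score, post_id) tuples sorted newest first."""
    if cursor is not None:
        last_seen = decode_cursor(cursor)
        # Keep only entries strictly older than the last one the client saw.
        timeline = [entry for entry in timeline if entry < last_seen]
    page = timeline[:limit]
    has_more = len(timeline) > limit
    next_cursor = encode_cursor(*page[-1]) if has_more else None
    return [post_id for _, post_id in page], next_cursor


feed = [(50, 5), (40, 4), (30, 3), (20, 2), (10, 1)]
page1, cursor = get_feed_page(feed, limit=2)
feed.insert(0, (60, 6))  # a new post arrives between the two requests
page2, _ = get_feed_page(feed, cursor, limit=2)
print(page1, page2)  # [5, 4] [3, 2]
```

With `offset=2` the second request would have returned posts 4 and 3, repeating post 4, because the new post shifted every position by one. The cursor is anchored to an item rather than a position, so inserts at the head do not disturb it.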

Fan-out on Write vs Fan-out on Read

Fan-out on write (push). When a user posts, write the post id into every follower’s timeline cache. Reads are cheap: one range query on the reader’s own sorted set, no matter how many accounts they follow. Writes are O(followers). For a user with 100 followers this is fine. For Cristiano Ronaldo with 600M followers, a single post means hundreds of millions of cache writes.

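Fan-out on read (pull). Store posts by author. At read time, find the authors a user follows and merge their recent posts. Writes are cheap: one insert, regardless of follower count, which is why pull suits accounts with huge followings. Reads are expensive: every feed load becomes a scatter-gather across everyone the reader follows, which is too slow to run for every user at 600k reads per second.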

Hybrid. Push for normal users. For celebrities (over some threshold like 100k followers), do not push — instead, at read time, merge their recent posts into the pre-computed timeline. This is the model that real systems converge on.
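
The hybrid model can be sketched in a few lines of Python. This is an in-memory illustration under simplifying assumptions: plain dictionaries stand in for the follower graph, the timeline cache, and the per-author post store, and the `HybridFeed` class and its methods are hypothetical names. The threshold is a constructor argument so the example can run with a tiny value.

```python
import heapq
from collections import defaultdict


class HybridFeed:
    def __init__(self, celebrity_threshold=100_000):
        self.threshold = celebrity_threshold
        self.followers = defaultdict(set)         # author_id -> follower ids
        self.following = defaultdict(set)         # user_id -> followed author ids
        self.timelines = defaultdict(list)        # user_id -> pushed (created_at, post_id)
        self.posts_by_author = defaultdict(list)  # author_id -> (created_at, post_id)

    def follow(self, follower_id, author_id):
        self.followers[author_id].add(follower_id)
        self.following[follower_id].add(author_id)

    def is_celebrity(self, author_id):
        return len(self.followers[author_id]) > self.threshold

    def publish(self, author_id, post_id, created_at):
        """Store the post; push to followers unless the author is a celebrity.

        Returns the number of timeline writes performed.
        """
        entry = (created_at, post_id)
        self.posts_by_author[author_id].append(entry)
        if self.is_celebrity(author_id):
            return 0  # no push: followers pull this post at read time
        for follower_id in self.followers[author_id]:
            self.timelines[follower_id].append(entry)
        return len(self.followers[author_id])

    def read(self, user_id, limit=20):
        """Merge the pushed timeline with recent posts pulled from celebrities."""
        candidates = set(self.timelines[user_id])  # set() removes duplicates
        for author_id in self.following[user_id]:
            if self.is_celebrity(author_id):
                candidates.update(self.posts_by_author[author_id][-limit:])
        newest = heapq.nlargest(limit, candidates)
        return [post_id for _, post_id in newest]


feed = HybridFeed(celebrity_threshold=2)
for fan in ("ann", "bob", "cy"):
    feed.follow(fan, "star")   # 3 followers > 2, so "star" counts as a celebrity
feed.follow("ann", "dee")      # "dee" has 1 follower: a normal user
feed.publish("dee", post_id=1, created_at=100)   # pushed into ann's timeline
feed.publish("star", post_id=2, created_at=200)  # stored only, not pushed
print(feed.read("ann"))  # [2, 1]
print(feed.read("bob"))  # [2]
```

The read path pays a small, bounded merge cost (only the celebrities a user follows), while the write path never fans out to millions of timelines. Deduplicating on merge also covers an author who crosses the threshold after some posts were already pushed.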

Scaling and Tradeoffs

Timeline cache. Each user gets a capped sorted set in Redis. Cap at roughly 800 post ids — covers infinite scroll without unbounded growth. Evict cold users with LRU.

Hydration. The timeline cache stores post ids only. On read, batch-fetch the post bodies from the posts store. Cache hot post bodies in a separate Redis namespace. This separation lets a deleted post disappear as soon as its body and cached copy are removed, without rewriting every follower’s timeline: hydration simply skips ids that no longer resolve.
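
A minimal sketch of that hydration step, with dictionaries standing in for the body cache and the posts store. The function names `hydrate` and `delete_post` are hypothetical; a real service would replace the dictionary lookups with a multi-get against its cache and database.

```python
def hydrate(post_ids, body_cache, posts_store):
    """Return post bodies in timeline order, skipping ids that no longer exist."""
    missing = [pid for pid in post_ids if pid not in body_cache]
    # One batched lookup for all cache misses instead of one query per post.
    fetched = {pid: posts_store[pid] for pid in missing if pid in posts_store}
    body_cache.update(fetched)
    return [body_cache[pid] for pid in post_ids if pid in body_cache]


def delete_post(post_id, body_cache, posts_store):
    """Remove the body and invalidate its cached copy; timelines are untouched."""
    posts_store.pop(post_id, None)
    body_cache.pop(post_id, None)


posts_store = {1: "hello", 2: "world", 3: "bye"}
body_cache = {}
timeline = [3, 2, 1]  # post ids only, newest first

print(hydrate(timeline, body_cache, posts_store))  # ['bye', 'world', 'hello']
delete_post(2, body_cache, posts_store)
print(hydrate(timeline, body_cache, posts_store))  # ['bye', 'hello']
```

The timeline still contains id 2 after the delete, but the reader never sees it. The stale id ages out of the capped sorted set on its own.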

Sharding posts. Shard the posts table by post_id for even distribution. Shard the follows table by follower_id so “who do I follow” is one hop, and keep a secondary copy (or index) keyed by followee_id so “who follows me” is also one hop. See SQL Indexes and Performance for the indexing patterns this implies.
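
One simple way to picture the routing is hash-based sharding. The sketch below is an assumption-laden illustration: the shard count of 64, the use of SHA-256, and the function names are all choices made for this example, and plain modulo hashing makes resharding expensive compared with consistent hashing.

```python
import hashlib

NUM_SHARDS = 64  # hypothetical shard count


def shard_for(key, num_shards=NUM_SHARDS):
    """Map an integer id to a shard using a stable hash (not Python's hash())."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route_follow_edge(follower_id, followee_id):
    """A follow edge is written twice: once per lookup direction."""
    return {
        "by_follower": shard_for(follower_id),  # serves "who do I follow"
        "by_followee": shard_for(followee_id),  # serves "who follows me"
    }


print(route_follow_edge(follower_id=1001, followee_id=42))
```

All edges for a given follower land on the same shard, so listing the accounts a user follows touches exactly one shard, and the second copy gives the fan-out service the same property for followers.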

Ranking. Reverse chronological is the default. To rank, the service scores recent candidate posts with a model that uses recency, author affinity, engagement rate, and content type. Ranking happens at read time on a small candidate set (a few hundred posts), not across the whole graph.
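
To show the shape of read-time ranking, here is a hand-tuned linear score over the four signals named above. It is deliberately not a learned model: the weights, the six-hour half-life, the content-type boosts, and the `Candidate` fields are all made-up values for illustration.

```python
from dataclasses import dataclass

HALF_LIFE_HOURS = 6.0  # assumed: recency score halves every 6 hours
CONTENT_BOOST = {"video": 1.2, "image": 1.1, "text": 1.0}  # assumed multipliers


@dataclass
class Candidate:
    post_id: int
    age_hours: float
    author_affinity: float  # 0..1, how often the reader interacts with the author
    engagement_rate: float  # 0..1, e.g. likes per impression
    content_type: str


def score(c):
    recency = 0.5 ** (c.age_hours / HALF_LIFE_HOURS)  # exponential decay
    base = 0.5 * recency + 0.3 * c.author_affinity + 0.2 * c.engagement_rate
    return base * CONTENT_BOOST.get(c.content_type, 1.0)


def rank(candidates, limit=20):
    """Score a small candidate set and return the top entries, best first."""
    return sorted(candidates, key=score, reverse=True)[:limit]


candidates = [
    Candidate(1, age_hours=0, author_affinity=0.0, engagement_rate=0.0, content_type="text"),
    Candidate(2, age_hours=6, author_affinity=1.0, engagement_rate=0.5, content_type="text"),
    Candidate(3, age_hours=24, author_affinity=0.2, engagement_rate=0.1, content_type="video"),
]
print([c.post_id for c in rank(candidates)])  # [2, 1, 3]
```

Post 2 outranks the brand-new post 1 because strong affinity and engagement outweigh six hours of age. Since the candidate set is only a few hundred posts, scoring and sorting per request stays cheap.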

Notifications. The same fan-out service can push notification events into a queue, which a worker turns into push notifications, emails, or in-app badges.

Media. Never proxy media through the app. Upload to object storage with a signed URL, serve via CDN. Generate thumbnails async.

Cold start. A new user has an empty timeline. Seed it with topic-based content or popular posts from suggested follows.

What to Say in an Interview

State the read-write ratio first. Feeds are read-heavy, which forces pre-computation.
Pick the hybrid fan-out model and justify the celebrity carve-out. This is the single most-tested concept in feed interviews.
Separate timeline storage (post ids) from post storage (bodies). Mention deletion as the motivation.
Mention ranking briefly, but stay in scope. Do not pitch a deep learning model unless asked.
Cover one failure mode: what happens when the fan-out queue backs up. Answer: posts are delayed, not lost, and reads still work.

Wrap up

The feed is an exercise in moving work from read time to write time, then carving out the cases where that breaks. Hybrid fan-out, capped per-user sorted sets in Redis, and a separate hydration step for post bodies cover the core of most feed design interviews.