FastAPI Rate Limiting: A Practical Tutorial
Add rate limiting to FastAPI using slowapi and Redis: token buckets vs fixed windows, per-user and per-IP limits, returning proper headers, and avoiding the most common production mistakes.
What you'll learn
- Why rate limiting belongs in your API
- Fixed window vs sliding window vs token bucket
- Wiring slowapi into a FastAPI app
- Per-IP and per-user limits with Redis storage
- Returning Retry-After and standard headers
- Pitfalls behind proxies and load balancers
Prerequisites
- Comfort writing FastAPI routes
Rate limiting is one of those features that feels boring until your API gets hammered. Adding it early is much cheaper than retrofitting it after an incident.
What and Why
Rate limiting caps how many requests a single client can make in a window of time. It protects you from abusive clients, runaway loops in your own front-end, and surprise costs from downstream services. It also gives you a clear contract: “100 requests per minute per API key” is easier to support than “be reasonable.”
Mental Model
Three algorithms cover almost every use case. A fixed window counts requests in discrete intervals (the cheap default). A sliding window smooths the edges so a burst at the boundary cannot use two windows at once. A token bucket refills at a steady rate and allows controlled bursts. Most libraries pick one and expose it as a decorator or middleware.
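To make the difference concrete, here is a minimal in-memory sketch of a fixed window and a token bucket. It is illustrative only: the class names and the injectable clock parameter are assumptions made for this example, not part of slowapi or any other library, and a real deployment would keep this state in a shared store.

```python
import time
from typing import Callable, Optional


class FixedWindow:
    """Allow at most `limit` requests in each discrete `window` of seconds."""

    def __init__(self, limit: int, window: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.window_id: Optional[int] = None  # interval currently being counted
        self.count = 0

    def allow(self) -> bool:
        current = int(self.clock() // self.window)
        if current != self.window_id:
            # A new interval started: reset the counter.
            self.window_id = current
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class TokenBucket:
    """Hold up to `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With the fixed window, a client can spend a full quota in the last second of one interval and another full quota in the first second of the next. The token bucket bounds a burst by its capacity and then admits requests only as fast as tokens refill.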
Storage matters too. In-memory counters work on a single process; Redis (or another shared store) is required as soon as you run more than one worker.
Hands-on Example
slowapi is a widely used rate-limiting library for FastAPI and Starlette, built on the limits package. With Redis as the storage backend:
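from fastapi import FastAPI, Request, Response
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.middleware import SlowAPIMiddleware
from slowapi.util import get_remote_address

limiter = Limiter(
    key_func=get_remote_address,  # default key: the client IP
    storage_uri="redis://localhost:6379",  # shared counters across workers
    default_limits=["100/minute"],
    headers_enabled=True,  # add X-RateLimit-* and Retry-After headers
)

app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
# The middleware applies default_limits to routes that have no decorator.
app.add_middleware(SlowAPIMiddleware)

# The route decorator goes above the limit decorator, and the endpoint must
# accept `request`. With headers_enabled=True, an endpoint that returns a dict
# also needs a `response` parameter so slowapi has somewhere to write headers.
@app.get("/public")
@limiter.limit("10/minute")
async def public(request: Request, response: Response):
    return {"ok": True}

def user_key(request: Request) -> str:
    # Prefer the API key; fall back to the client IP for anonymous callers.
    api_key = request.headers.get("X-API-Key", "")
    return api_key or get_remote_address(request)

@app.post("/expensive")
@limiter.limit("5/minute", key_func=user_key)
async def expensive(request: Request, response: Response):
    return {"queued": True}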
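When a client exceeds its limit, slowapi raises RateLimitExceeded and the registered handler returns 429 Too Many Requests. The Retry-After and X-RateLimit-* headers are only added when the Limiter is created with headers_enabled=True; it is off by default.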
Common Pitfalls
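The first pitfall is using get_remote_address behind a proxy. It reads request.client.host, so unless forwarded headers are handled, every request looks like it came from the load balancer and all clients share a single bucket: you rate-limit yourself off the internet. Configure your reverse proxy to pass the client IP in X-Forwarded-For, and make sure Uvicorn trusts that proxy. Uvicorn's --proxy-headers option is on by default, but it only honors headers from addresses listed in --forwarded-allow-ips, which defaults to 127.0.0.1.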
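If you need to resolve the client address yourself, for example inside a custom key function, the logic might look like the following sketch. The function name client_ip and its parameters are hypothetical, and the example assumes your proxies append to X-Forwarded-For rather than overwrite it.

```python
import ipaddress
from typing import Iterable, Optional


def client_ip(peer: str, forwarded_for: Optional[str],
              trusted_proxies: Iterable[str]) -> str:
    """Pick the client address from X-Forwarded-For, trusting only known proxies."""
    trusted = [ipaddress.ip_network(net) for net in trusted_proxies]

    def is_trusted(addr: str) -> bool:
        ip = ipaddress.ip_address(addr)  # raises ValueError on malformed input
        return any(ip in net for net in trusted)

    try:
        # Ignore the header unless the direct peer is one of our own proxies.
        if not forwarded_for or not is_trusted(peer):
            return peer
        # Each proxy appends the address it saw, so walk from right to left
        # and stop at the first hop that is not one of our proxies.
        for hop in reversed([h.strip() for h in forwarded_for.split(",")]):
            if not is_trusted(hop):
                return hop
    except ValueError:
        pass  # malformed address: fall back to the direct peer
    return peer
```

Reading the header from the right matters: a client can put anything it likes at the left end of X-Forwarded-For, so taking the first entry lets an attacker choose their own rate-limit key.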
The second is in-memory storage with multiple workers. Each worker maintains its own counter, so a “10/minute” limit becomes “10 per worker per minute.” Use Redis.
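The arithmetic is easy to check with a small simulation. This is a toy model under simplifying assumptions (one window, requests spread round-robin across workers); the function name admitted is made up for the example.

```python
def admitted(total_requests: int, limit: int, workers: int, shared: bool) -> int:
    """Count requests admitted in one window for a single client."""
    # One counter for everyone (shared store) or one per worker (in-memory).
    counters = [0] if shared else [0] * workers
    accepted = 0
    for i in range(total_requests):
        slot = 0 if shared else i % workers  # round-robin load balancing
        if counters[slot] < limit:
            counters[slot] += 1
            accepted += 1
    return accepted


print(admitted(100, limit=10, workers=4, shared=False))  # 40
print(admitted(100, limit=10, workers=4, shared=True))   # 10
```

With four workers and in-memory counters, the same client gets 40 requests through a "10/minute" limit; with a shared counter it gets 10.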
The third is limiting the wrong key. Limiting by IP punishes shared networks (offices, mobile carriers). Prefer API keys or authenticated user IDs where you have them, with IP as a fallback.
The fourth is forgetting to expose limit info. Clients cannot back off intelligently without X-RateLimit-Limit, X-RateLimit-Remaining, and Retry-After headers.
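On the client side, honoring Retry-After means handling both forms HTTP allows: a number of seconds or an HTTP date. A minimal sketch using only the standard library follows; the helper name retry_after_seconds is an assumption for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After value into seconds to wait, or None if unparseable."""
    value = value.strip()
    if value.isdigit():
        return float(value)  # delta-seconds form, e.g. "30"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the client may retry immediately.
    return max(0.0, (when - now).total_seconds())
```

A client would sleep for the returned number of seconds before retrying, and fall back to its own backoff policy when the result is None.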
Practical Tips
Set generous defaults globally and tighter limits on expensive routes individually. Exempt internal services by checking the source IP or a secret header before the limiter; a header that any client can send is not enough on its own. Log 429s so you can spot abusive patterns and legitimate clients who need a higher tier.
For very high traffic, consider doing rate limiting at the edge (NGINX, Cloudflare, or your API gateway) and using FastAPI’s limiter only as a safety net.
Wrap-up
Rate limiting in FastAPI is mostly about choosing a sensible algorithm, picking a stable key per client, and using a shared store from day one. Get the basics right and you have a calmer API, happier users, and a much quieter on-call rotation.