FastAPI Background Tasks and Celery

When FastAPI BackgroundTasks are enough, when you need Celery, and how to wire jobs that survive crashes, retries, and scale.

5 min read · By Codeloom
Intermediate 10 min read

What you'll learn

  • What FastAPI BackgroundTasks really do
  • Why Celery exists and when you need it
  • How to wire FastAPI to Celery cleanly
  • Patterns for retries, idempotency, and observability
  • Common deployment pitfalls

Prerequisites

  • Comfortable with FastAPI and HTTP APIs

“Send the email after the response” sounds simple until the process restarts mid-send. FastAPI ships a BackgroundTasks helper that handles the easy version. Anything more serious wants a real job queue, and Celery is still the dominant choice. This post is about knowing when each one is right and wiring them so jobs do not silently vanish.

What FastAPI BackgroundTasks actually do

BackgroundTasks run after the response is sent, in the same process that served the request. An async def task runs on the same event loop; a plain def task runs in the server's thread pool. Each task runs once, in the order it was added. There is no queue, no persistence, and no retry. If the process dies between sending the response and finishing the task, the work is lost.

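from fastapi import FastAPI, BackgroundTasks

app = FastAPI()

# create_user and send_welcome_email are application functions defined elsewhere.
@app.post("/signup")
async def signup(email: str, bg: BackgroundTasks):
    user = await create_user(email)
    # Runs after the response has been sent, inside this same process.
    bg.add_task(send_welcome_email, email)
    return {"id": user.id}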

Use this when the task is fast, idempotent, and acceptable as best effort: cache priming, log shipping to a sink that buffers, fire-and-forget metrics. Do not use it for anything where loss matters.

When you actually need a real queue

You need a queue (Celery, RQ, Dramatiq, Arq, SQS) when any of these apply:

  • The task must complete even if the web process restarts.
  • The task takes more than a few seconds.
  • You want retries with backoff.
  • Multiple workers should share the load.
  • You want to schedule periodic jobs.
  • You want visibility into queue depth and failures.

Celery has the largest ecosystem and works with Redis or RabbitMQ as its broker. It supports retries, scheduling (Celery Beat), priorities, and rate limits. It is overkill for “send one email”; it is right for “process this image, then notify, then update billing.”

Mental model

BackgroundTasks:
request -> response
            \__ task runs in-process (lost on crash)

Celery:
request -> enqueue(broker)
            |
            v
      worker pool --> run task --> result backend
            ^
            |__ retries, scheduled jobs, monitoring

Figure: BackgroundTasks vs Celery

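The Celery broker is the durable boundary. Once a task message is in Redis or RabbitMQ, the web process can die without losing work, provided the broker itself keeps messages across restarts (Redis persistence enabled, or durable queues in RabbitMQ).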

Hands-on: Celery wired to FastAPI

tasks.py:

from celery import Celery

celery_app = Celery(
    "app",
    broker="redis://redis:6379/0",
    backend="redis://redis:6379/1",
)

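# Retry on any exception with exponential backoff, up to 5 retries.
# deliver is the application's mail-sending function, defined elsewhere.
@celery_app.task(bind=True, autoretry_for=(Exception,), retry_backoff=True, max_retries=5)
def send_welcome_email(self, email: str):
    deliver(email)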

main.py:

from fastapi import FastAPI
from .tasks import send_welcome_email

app = FastAPI()

@app.post("/signup")
async def signup(email: str):
    user = await create_user(email)
    # delay() only publishes a message to the broker; a worker process runs the task.
    send_welcome_email.delay(email)
    return {"id": user.id, "job": "queued"}

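Run a worker as a separate process: celery -A tasks worker --loglevel=info. If tasks.py lives inside a package, as the relative import in main.py implies, pass the dotted module path instead (for example celery -A yourpackage.tasks worker, where yourpackage is a placeholder). The FastAPI process and the worker process are independent, so you can scale each to its own load.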
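The retry_backoff=True option in tasks.py spaces retries out exponentially. The sketch below reproduces the arithmetic as Celery documents it (a first delay of 1 second, doubling each time, capped by retry_backoff_max, which defaults to 600 seconds). It is an illustration of the schedule, not Celery's own code, and backoff_delay is a hypothetical helper name.

```python
import random


def backoff_delay(retries, factor=1, maximum=600, jitter=False):
    """Seconds to wait before retry number `retries` (0-based)."""
    delay = min(maximum, factor * (2 ** retries))
    if jitter:
        # Full jitter: pick a random delay between 0 and the computed value.
        return random.randrange(delay + 1)
    return delay


# Delays for the five retries allowed by max_retries=5, with jitter off.
delays = [backoff_delay(n) for n in range(5)]
print(delays)
```

Celery enables jitter by default (retry_jitter=True), so real delays are randomized between zero and these values. That spreads out retries when many tasks fail at the same moment.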

Retries and idempotency

Retries assume the task can run more than once safely, so make tasks idempotent. The common pattern is an idempotency key: the task checks the key before doing the work and records it once the work is done.

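@celery_app.task(bind=True, autoretry_for=(TransientError,), retry_backoff=True)
def charge(self, payment_id: str):
    # payment_id doubles as the idempotency key.
    if already_charged(payment_id):
        return
    do_charge(payment_id)
    # A crash between do_charge and mark_charged can still double-charge on retry;
    # pass the key to the payment provider as well, or claim it atomically up front.
    mark_charged(payment_id)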

Without idempotency, retries cause duplicate charges, duplicate emails, duplicate webhooks. Every job system gets this wrong at some point; design for it now.
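One way to tighten the check is to claim the key atomically before doing the work, so two concurrent deliveries of the same task cannot both pass it. The sketch below uses an in-memory store guarded by a lock as a stand-in for a shared store with set-if-absent semantics (such as Redis SET with NX, or a unique-key insert). IdempotencyStore, claim, complete, and release are hypothetical names, and do_charge here only records the call.

```python
import threading


class IdempotencyStore:
    """In-memory stand-in for a shared store with atomic set-if-absent."""

    def __init__(self):
        self._lock = threading.Lock()
        self._states = {}

    def claim(self, key):
        """Return True only for the first caller that claims the key."""
        with self._lock:
            if key in self._states:
                return False
            self._states[key] = "in_progress"
            return True

    def complete(self, key):
        with self._lock:
            self._states[key] = "done"

    def release(self, key):
        with self._lock:
            self._states.pop(key, None)


store = IdempotencyStore()
charges = []


def do_charge(payment_id):
    charges.append(payment_id)  # placeholder for the real side effect


def charge(payment_id):
    if not store.claim(payment_id):
        return "skipped"
    try:
        do_charge(payment_id)
    except Exception:
        store.release(payment_id)  # let a later retry claim the key again
        raise
    store.complete(payment_id)
    return "charged"


first = charge("pay_123")
second = charge("pay_123")  # simulates a retry or duplicate delivery
print(first, second, charges)
```

A crash after the claim but before the charge leaves the key stuck in progress, so a real system would also need an expiry on the claim or an idempotency key honored by the payment provider.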

Scheduling and periodic jobs

Celery Beat schedules jobs at fixed intervals or on cron-style schedules. It is a single process that enqueues tasks at the right times; the workers still run them.

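celery_app.conf.beat_schedule = {
    "nightly-cleanup": {
        "task": "tasks.cleanup_expired",
        # A plain number is an interval in seconds (every 24 hours), not a time of day.
        # For a fixed wall-clock time, use celery.schedules.crontab instead.
        "schedule": 60 * 60 * 24,
    },
}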

If you only need scheduling and not full async fanout, a smaller library or a Kubernetes CronJob may be simpler.

Observability

Track queue depth (LLEN celery in Redis, for the default queue), task latency, success rate, and failure reasons. Use Flower for an out-of-the-box dashboard, or scrape Celery’s events into Prometheus. A queue without metrics is a queue you find out about when customers email you.

Log task IDs from FastAPI and from workers. Correlating a request to its background work is otherwise painful.

Common pitfalls

  • Using BackgroundTasks for “send the email” when the email is the whole point. Loss matters; use a queue.
  • Importing the FastAPI app inside the Celery worker. You only need the task module; pulling the whole web app slows worker startup and creates circular imports.
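  • Forgetting task_acks_late=True and task_reject_on_worker_lost=True (acks_late and reject_on_worker_lost when set per task). By default a message is acknowledged before the task runs, so a worker that crashes mid-task loses it; with these settings the message is acknowledged only after the task finishes and is requeued if the worker process dies.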
  • Sharing DB sessions between FastAPI and Celery. Open per-task sessions; they have different lifecycles.
  • Returning large blobs from tasks. The result backend stores them; storage costs and latency add up. Save to S3, return a pointer.
  • Running Beat on multiple replicas. Two schedulers double-fire jobs. Use a lock or a single Beat pod.

Practical tips

  • Start with BackgroundTasks. Promote to Celery the moment a task starts to matter.
  • Set per-task time limits. A runaway task should not hang a worker forever.
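  • Route important tasks to their own queue and give that queue dedicated workers (for example celery -A tasks worker -Q high, with other workers on -Q default), so important work does not wait behind bulk jobs.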
  • If you want asyncio-native workers, look at arq; if Celery just feels heavy, Dramatiq is a lighter alternative. Pick the smallest tool that meets your guarantees.
  • Run a load test that kills the worker mid-task. If work is lost, your guarantees are weaker than you think.

Wrap-up

BackgroundTasks is great for trivial fire-and-forget. The moment correctness, retries, or scale matter, move to a real queue, and Celery still does the job well. Pair it with idempotency, monitoring, and proper ack settings, and your “after the response” work will actually happen.