Event-Driven Architecture: The Pragmatic Introduction

What event-driven architecture really gives you, when to choose it, and the operational realities of running asynchronous systems at scale.

·5 min read · By Codeloom
Intermediate 10 min read

What you'll learn

  • The difference between events, commands, and messages
  • Why event-driven architecture decouples teams more than code
  • How to design idempotent consumers
  • When event sourcing is overkill
  • How to recover from poison messages

Prerequisites

  • Familiarity with how APIs work
  • Basic messaging concepts

What and Why

Event-driven architecture (EDA) is a style where components communicate by emitting and reacting to events instead of calling each other directly. An event is a record of something that happened: OrderPlaced, UserSignedUp, PaymentCaptured. Producers do not know who consumes their events. Consumers do not know who produced them.

The reason teams adopt EDA is not technical elegance. It is organizational. Synchronous APIs require both teams to be online, deployed, and compatible. Events let teams ship independently, add new consumers without coordination, and replay history. The trade-off is operational complexity: asynchronous systems are harder to reason about and harder to debug.

Mental Model

Distinguish three things:

  • Event: a fact about the past. Immutable. Past tense. OrderShipped.
  • Command: an instruction. Future tense. ShipOrder. Has one intended recipient.
  • Message: the transport envelope around either.

A synchronous request-response is a command-plus-reply. An event is fire-and-forget. Mixing them under one “message bus” abstraction is where most EDA implementations get confused.
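
One way to keep the three concepts apart in code is to give each its own type. The sketch below is illustrative only: the class and field names (`recipient`, `headers`, `kind`) are assumptions made for this example, not part of any particular messaging library.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Union
import uuid


@dataclass(frozen=True)
class OrderShipped:
    """Event: a fact about the past. Immutable, with no intended recipient."""
    order_id: str
    shipped_at: datetime


@dataclass(frozen=True)
class ShipOrder:
    """Command: an instruction addressed to exactly one recipient."""
    order_id: str
    recipient: str  # hypothetical name of the service expected to act


@dataclass(frozen=True)
class Message:
    """Message: the transport envelope around either an event or a command."""
    payload: Union[OrderShipped, ShipOrder]
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    headers: dict = field(default_factory=dict)

    @property
    def kind(self) -> str:
        # The envelope is identical; only the payload type differs.
        return "command" if isinstance(self.payload, ShipOrder) else "event"


event_msg = Message(OrderShipped("ord_42", datetime.now(timezone.utc)))
command_msg = Message(ShipOrder("ord_42", recipient="shipping-service"))
print(event_msg.kind, command_msg.kind)
```

Keeping the payload types separate makes it harder to route a command through a fan-out topic by accident, because the distinction is visible at the type level rather than buried in a naming convention.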

Architecture

Producer Service                          Consumer Services

                                     +--------> Email Service
                                     |
                                     +--------> Inventory Service
Checkout -- OrderPlaced --> Broker --+
                                     +--------> Analytics Pipeline
                                     |
                                     +--------> Search Indexer

All consumers receive the same event independently.
The producer never knows or cares who is listening.

Figure: Event-driven flow with a broker
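
The fan-out in the diagram can be imitated with a toy in-memory broker. This is a minimal sketch under heavy assumptions: `InMemoryBroker`, its methods, and the topic name `orders` are invented for illustration, and a real broker delivers asynchronously over the network and tracks consumer positions durably.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy stand-in for a broker: an append-only log per topic plus subscribers."""

    def __init__(self) -> None:
        self._log: Dict[str, List[dict]] = defaultdict(list)
        self._subscribers: Dict[str, Dict[str, Handler]] = defaultdict(dict)

    def subscribe(self, topic: str, consumer_name: str, handler: Handler) -> None:
        self._subscribers[topic][consumer_name] = handler

    def publish(self, topic: str, event: dict) -> None:
        # The producer only appends; it has no idea who is subscribed.
        self._log[topic].append(event)
        for name, handler in self._subscribers[topic].items():
            try:
                handler(event)
            except Exception as exc:
                # One consumer failing must not affect the others.
                print(f"{name} failed: {exc}")

    def replay(self, topic: str, handler: Handler) -> None:
        # A consumer added later can read history from the start of the log.
        for event in self._log[topic]:
            handler(event)


def broken_analytics(event: dict) -> None:
    raise RuntimeError("analytics is down")


broker = InMemoryBroker()
emails: List[dict] = []
stock: List[dict] = []
broker.subscribe("orders", "email", emails.append)
broker.subscribe("orders", "analytics", broken_analytics)
broker.subscribe("orders", "inventory", stock.append)
broker.publish("orders", {"eventType": "OrderPlaced", "orderId": "ord_42"})

search_index: List[dict] = []
broker.replay("orders", search_index.append)  # new consumer, no producer change
```

Note that the analytics failure does not stop the email or inventory handlers, and the search indexer was added without touching the producer.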

The components:

  • Broker. Kafka, Pulsar, NATS JetStream, or a managed equivalent. Stores events durably and supports multiple independent consumers.
  • Producers. Services that emit events as part of their normal operation. Usually the database commit and the event emission must succeed or fail together (the transactional outbox pattern).
  • Consumers. Services that subscribe to topics and process events. They maintain their own state derived from the events they care about.
  • Schema registry. Holds event schemas (Avro, Protobuf, JSON Schema). Catches breaking changes before they hit production.
  • Dead-letter queue. Where unprocessable messages go after exhausted retries.
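
The dead-letter path from the component list can be sketched as a bounded retry loop. The function name `process_with_dlq`, the `max_attempts` default, and the shape of the dead-letter record are assumptions for this example; a production consumer would also add backoff between attempts and publish to a real dead-letter topic rather than a list.

```python
from typing import Callable, List


def process_with_dlq(
    event: dict,
    handler: Callable[[dict], None],
    dead_letters: List[dict],
    max_attempts: int = 3,
) -> bool:
    """Try the handler up to max_attempts times, then park the event."""
    last_error = ""
    for _attempt in range(max_attempts):
        try:
            handler(event)
            return True
        except Exception as exc:
            last_error = repr(exc)  # a real system would back off here
    # Retries exhausted: move the poison message aside so the topic keeps flowing.
    dead_letters.append(
        {"event": event, "error": last_error, "attempts": max_attempts}
    )
    return False


processed: List[str] = []
dlq: List[dict] = []


def index_order(event: dict) -> None:
    if "orderId" not in event:
        raise ValueError("missing orderId")
    processed.append(event["orderId"])


ok = process_with_dlq({"orderId": "ord_42"}, index_order, dlq)
poisoned = process_with_dlq({"garbage": True}, index_order, dlq)
```

The point of the design is that a single unprocessable message costs a few retries and an alert, not a stalled topic.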

Two foundational patterns:

  • Choreography. Each consumer reacts independently. No orchestrator. Good for loosely coupled flows like notifications.
  • Orchestration. A central saga coordinator drives multi-step flows. Better for long-running business processes that need compensation on failure.
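
To make the orchestration side concrete, here is a minimal sketch of a saga coordinator that runs steps in order and compensates completed steps in reverse when one fails. The step names and the `run_saga` helper are hypothetical, and a real coordinator would persist its progress so it can resume after a crash.

```python
from typing import Callable, List, Tuple

# (step name, action, compensation); all three are illustrative.
Step = Tuple[str, Callable[[dict], None], Callable[[dict], None]]


def run_saga(steps: List[Step], ctx: dict) -> bool:
    """Run each action; on failure, undo completed steps in reverse order."""
    completed: List[Tuple[str, Callable[[dict], None]]] = []
    for name, action, compensate in steps:
        try:
            action(ctx)
            completed.append((name, compensate))
        except Exception:
            for _done_name, undo in reversed(completed):
                undo(ctx)
            return False
    return True


log: List[str] = []


def charge_payment(ctx: dict) -> None:
    raise RuntimeError("card declined")


steps: List[Step] = [
    ("reserve_inventory",
     lambda ctx: log.append("reserve"),
     lambda ctx: log.append("release")),
    ("charge_payment",
     charge_payment,
     lambda ctx: log.append("refund")),
    ("create_shipment",
     lambda ctx: log.append("ship"),
     lambda ctx: log.append("cancel_shipment")),
]

succeeded = run_saga(steps, {"orderId": "ord_42"})
```

Only the inventory reservation is compensated here: the payment step never completed, so there is nothing to refund, and the shipment step never ran.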

Trade-offs

  • Latency vs decoupling. A synchronous call returns the result. An event leaves you wondering whether anything happened. UIs that need immediate confirmation need either a sync API or a way to wait on a derived projection.
  • Eventual consistency everywhere. If your search index is updated by an event consumer, there will be a window where the user creates a thing and cannot find it. Most products tolerate this; some require synchronous indexing.
  • Hidden coupling. Events appear decoupled, but consumers depend on event shape. Breaking the schema breaks every consumer silently. Schema registries and consumer-driven contracts are non-optional.
  • Debugging. A failed user action might touch ten services through five topics. Distributed tracing across asynchronous boundaries is harder than across HTTP. Invest in correlation IDs from day one.
  • Ordering and partitions. Kafka guarantees order within a partition, not across partitions. If your business logic depends on global order, you have a problem. Usually you don’t; you depend on order per entity (per user, per order), which you get by using the entity ID as the partition key.
  • Replay cost. A consumer that needs to rebuild state from scratch may need to read months of events. Plan compaction and snapshotting.
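
Per-entity ordering usually comes from hashing an entity key to choose the partition. The sketch below simulates that idea; `partition_for` and `route` are hypothetical helpers, and SHA-256 is only a stand-in for whatever hash function a real client library uses.

```python
import hashlib
from typing import Dict, List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically (stand-in hash)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def route(events: List[dict], num_partitions: int) -> List[List[dict]]:
    """Append each event to the partition chosen by its orderId."""
    partitions: List[List[dict]] = [[] for _ in range(num_partitions)]
    for event in events:
        partitions[partition_for(event["orderId"], num_partitions)].append(event)
    return partitions


events = [
    {"orderId": "ord_1", "eventType": "OrderPlaced"},
    {"orderId": "ord_2", "eventType": "OrderPlaced"},
    {"orderId": "ord_1", "eventType": "PaymentCaptured"},
    {"orderId": "ord_2", "eventType": "PaymentCaptured"},
    {"orderId": "ord_1", "eventType": "OrderShipped"},
]
partitions = route(events, num_partitions=4)

# Every event for one order lands in one partition, in publish order.
per_order: Dict[str, List[str]] = {}
for partition in partitions:
    for event in partition:
        per_order.setdefault(event["orderId"], []).append(event["eventType"])
```

Events for different orders may interleave or end up in different partitions, but the sequence for any single order is preserved, which is the guarantee most business logic actually needs.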

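The transactional outbox is essential. Naive code writes to the DB and then publishes to the broker. If the publish fails, the DB has data and the rest of the world doesn’t. With the outbox pattern, the service writes the event into a table in the same transaction as the business data; a separate process drains the outbox to the broker and retries until the broker acknowledges. This guarantees at-least-once publication, which also means duplicates are possible and consumers must be idempotent.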
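
The outbox can be sketched with SQLite from the Python standard library. Everything here is an assumption for illustration: the table layout, the `place_order` and `drain_outbox` names, and the `publish` callable that stands in for a broker client.

```python
import json
import sqlite3
import uuid
from typing import Callable


def init_db(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE orders (id TEXT PRIMARY KEY, total_cents INTEGER NOT NULL);
        CREATE TABLE outbox (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            event_id TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)


def place_order(conn: sqlite3.Connection, order_id: str, total_cents: int) -> None:
    event = {
        "eventId": str(uuid.uuid4()),
        "eventType": "OrderPlaced",
        "data": {"orderId": order_id, "totalCents": total_cents},
    }
    # One transaction: both rows commit together or both roll back.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, total_cents) VALUES (?, ?)",
            (order_id, total_cents),
        )
        conn.execute(
            "INSERT INTO outbox (event_id, payload) VALUES (?, ?)",
            (event["eventId"], json.dumps(event)),
        )


def drain_outbox(conn: sqlite3.Connection, publish: Callable[[dict], None]) -> int:
    """Relay unpublished rows in order; returns how many were sent."""
    rows = conn.execute(
        "SELECT seq, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    sent = 0
    for seq, payload in rows:
        # If publish raises, the row stays unpublished and is retried next run.
        publish(json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
        sent += 1
    return sent


conn = sqlite3.connect(":memory:")
init_db(conn)
place_order(conn, "ord_42", 4200)
```

If the relay crashes after `publish` succeeds but before the row is marked as published, the event is sent again on the next run. That window is why the pattern gives at-least-once rather than exactly-once delivery.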

Practical Tips

  1. Make consumers idempotent. Most brokers deliver at-least-once, so the same event can arrive twice. Use an event ID and a dedup table or a natural key check.
  2. Version events from day one. Add an eventVersion field. Old consumers ignore new fields they do not understand; new consumers handle missing optional fields gracefully.
  3. Use past-tense names. OrderPlaced, not PlaceOrder. The first describes a fact; the second is a command and will tempt people to make it synchronous.
  4. Keep events small and self-contained. Include the data consumers need to process them. Don’t make consumers call back to the producer for details; that recreates the coupling you tried to remove.
  5. Use a dead-letter queue with alerting. A topic that piles up because of a poison message is a silent outage.
  6. Don’t event-source by default. Event sourcing (storing all state changes as events) is powerful but rewrites your data model. Use plain EDA with a normal DB first; reach for event sourcing only when audit, replay, or temporal queries demand it.
  7. Trace across boundaries. Propagate a trace ID in every event header. OpenTelemetry has good support for messaging spans.

An example event envelope that follows these tips:

{
  "eventId": "9f3c...",
  "eventType": "OrderPlaced",
  "eventVersion": 1,
  "occurredAt": "2026-06-28T10:15:00Z",
  "traceId": "abc123",
  "data": { "orderId": "ord_42", "userId": "u_7", "totalCents": 4200 }
}
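
An idempotent consumer for an envelope like the one above can be sketched with a dedup table keyed on `eventId`. The schema, the `revenue` projection, and the `handle_order_placed` name are assumptions for this example, and the event ID `evt-1` is a placeholder rather than a value from any real system.

```python
import sqlite3


def init_consumer_db(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE processed_events (event_id TEXT PRIMARY KEY);
        CREATE TABLE revenue (
            id INTEGER PRIMARY KEY CHECK (id = 1),
            total_cents INTEGER NOT NULL
        );
        INSERT INTO revenue (id, total_cents) VALUES (1, 0);
    """)


def handle_order_placed(conn: sqlite3.Connection, event: dict) -> bool:
    """Apply the event once; return False if it was already processed."""
    try:
        # Dedup marker and state change share one transaction, so a crash
        # cannot record the event as processed without applying its effect.
        with conn:
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)",
                (event["eventId"],),
            )
            conn.execute(
                "UPDATE revenue SET total_cents = total_cents + ? WHERE id = 1",
                (event["data"]["totalCents"],),
            )
    except sqlite3.IntegrityError:
        # Primary-key violation: a redelivery of an event we already handled.
        return False
    return True


consumer_db = sqlite3.connect(":memory:")
init_consumer_db(consumer_db)
event = {
    "eventId": "evt-1",  # placeholder ID for the example
    "eventType": "OrderPlaced",
    "eventVersion": 1,
    "data": {"orderId": "ord_42", "userId": "u_7", "totalCents": 4200},
}
first = handle_order_placed(consumer_db, event)
second = handle_order_placed(consumer_db, event)  # simulated redelivery
total = consumer_db.execute("SELECT total_cents FROM revenue").fetchone()[0]
```

The redelivered event is rejected by the primary key, so the projection counts the order once even though the handler ran twice.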

Wrap-up

Event-driven architecture trades latency and debuggability for decoupling and resilience. The trade is usually worth it for organizations that move faster than they can coordinate, but it is not free. Invest in schema management, idempotency, tracing, and the transactional outbox before you have ten services and twenty topics. The teams that succeed with EDA treat events as a public API: documented, versioned, and stable. The teams that fail treat them as ad-hoc JSON over a queue.