Event-Driven Architecture
Building Reactive Systems with Event Sourcing and CQRS
Explore how capturing every state change as an immutable event enables perfect audit trails and high-performance read-optimized data models.
Beyond State: The Case for Persistence via Events
In traditional application development, we typically focus on the current state of our data. When a user updates their profile or completes a purchase, our primary action is to overwrite the existing record in a relational database. While this model is intuitive, it ignores the critical journey of how that data reached its current form.
Traditional CRUD systems effectively suffer from data amnesia by discarding history. Once a row is updated or deleted, the previous context is gone unless you have implemented complex and often incomplete audit logs. This makes debugging historical anomalies or performing detailed business forensics an uphill battle for any engineering team.
Event Sourcing flips this paradigm by treating every change as a first-class citizen. Instead of storing the final snapshot of an object, we store a chronological sequence of immutable events that describe every transition. The current state is no longer the primary source of truth but is instead a derived calculation based on the history of events.
Event Sourcing ensures that you never lose information by making the history of the system the most important data asset you own.
By focusing on events, we capture intent rather than just side effects. Knowing that an item was removed from a cart after a discount code failed provides far more business value than simply seeing an empty cart record. This level of detail allows developers to reconstruct any past state with mathematical certainty.
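To make this concrete, the following sketch derives a shopping cart's current state by folding over its event history. The event types and the pure `apply` function are illustrative, not tied to any particular library:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ItemAdded:
    sku: str
    quantity: int

@dataclass(frozen=True)
class ItemRemoved:
    sku: str

def apply(state: dict, event) -> dict:
    """Pure transition function: (state, event) -> new state."""
    cart = dict(state)  # never mutate the previous state
    if isinstance(event, ItemAdded):
        cart[event.sku] = cart.get(event.sku, 0) + event.quantity
    elif isinstance(event, ItemRemoved):
        cart.pop(event.sku, None)
    return cart

def replay(events) -> dict:
    """Reconstruct current state from the full event history."""
    state: dict = {}
    for event in events:
        state = apply(state, event)
    return state

history = [ItemAdded("sku-1", 2), ItemAdded("sku-2", 1), ItemRemoved("sku-2")]
current_cart = replay(history)  # {"sku-1": 2}
```

Because `replay` can be run against any prefix of the history, the same machinery reconstructs the cart as it existed at any point in the past.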
The Loss of Context in CRUD
Most relational models represent a snapshot in time, which works well for simple applications but fails in high-stakes environments like finance or healthcare. In these domains, the path to a result is often as important as the result itself. When you only store the current balance of a bank account, you lose the granular details of every transaction that led to that number.
Relying on state-based updates also introduces significant risks regarding race conditions and concurrency. Multiple services attempting to update the same row can lead to lost updates or inconsistent data if locking mechanisms are not perfectly tuned. Events, being append-only, naturally mitigate many of these common synchronization headaches.
The Event as the Source of Truth
When events are the source of truth, the database becomes an append-only log of facts. A fact is something that has happened in the past and cannot be changed or retracted. This immutability simplifies the architecture by removing the need for complex update and delete logic at the core storage level.
Because the log is immutable, it serves as a perfect audit trail that is guaranteed to be accurate. There is no discrepancy between what the application did and what the audit log says because the audit log is the application data. This alignment provides immense confidence during regulatory audits or security post-mortems.
Engineering the Event Store and Stream
Implementing an event-driven system requires a specialized storage approach often referred to as an Event Store. Unlike a general-purpose database, an Event Store is optimized for appending new events and reading them back in the order they occurred. It must handle high write throughput while ensuring that the sequence of events remains strictly ordered per entity.
Each event in the store should contain enough metadata to be self-describing. This typically includes a unique event identifier, the type of event, a timestamp, and the version of the entity it applies to. The payload should be a serialized representation of the change, such as a JSON object, that captures the specific parameters of the action taken.
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Use frozen=True to enforce immutability in the application layer
@dataclass(frozen=True)
class AccountDebited:
    account_id: str
    amount: int
    currency: str
    # Fields with defaults must come after the required fields
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

# Example of creating a record of a financial transaction
debit_event = AccountDebited(
    account_id="acc-789",
    amount=500,
    currency="USD"
)

The integrity of the event stream is maintained through optimistic concurrency control. When appending a new event, the system checks the version of the entity to ensure no other events have been added in the interim. This prevents two conflicting actions from being recorded simultaneously while maintaining high performance.
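The expected-version check can be sketched with a minimal in-memory store. The class and exception names here are illustrative assumptions, not the API of any real event store:

```python
class ConcurrencyError(Exception):
    """Raised when the stream advanced since the writer last read it."""

class InMemoryEventStore:
    def __init__(self):
        self._streams: dict[str, list] = {}

    def append(self, stream_id: str, event, expected_version: int) -> None:
        stream = self._streams.setdefault(stream_id, [])
        # Reject the write if another event arrived in the interim
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, found {len(stream)}"
            )
        stream.append(event)

    def read(self, stream_id: str) -> list:
        return list(self._streams.get(stream_id, []))

store = InMemoryEventStore()
store.append("acc-789", {"type": "AccountOpened"}, expected_version=0)
store.append("acc-789", {"type": "AccountDebited"}, expected_version=1)
# A stale writer reusing expected_version=1 would raise ConcurrencyError
```

A stale writer then retries by re-reading the stream, re-validating its business rules against the new events, and appending with the updated version.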
Structuring Immutable Data
Events should be named in the past tense to reflect that they are historical facts. Names like OrderCreated, PaymentProcessed, or EmailChanged clearly communicate what occurred. This naming convention helps developers and stakeholders speak a common language when discussing system behavior.
The schema of an event should be as granular as possible to avoid ambiguity. If a user updates their entire profile, it is often better to emit specific events like AddressUpdated and PhoneNumberChanged rather than a generic ProfileUpdated event. This granularity makes it easier to build specific read models later on.
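As an illustration (the event names and fields here are hypothetical), granular events let each consumer react only to the changes it cares about:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AddressUpdated:
    user_id: str
    street: str
    city: str

@dataclass(frozen=True)
class PhoneNumberChanged:
    user_id: str
    phone: str

# A shipping read model can subscribe to AddressUpdated alone and ignore
# PhoneNumberChanged entirely; a generic ProfileUpdated event would force
# every consumer to inspect every payload to see whether it is relevant.
address_event = AddressUpdated(user_id="u-1", street="1 Main St", city="Springfield")
phone_event = PhoneNumberChanged(user_id="u-1", phone="555-0100")
```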
Managing Stream Integrity
Each aggregate, such as a specific user or order, has its own stream of events within the store. Keeping these streams distinct allows the system to scale horizontally by partitioning data across different nodes based on the aggregate ID. It also simplifies the process of replaying events for a single entity without scanning the entire database.
Strict ordering is only required within the context of a single stream. While the global order of all events across the whole system might be interesting, it is usually not necessary for maintaining consistency. Focusing on per-stream ordering significantly reduces the coordination overhead in a distributed environment.
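One common way to achieve this partitioning, sketched here under the assumption of a fixed partition count, is a stable hash of the aggregate ID:

```python
import hashlib

NUM_PARTITIONS = 8  # illustrative; real systems pick this per deployment

def partition_for(aggregate_id: str) -> int:
    # A stable hash means the same aggregate always lands on the same
    # partition, so per-stream ordering is preserved even as the total
    # load spreads across nodes.
    digest = hashlib.sha256(aggregate_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

# Every event for "acc-789" is routed to the same partition,
# while unrelated aggregates spread across the cluster.
target = partition_for("acc-789")
```

Because ordering is only guaranteed within a partition, this scheme gives exactly the per-stream guarantee the text describes and nothing more.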
Performance at Scale: Materialized Views and Projections
A common concern with event sourcing is the performance cost of calculating state on the fly. If an account has ten thousand transactions, reading all of them every time you need to check a balance is inefficient. We solve this by using projections to create read-optimized materialized views.
Projections are background processes that listen to the event stream and update a separate database optimized for queries. This separation of concerns allows the write side to focus on fast appends while the read side focuses on fast lookups. This pattern is widely known as Command Query Responsibility Segregation or CQRS.
- Write Side: Optimized for append-only performance and domain integrity.
- Read Side: Optimized for complex queries, full-text search, and high-speed retrieval.
- Decoupling: Teams can scale the read and write instances independently based on traffic patterns.
- Flexibility: You can create new read models at any time by replaying old events into a new database schema.
Because the read models are derived from the event log, they are inherently disposable. If your query requirements change or you want to switch from a relational database to a document store, you simply reset the projection and replay the history. This provides a level of architectural flexibility that is impossible to achieve with traditional migrations.
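A projection of this kind can be sketched in a few lines; the event shapes and the in-memory "table" standing in for a real query database are illustrative:

```python
balances: dict[str, int] = {}  # the disposable, query-optimized read model

def project(event: dict) -> None:
    """Apply a single event from the stream to the read model."""
    acc = event["account_id"]
    if event["type"] == "AccountCredited":
        balances[acc] = balances.get(acc, 0) + event["amount"]
    elif event["type"] == "AccountDebited":
        balances[acc] = balances.get(acc, 0) - event["amount"]

def rebuild(history: list[dict]) -> None:
    # Because the log is the source of truth, the read model can be
    # dropped and replayed from scratch at any time.
    balances.clear()
    for event in history:
        project(event)

history = [
    {"type": "AccountCredited", "account_id": "acc-789", "amount": 1000},
    {"type": "AccountDebited", "account_id": "acc-789", "amount": 500},
]
rebuild(history)  # balances["acc-789"] is now 500
```

Swapping the target store, say from this dictionary to a document database, only means rewriting `project` and replaying the same history.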
Decoupling Reads from Writes
In a CQRS architecture, the command side only validates business logic and appends events. It does not return the updated state to the user, but rather acknowledges that the event has been recorded. This asynchronous nature allows the system to handle massive bursts of traffic without slowing down the user experience.
The read side can be implemented using any technology that fits the specific query needs. You might use an Elasticsearch index for searching products, a Redis cache for user sessions, and a Neo4j graph for social connections. All of these are kept in sync by the same underlying stream of events.
Handling Projection Latency
Since projections happen after an event is stored, there is a period of eventual consistency where the read model might be slightly behind the event store. In most web applications, this delay is measured in milliseconds and is often unnoticeable to the end user. However, developers must design the frontend to handle this reality gracefully.
Techniques such as optimistic UI updates or tracking the last seen version of an entity can mitigate the impact of eventual consistency. If a user submits a change, the UI can immediately reflect that change locally while waiting for the background projection to catch up. This provides a snappy experience while maintaining a robust backend.
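Tracking the last seen version might look like the following sketch, where the class name and the simple polling loop are assumptions rather than a prescribed pattern:

```python
import time

class Projection:
    """Tracks the last event version applied per entity."""

    def __init__(self):
        self.versions: dict[str, int] = {}

    def apply(self, entity_id: str, version: int) -> None:
        # Called by the background process as events are projected
        self.versions[entity_id] = version

    def wait_for(self, entity_id: str, min_version: int,
                 timeout: float = 1.0, poll: float = 0.01) -> bool:
        """Block until the read model has caught up to min_version.

        After a client writes version N, serving its next read only once
        the projection reaches N gives read-your-own-writes behavior.
        """
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if self.versions.get(entity_id, 0) >= min_version:
                return True
            time.sleep(poll)
        return False
```

In practice the version would ride along in the API response or a cookie, so only the user who just wrote pays the small wait; everyone else reads the eventually consistent model immediately.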
