Combining CQRS with Event Sourcing for Auditable Data Pipelines
Discover how to leverage event streams as the source of truth to rebuild read projections and maintain a complete audit trail of every state change.
Bridging the Gap Between Storage and Presentation
Traditional application architectures often rely on a single data model that serves both data entry and data retrieval. While this approach works well for simple CRUD operations, it creates significant friction as systems grow in complexity. High-performance applications frequently require data to be presented in shapes that differ drastically from how that data is most efficiently stored.
The Command Query Responsibility Segregation pattern addresses this friction by splitting the application into two distinct paths: commands handle actions that change state, while queries handle requests that read data. By treating these operations as separate concerns, developers can optimize the write path for consistency and the read path for speed and flexibility.
Moving toward a segregated model shifts the mental burden away from complex relational joins and toward a stream-oriented mindset. Instead of forcing a single database schema to satisfy every user interface requirement, you can create specialized read models tailored to specific views. This decoupling allows the system to scale its read and write components independently based on actual traffic patterns.
The primary benefit of CQRS is not just performance but the reduction of cognitive load when dealing with complex business logic that differs from data display requirements.
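The split described above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design; the `PlaceOrder` command, `OrderSummary` shape, and both stores are hypothetical names invented for the example.

```typescript
// Hypothetical command and read-model shapes for illustration.
type PlaceOrder = { type: 'PlaceOrder'; orderId: string; sku: string; quantity: number };
type OrderSummary = { orderId: string; itemCount: number };

// Separate stores for the write side and the read side.
const writeStore = new Map<string, Array<{ sku: string; quantity: number }>>();
const readStore = new Map<string, OrderSummary>();

// Command side: validates and changes state; returns nothing to display.
function handlePlaceOrder(cmd: PlaceOrder): void {
  if (cmd.quantity <= 0) throw new Error('quantity must be positive');
  const items = writeStore.get(cmd.orderId) ?? [];
  items.push({ sku: cmd.sku, quantity: cmd.quantity });
  writeStore.set(cmd.orderId, items);
  // The denormalized read model is maintained separately from the write model.
  readStore.set(cmd.orderId, { orderId: cmd.orderId, itemCount: items.length });
}

// Query side: a simple lookup against a precomputed shape; no business logic.
function getOrderSummary(orderId: string): OrderSummary | undefined {
  return readStore.get(orderId);
}
```

In a real system the two stores would typically be different databases, but the shape of the separation is the same: the command path enforces rules, the query path only reads.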
The Limitations of Unified Models
In a unified model, every new feature often requires adding columns or tables that complicate the primary write database. These additions frequently serve only to speed up a specific search or display page, leading to bloated indexes and slower write performance. Over time, the database becomes a bottleneck where every optimization for reading hurts the efficiency of writing.
When you separate these concerns, the write model only needs to store what is necessary to validate business rules. It does not need to care about how the data will be filtered, sorted, or paginated in the user interface. This separation ensures that the core domain logic remains clean and focused solely on state transitions and data integrity.
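As a sketch of that idea, the write model below keeps only the single field its invariants require. The `Order` class and its shipping rule are invented for illustration; the point is what is absent: no display fields, no formatted totals, no customer names.

```typescript
// A write model that stores only what its business rules need.
class Order {
  private status: 'open' | 'shipped' | 'cancelled' = 'open';

  ship(): void {
    if (this.status !== 'open') throw new Error('only open orders can ship');
    this.status = 'shipped';
  }

  cancel(): void {
    // The rule consults only the current status -- nothing about how the
    // order will be filtered, sorted, or rendered in a UI.
    if (this.status === 'shipped') throw new Error('shipped orders cannot be cancelled');
    this.status = 'cancelled';
  }

  get currentStatus(): string {
    return this.status;
  }
}
```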
Implementing Event Streams as the Source of Truth
Event Sourcing often serves as the ideal foundation for CQRS by capturing every change as a discrete immutable event. Rather than overwriting a row in a table to reflect the current state, the system appends a new event to a sequential log. This log becomes the definitive source of truth from which any current or past state can be derived.
Each event in the stream represents a business fact that has already occurred, such as an item added to a cart or a shipping address updated. Because these events are immutable, they provide a reliable foundation for building downstream projections. The stream acts as a bridge between the command side where logic happens and the query side where data is consumed.
```typescript
interface DomainEvent {
  readonly aggregateId: string;
  readonly version: number;
  readonly occurredAt: Date;
}

class OrderPlaced implements DomainEvent {
  constructor(
    public readonly aggregateId: string,
    public readonly customerId: string,
    public readonly items: Array<{ sku: string; quantity: number }>,
    public readonly version: number,
    public readonly occurredAt: Date = new Date()
  ) {}
}

class OrderCancelled implements DomainEvent {
  constructor(
    public readonly aggregateId: string,
    public readonly reason: string,
    public readonly version: number,
    public readonly occurredAt: Date = new Date()
  ) {}
}
```

The example above demonstrates how events capture the intent and context of a change rather than just the final values. By storing the reason for a cancellation or the specific items in an order, we preserve data that a traditional update-in-place model would lose. This granular history is invaluable for debugging, auditing, and future business analysis.
The Persistence of Intent
Capturing intent is the most powerful aspect of using event streams within a CQRS architecture. When a user changes their mind or an automated process fails, the event stream records the sequence of events exactly as they happened. This creates a high-fidelity audit trail that satisfies strict regulatory requirements without requiring extra development effort.
This persistence layer functions much like a financial ledger where transactions are never deleted. Errors are corrected by issuing new compensating events rather than modifying the history of existing records. This approach eliminates the risk of data loss during concurrent updates and provides a robust foundation for building eventual consistency across distributed services.
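The ledger analogy can be made concrete. In the sketch below, a mistaken charge is corrected by appending a compensating event, never by editing the earlier entry; the event shapes are invented for illustration.

```typescript
// Ledger-style event log: corrections are appended, history is never edited.
type LedgerEvent =
  | { type: 'AmountCharged'; orderId: string; amount: number }
  | { type: 'ChargeReversed'; orderId: string; amount: number; reason: string };

const log: LedgerEvent[] = [];
log.push({ type: 'AmountCharged', orderId: 'o1', amount: 100 });
// The mistake is fixed with a compensating event, preserving the audit trail.
log.push({ type: 'ChargeReversed', orderId: 'o1', amount: 100, reason: 'duplicate charge' });

// Current state is derived by folding over the full, immutable history.
const balance = log.reduce(
  (sum, e) => (e.type === 'AmountCharged' ? sum + e.amount : sum - e.amount),
  0
);
```

Both the error and its correction remain visible in the log, which is exactly what an auditor wants to see.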
Building and Maintaining Read Projections
Read projections are specialized data structures that transform raw event streams into formats optimized for specific queries. A projection worker listens to the event stream and updates a secondary database, such as a document store or a cache, in real time. This allows the application to serve complex queries with simple lookups, bypassing the need for expensive runtime calculations.
Because projections are derived from the event stream, they can be discarded and rebuilt at any time. If the business needs a new way to visualize data, you do not need to perform a risky migration on your primary database. Instead, you create a new projection, replay the historical events from the start of the stream, and populate the new table.
- Projections allow for radical schema flexibility without downtime on the write side.
- Different projections can use different storage technologies like Elasticsearch for search or Redis for real-time dashboards.
- Historical replays enable retroactive bug fixes by correcting projection logic and reprocessing past events.
- Read models can be scaled out horizontally across multiple regions to reduce latency for global users.
```javascript
// A projection handler updates a dedicated read-model database.
// Note: this assumes events are enriched with a customerId and a
// denormalized amount before they reach the handler.
async function handleOrderEvent(event, db) {
  switch (event.type) {
    case 'OrderPlaced':
      // Increment total order count and update customer spending
      await db.collection('customer_stats').updateOne(
        { customerId: event.customerId },
        { $inc: { totalOrders: 1, totalSpent: event.amount } },
        { upsert: true }
      );
      break;
    case 'OrderCancelled':
      // Adjust stats for the cancellation
      await db.collection('customer_stats').updateOne(
        { customerId: event.customerId },
        { $inc: { totalOrders: -1, totalSpent: -event.amount } }
      );
      break;
  }
}
```
Handling Eventual Consistency
One of the most significant trade-offs in this architecture is the shift from strong consistency to eventual consistency. Since there is a delay between an event being stored and a projection being updated, a user might not see their changes immediately. This gap is usually measured in milliseconds, but the application must be designed to handle this asynchronous reality.
UI strategies such as optimistic updates or polling can mask this latency for the end user. On the backend, developers must ensure that the command side does not rely on the read model to validate business rules. The write side must contain all the state it needs to maintain its own integrity within the aggregate boundaries.
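One common way to implement read-your-writes on top of an eventually consistent projection is to poll the read model until it has applied the version the command wrote. The sketch below assumes the command side returns the version it persisted and the projection exposes the last version it has applied; both are illustrative, not a standard API.

```typescript
// Poll the projection until it catches up to the version a command wrote,
// or give up after a bounded number of retries.
async function waitForProjection(
  getAppliedVersion: () => Promise<number>,
  writtenVersion: number,
  retries = 10,
  delayMs = 50
): Promise<boolean> {
  for (let i = 0; i < retries; i++) {
    if ((await getAppliedVersion()) >= writtenVersion) return true;
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }
  return false; // caller can fall back to an optimistic UI value
}
```

If the wait times out, the UI can keep showing the optimistic value rather than blocking the user.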
Evolutionary Architecture and Replayability
The ability to replay events is a transformative capability for long-lived software systems. In traditional systems, once data is deleted or transformed during a migration, the original state is lost forever. With an event-sourced CQRS setup, the entire history of the system is preserved, allowing you to answer questions you had not even thought of when the system was built.
When a bug is discovered in how a certain metric is calculated, you simply fix the code in the projection worker. After the fix is deployed, you reset the projection and replay the event stream from the beginning of time. The system will process every historical event again, resulting in a perfectly accurate and corrected read model without any manual data patching.
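The replay step described above is, at its core, a fold over the historical events. The sketch below rebuilds a customer-stats projection from scratch; the event shapes and the `rebuildStats` function are illustrative assumptions, simplified to carry an `amount` directly.

```typescript
// Illustrative event shapes, simplified to a denormalized amount.
type OrderEvent =
  | { type: 'OrderPlaced'; customerId: string; amount: number }
  | { type: 'OrderCancelled'; customerId: string; amount: number };

// Rebuild the projection by folding over the entire event history.
function rebuildStats(
  events: OrderEvent[]
): Map<string, { totalOrders: number; totalSpent: number }> {
  const stats = new Map<string, { totalOrders: number; totalSpent: number }>();
  for (const event of events) {
    const row = stats.get(event.customerId) ?? { totalOrders: 0, totalSpent: 0 };
    const sign = event.type === 'OrderPlaced' ? 1 : -1;
    row.totalOrders += sign;
    row.totalSpent += sign * event.amount;
    stats.set(event.customerId, row);
  }
  return stats;
}
```

Because the function is pure, fixing a calculation bug and re-running it over the same history deterministically yields the corrected read model.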
This replayability also supports experimental features and blue-green deployments for data. You can run two versions of a projection side-by-side to verify that a new implementation produces the expected results before switching traffic. This reduces the risk of deploying complex analytical changes and gives the engineering team greater confidence in their data integrity.
Versioning Events and Projections
As business requirements change, the structure of events will inevitably evolve. Handling these changes requires a strategy for event versioning, such as upcasting, where old events are transformed into the new format as they are read. This ensures that the projection logic stays clean and does not have to account for every historical variation of a schema.
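An upcaster is typically a small pure function applied as events are read. The sketch below invents a schema change (a single `name` field split into first and last name) purely for illustration:

```typescript
// Two illustrative versions of the same event schema.
type AddressUpdatedV1 = { schema: 1; name: string; street: string };
type AddressUpdatedV2 = { schema: 2; firstName: string; lastName: string; street: string };

// Upcast old events to the current shape as they are read from the stream,
// so downstream projection logic only ever sees the latest schema.
function upcast(event: AddressUpdatedV1 | AddressUpdatedV2): AddressUpdatedV2 {
  if (event.schema === 2) return event;
  const [firstName, ...rest] = event.name.split(' ');
  return { schema: 2, firstName, lastName: rest.join(' '), street: event.street };
}
```

The stored v1 events are never rewritten; only their in-memory representation is upgraded on the way to the projection.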
Projections themselves should also be versioned to allow for seamless transitions between different data models. By including a version tag in the projection table name, you can populate a new version in the background while the old version continues to serve traffic. Once the new version is fully synchronized with the event stream, the application can switch over with zero downtime.
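The cutover mechanics can be as simple as a version pointer in front of table-per-version naming. Everything here is a hypothetical sketch: the `customer_stats` name, the `_v{n}` suffix convention, and the in-memory pointer stand in for whatever routing mechanism the application actually uses.

```typescript
const PROJECTION = 'customer_stats';
let activeVersion = 1;

// Queries always read from the currently active versioned table.
function readTable(): string {
  return `${PROJECTION}_v${activeVersion}`;
}

// Replay populates the new version's table in the background.
function buildTable(version: number): string {
  return `${PROJECTION}_v${version}`;
}

// Flip the pointer only once the new table has caught up with the stream head.
function cutOver(version: number): void {
  activeVersion = version;
}
```

With this arrangement, traffic moves from `_v1` to `_v2` in a single pointer flip, and `_v1` can be kept around briefly as a rollback target.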
