Event Sourcing and CQRS: Storing State as a History of Changes


Your bank account shows a balance of 1,500 dollars. How did it get there? With a traditional database, you do not know. The balance was updated in place. The history is gone.

With event sourcing, you know exactly: deposit 2,000, withdraw 300, withdraw 200. The current balance is derived by replaying these events. The history is the source of truth.

This is not just an audit log feature. It changes how you think about state, consistency, and scalability.

What event sourcing is

Event sourcing stores the state of an entity as a sequence of immutable events. Instead of updating a row in a database, you append a new event. The current state is derived by replaying all events from the beginning.

Traditional approach:

accounts table:
id: 123, balance: 1500, updated_at: 2024-01-15

Event sourcing approach:

events table:
id: 1, account_id: 123, type: Deposited, amount: 2000, timestamp: 2024-01-10
id: 2, account_id: 123, type: Withdrawn, amount: 300, timestamp: 2024-01-12
id: 3, account_id: 123, type: Withdrawn, amount: 200, timestamp: 2024-01-15

Current balance = 2000 - 300 - 200 = 1500.
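The arithmetic above is a fold over the event list. Here is a minimal sketch in Python, assuming a hypothetical `Event` record that mirrors the columns of the events table; the names are illustrative, not a prescribed schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    account_id: int
    type: str    # "Deposited" or "Withdrawn"
    amount: int


def replay_balance(events):
    """Fold the events, in order, into a balance."""
    balance = 0
    for event in events:
        if event.type == "Deposited":
            balance += event.amount
        elif event.type == "Withdrawn":
            balance -= event.amount
        else:
            raise ValueError(f"unknown event type: {event.type}")
    return balance


events = [
    Event(123, "Deposited", 2000),
    Event(123, "Withdrawn", 300),
    Event(123, "Withdrawn", 200),
]
print(replay_balance(events))  # 1500
```

Nothing is ever updated in place: the balance is recomputed from the list every time it is needed.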

graph LR
subgraph traditional["Traditional - Mutable State"]
  CMD1["Deposit 2000"] -->|"UPDATE balance=2000"| DB1["accounts
balance: 2000"]
  CMD2["Withdraw 300"] -->|"UPDATE balance=1700"| DB1
  CMD3["Withdraw 200"] -->|"UPDATE balance=1500"| DB1
  DB1 -->|"current state only"| Q1["Query: balance?"]
  Q1 --> R1["1500
(history lost)"]
end

subgraph eventsourcing["Event Sourcing - Immutable Events"]
  CMD4["Deposit 2000"] -->|"APPEND"| ES["event store
Deposited 2000
Withdrawn 300
Withdrawn 200"]
  CMD5["Withdraw 300"] -->|"APPEND"| ES
  CMD6["Withdraw 200"] -->|"APPEND"| ES
  ES -->|"replay events"| Q2["Query: balance?"]
  Q2 --> R2["1500
(full history preserved)"]
end

style DB1 fill:#FAEEDA,stroke:#854F0B,color:#633806
style ES fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style R2 fill:#E1F5EE,stroke:#0F6E56,color:#085041

What CQRS is

CQRS (Command Query Responsibility Segregation) separates the write model (commands) from the read model (queries). Instead of one model that handles both reads and writes, you have two:

Command side: Handles writes. Validates business rules. Appends events to the event store. Optimized for consistency.

Query side: Handles reads. Builds read-optimized projections from events. Can have multiple projections for different use cases. Optimized for performance.

CQRS and event sourcing are often used together but are independent concepts. You can use CQRS without event sourcing (separate read and write databases). You can use event sourcing without CQRS (though CQRS makes event sourcing more practical).

How they work together

Write path:

  1. Client sends a command: WithdrawMoney(account_id: 123, amount: 300)
  2. Command handler loads the account’s event history
  3. Command handler validates the command (sufficient balance?)
  4. Command handler appends a MoneyWithdrawn event to the event store
  5. Event is published to subscribers

Read path:

  1. Event handler receives MoneyWithdrawn event
  2. Event handler updates the read model (a denormalized view optimized for queries)
  3. Client queries the read model: GET /accounts/123/balance
  4. Read model returns the current balance

graph TB
subgraph write["Write Side (Command)"]
  CMD["Command
WithdrawMoney"]
  HANDLER["Command Handler
Validate business rules"]
  STORE["Event Store
Append MoneyWithdrawn"]
  CMD --> HANDLER --> STORE
end

subgraph read["Read Side (Query)"]
  PROJ["Event Handler
Update read model"]
  RM["Read Model
Denormalized view
Optimized for queries"]
  Q["Query
GET /balance"]
  STORE -->|"event"| PROJ
  PROJ --> RM
  Q --> RM
end

style STORE fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style RM fill:#E1F5EE,stroke:#0F6E56,color:#085041
style HANDLER fill:#FAEEDA,stroke:#854F0B,color:#633806
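The write path and read path can be sketched in a few lines of Python. This is a simplified illustration, not a production design: the event store and read model are plain dictionaries, publishing is a synchronous function call (a real system would usually publish asynchronously), and every name here (`withdraw_money`, `balance_view`, and so on) is hypothetical.

```python
class InsufficientFunds(Exception):
    pass


event_store = {}    # account_id -> list of event dicts (append-only by convention)
balance_view = {}   # read model: account_id -> current balance
subscribers = []    # callables invoked for each newly appended event


def project_balance(event):
    """Read side: keep the denormalized balance view up to date."""
    delta = event["amount"] if event["type"] == "MoneyDeposited" else -event["amount"]
    account_id = event["account_id"]
    balance_view[account_id] = balance_view.get(account_id, 0) + delta


subscribers.append(project_balance)


def append_and_publish(event):
    event_store.setdefault(event["account_id"], []).append(event)
    for handler in subscribers:
        handler(event)


def current_balance(account_id):
    """Write side: derive state from the account's own event history."""
    total = 0
    for e in event_store.get(account_id, []):
        total += e["amount"] if e["type"] == "MoneyDeposited" else -e["amount"]
    return total


def deposit_money(account_id, amount):
    append_and_publish({"account_id": account_id, "type": "MoneyDeposited", "amount": amount})


def withdraw_money(account_id, amount):
    # Validate the business rule against the write model, never the read model.
    if amount > current_balance(account_id):
        raise InsufficientFunds(f"account {account_id} cannot cover {amount}")
    append_and_publish({"account_id": account_id, "type": "MoneyWithdrawn", "amount": amount})


deposit_money(123, 2000)
withdraw_money(123, 300)
print(balance_view[123])  # 1700
```

The command handler only ever reads the event history, and the query only ever reads `balance_view`. That separation is the core of CQRS.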

Benefits of event sourcing

Complete audit log: Every change is recorded. If each event carries the actor, a timestamp, and a reason, you know who changed what, when, and why. This is often required in financial systems, healthcare, and other compliance-heavy domains.

Time travel: Replay events up to any point in time to see the state at that moment. Useful for debugging (“what was the account balance at 3pm yesterday?”).

Event replay: If you add a new feature that needs historical data, replay all events to build the new read model. No data migration needed.

Temporal decoupling: Events are published asynchronously. Multiple services can react to the same events at their own pace.

Debugging: When something goes wrong, you have the complete history of what happened.
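Time travel is the easiest of these benefits to show in code: stop replaying once the events pass the moment you care about. Here is a minimal sketch, assuming events are stored in timestamp order as `(timestamp, type, amount)` tuples; that layout is an assumption made for the example.

```python
from datetime import datetime

events = [
    (datetime(2024, 1, 10), "Deposited", 2000),
    (datetime(2024, 1, 12), "Withdrawn", 300),
    (datetime(2024, 1, 15), "Withdrawn", 200),
]


def balance_as_of(events, moment):
    """Replay only the events that happened at or before `moment`."""
    balance = 0
    for timestamp, kind, amount in events:
        if timestamp > moment:
            break  # events are ordered, so everything from here on is later
        balance += amount if kind == "Deposited" else -amount
    return balance


print(balance_as_of(events, datetime(2024, 1, 13)))  # 1700
```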

Where it breaks or gets interesting

Eventual consistency

The read model is updated asynchronously after events are written. There is a brief window where the write model has the latest state but the read model does not. This is eventual consistency.

For most use cases, this is acceptable. For cases where you need immediate consistency (show the user their updated balance right after a withdrawal), read from the event store directly or use a synchronous projection update.

Event schema evolution

Events are immutable. You cannot change a past event. But your event schema will evolve over time. How do you handle old events with the old schema?

Options: upcasting (transform old events to the new schema when loading), versioned events (keep old event types, add new ones), or event migration (rewrite old events - risky).

The safest approach: design events to be backward compatible. Add fields, never remove. Use optional fields with defaults.
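Upcasting can be as small as a function applied to each event as it is loaded. The sketch below assumes a hypothetical schema change in which version 2 of the events added a `currency` field; the field name and the `"USD"` default are made up for illustration.

```python
def upcast(event):
    """Return the event in the latest schema without touching the stored copy."""
    version = event.get("version", 1)
    if version == 1:
        # Hypothetical v2 change: a `currency` field was added with a default.
        event = {**event, "version": 2, "currency": "USD"}
    return event


stored = [
    {"type": "MoneyDeposited", "amount": 2000},  # written before versions existed
    {"type": "MoneyWithdrawn", "amount": 300, "version": 2, "currency": "EUR"},
]

loaded = [upcast(e) for e in stored]
print(loaded[0]["currency"], loaded[1]["currency"])  # USD EUR
```

The stored events stay exactly as they were written. Only the in-memory copy handed to the rest of the code is in the new shape.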

Snapshots for performance

Replaying 10 years of events to get the current state is slow. Use snapshots: periodically save the current state as a snapshot. When loading an entity, load the latest snapshot and replay only events after the snapshot.

Store snapshots alongside events. Snapshot every N events or every N hours.
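Here is a minimal sketch of snapshotting every N events, assuming a hypothetical in-memory `AccountStream` and a deliberately tiny interval so the example triggers a snapshot. A real system would persist the snapshot next to the events.

```python
SNAPSHOT_EVERY = 2  # hypothetical interval; real systems use far larger values


def apply(balance, event):
    kind, amount = event
    return balance + amount if kind == "Deposited" else balance - amount


class AccountStream:
    def __init__(self):
        self.events = []        # append-only list of (type, amount)
        self.snapshot = (0, 0)  # (number of events covered, balance at that point)

    def load(self):
        """Return (balance, number of events actually replayed)."""
        covered, balance = self.snapshot
        tail = self.events[covered:]
        for event in tail:
            balance = apply(balance, event)
        return balance, len(tail)

    def append(self, event):
        self.events.append(event)
        if len(self.events) % SNAPSHOT_EVERY == 0:
            balance, _ = self.load()
            self.snapshot = (len(self.events), balance)


stream = AccountStream()
for event in [("Deposited", 2000), ("Withdrawn", 300), ("Withdrawn", 200)]:
    stream.append(event)

print(stream.load())  # (1500, 1)
```

Only one event is replayed on load, because the snapshot already covers the first two. The snapshot is a cache: it can be discarded and rebuilt from the events at any time.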

The event store

The event store is the source of truth. It must be:

  • Append-only - Events are never modified or deleted
  • Ordered - Events for an entity are in order
  • Durable - Events survive crashes
  • Queryable - Load all events for an entity efficiently

Options: EventStoreDB (purpose-built), PostgreSQL with an events table, Kafka (with long retention), DynamoDB.
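The "events table" option is sketched below with Python's built-in `sqlite3` module standing in for PostgreSQL. The table layout and the `append` and `load` helpers are assumptions for illustration. The composite primary key on `(stream_id, version)` provides per-entity ordering and rejects two writers claiming the same version.

```python
import json
import sqlite3

# In-memory for the sketch; durability needs a file or a database server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        stream_id TEXT    NOT NULL,
        version   INTEGER NOT NULL,
        type      TEXT    NOT NULL,
        payload   TEXT    NOT NULL,
        PRIMARY KEY (stream_id, version)
    )
""")


def append(stream_id, version, event_type, payload):
    with conn:  # commits on success, rolls back if the insert fails
        conn.execute(
            "INSERT INTO events (stream_id, version, type, payload) VALUES (?, ?, ?, ?)",
            (stream_id, version, event_type, json.dumps(payload)),
        )


def load(stream_id):
    rows = conn.execute(
        "SELECT version, type, payload FROM events WHERE stream_id = ? ORDER BY version",
        (stream_id,),
    )
    return [(version, kind, json.loads(payload)) for version, kind, payload in rows]


append("account-123", 1, "Deposited", {"amount": 2000})
append("account-123", 2, "Withdrawn", {"amount": 300})
print(load("account-123"))
```

Nothing in this sketch stops an `UPDATE` or `DELETE`. In a real deployment the append-only rule is typically enforced with database permissions or triggers.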

CQRS complexity

CQRS adds complexity: two models to maintain, eventual consistency to handle, multiple projections to build and keep in sync. This complexity is justified for complex domains with many read patterns. For simple CRUD applications, CQRS is overkill.

Real-world systems

EventStoreDB (since renamed KurrentDB) - Purpose-built event store. Supports event streams, subscriptions, and projections. Often treated as a reference point for how an event store should behave.

Axon Framework - Java framework for event sourcing and CQRS. Handles event storage, command routing, and projection building.

Temporal - Workflow engine that uses event sourcing internally. Workflow state is stored as a sequence of events.

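Git - A version control system is a useful analogy for event sourcing applied to code. Each commit is immutable and history is only ever appended to. The analogy is loose: a commit stores a snapshot of the whole tree rather than a delta, so the current state is read from the latest commit instead of being replayed from the first one.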

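Kafka - Often used as the event log in event-sourced systems. Long retention periods allow replaying historical events. Multiple consumer groups build different projections. Kafka has no efficient way to load the events of a single entity and no expected-version check on append, so the command side typically needs additional machinery for those two requirements.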

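Stripe - Records payment state changes as Event objects that are exposed through its API and delivered via webhooks. Integrators consume these events to keep their own view of a payment up to date, which is the same pattern as building a projection.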

How to apply it in practice

When to use event sourcing

Use event sourcing when:

  • You need a complete audit log (financial systems, healthcare, compliance)
  • You need to replay events to rebuild state or build new projections
  • Your domain has complex business rules that benefit from explicit event modeling
  • You need temporal queries (“what was the state at time T?”)

Do not use event sourcing when:

  • Simple CRUD application with no audit requirements
  • Your team is not familiar with event-driven design
  • You need strong consistency between reads and writes everywhere, and you cannot read from the event store or update projections synchronously
  • The added complexity is not justified by the benefits

Designing events

Good events:

  • Past tense - MoneyWithdrawn, OrderShipped, UserRegistered
  • Immutable - Never modify a published event
  • Self-contained - Include all data needed to understand what happened
  • Business-meaningful - Represent domain events, not technical operations

Bad events:

  • UserUpdated (too vague - what was updated?)
  • DatabaseRowChanged (technical, not business-meaningful)
  • SetBalance (command, not event)
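In code, a well-designed event is usually a small immutable record. Here is a minimal sketch using a frozen dataclass; the `reference` and `occurred_at` fields and the sample values are hypothetical, chosen to illustrate "self-contained".

```python
from dataclasses import FrozenInstanceError, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class MoneyWithdrawn:       # past tense, business-meaningful
    account_id: int
    amount: int
    reference: str          # self-contained: says which withdrawal this was
    occurred_at: datetime


event = MoneyWithdrawn(
    account_id=123,
    amount=300,
    reference="atm-0042",
    occurred_at=datetime(2024, 1, 12, tzinfo=timezone.utc),
)

try:
    event.amount = 0        # attempts to mutate a frozen dataclass fail
except FrozenInstanceError:
    print("events cannot be modified after creation")
```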

Building projections

A projection is a read model built from events. Build projections for each query pattern:

  • Account balance projection: sum of deposits minus withdrawals
  • Transaction history projection: list of events in reverse chronological order
  • Monthly statement projection: events grouped by month

Projections can be rebuilt at any time by replaying events. This means you can add new projections without data migration.
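The three projections listed above can all be computed from one event list. This is a minimal sketch with hypothetical sample data (the last withdrawal is placed in February so the monthly grouping has two buckets) and ISO date strings as timestamps.

```python
from collections import defaultdict

events = [
    {"type": "Deposited", "amount": 2000, "timestamp": "2024-01-10"},
    {"type": "Withdrawn", "amount": 300, "timestamp": "2024-01-12"},
    {"type": "Withdrawn", "amount": 200, "timestamp": "2024-02-03"},
]


def signed(event):
    return event["amount"] if event["type"] == "Deposited" else -event["amount"]


def balance_projection(events):
    """Sum of deposits minus withdrawals."""
    return sum(signed(e) for e in events)


def history_projection(events):
    """Events in reverse chronological order (ISO dates sort as strings)."""
    return sorted(events, key=lambda e: e["timestamp"], reverse=True)


def monthly_projection(events):
    """Events grouped by "YYYY-MM"."""
    months = defaultdict(list)
    for e in events:
        months[e["timestamp"][:7]].append(e)
    return dict(months)


print(balance_projection(events))          # 1500
print(sorted(monthly_projection(events)))  # ['2024-01', '2024-02']
```

Adding a fourth projection later means writing one more function and running it over the same events.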

FAQ

Q: How do you delete data in an event-sourced system (GDPR right to erasure)?

This is a real challenge. Events are immutable, but GDPR requires the ability to delete personal data. Options: crypto-shredding (encrypt personal data with a per-user key, then delete the key to make the data unreadable), event tombstoning (append a “data deleted” event so that replay and projections omit that user's personal data; on its own this hides the data but does not physically erase it), or storing personal data separately (events contain a reference ID, and the personal data lives in a separate, deletable store). Crypto-shredding is a commonly used approach.
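The "store personal data separately" option is the simplest to sketch. In the example below, events carry only an opaque `subject_id`; the store layout and helper names are hypothetical.

```python
import uuid

personal_data = {}  # deletable store: subject_id -> personal details
events = []         # append-only; holds only the opaque subject_id


def register_user(name, email):
    subject_id = str(uuid.uuid4())
    personal_data[subject_id] = {"name": name, "email": email}
    events.append({"type": "UserRegistered", "subject_id": subject_id})
    return subject_id


def erase(subject_id):
    """Right to erasure: remove the personal data, leave the events alone."""
    personal_data.pop(subject_id, None)


def describe(event):
    details = personal_data.get(event["subject_id"])
    name = details["name"] if details else "[erased]"
    return f'{event["type"]}: {name}'


sid = register_user("Ada", "ada@example.com")
erase(sid)
print(describe(events[0]))  # UserRegistered: [erased]
```

The event stream is untouched, so replay still works. Only the link from the event to a person is gone.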

Q: What is the difference between event sourcing and an audit log?

An audit log is a secondary record of changes, added alongside the primary mutable state. Event sourcing makes the event log the primary source of truth. With an audit log, the database is the source of truth and the log is a copy. With event sourcing, the event store is the source of truth and the current state is derived from it. Event sourcing is more powerful (you can rebuild any state from events) but more complex.

Q: How do you handle concurrent writes in an event-sourced system?

Use optimistic concurrency control. When loading an entity, record the version (the number of events). When appending a new event, include the expected version. If the actual version does not match (another write happened concurrently), the store rejects the write. The caller then reloads the entity, re-validates the command against the new state, and retries. This is similar to optimistic locking in a traditional database. EventStoreDB and most event store implementations support this natively.
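Here is a minimal in-memory sketch of the expected-version check. `InMemoryEventStore` and `ConcurrencyError` are hypothetical names, and the check is not thread-safe; a real event store performs the comparison and the append atomically.

```python
class ConcurrencyError(Exception):
    pass


class InMemoryEventStore:
    def __init__(self):
        self._streams = {}

    def load(self, stream_id):
        """Return (events, version), where version is the number of events."""
        events = list(self._streams.get(stream_id, []))
        return events, len(events)

    def append(self, stream_id, event, expected_version):
        stream = self._streams.setdefault(stream_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, found {len(stream)}"
            )
        stream.append(event)


store = InMemoryEventStore()
store.append("account-123", ("Deposited", 2000), expected_version=0)

# Two writers load the same state, so both see version 1.
_, seen_by_a = store.load("account-123")
_, seen_by_b = store.load("account-123")

store.append("account-123", ("Withdrawn", 300), expected_version=seen_by_a)

try:
    store.append("account-123", ("Withdrawn", 200), expected_version=seen_by_b)
except ConcurrencyError:
    # Reload, re-check the business rule against the fresh state, then retry.
    _, fresh = store.load("account-123")
    store.append("account-123", ("Withdrawn", 200), expected_version=fresh)
```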

Interview questions

Q1: Design an event-sourced bank account. What events would you define and how would you derive the current balance?

Strong answer: Events: AccountOpened(account_id, owner_id, initial_balance), MoneyDeposited(account_id, amount, reference), MoneyWithdrawn(account_id, amount, reference), AccountFrozen(account_id, reason), AccountClosed(account_id). To derive the current balance: load all events for the account and apply them in order. AccountOpened sets the initial balance. MoneyDeposited adds to the balance. MoneyWithdrawn subtracts. AccountFrozen does not change the balance but affects whether withdrawals are allowed. For performance, take a snapshot every 100 events. When loading, start from the latest snapshot and replay only subsequent events. For the read model (CQRS): maintain an account_balances table updated by an event handler. Queries read from this table. The event store is the source of truth; the read model is a derived view.
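The apply-in-order logic from this answer is sketched below. Events are plain dictionaries and the `Account` state class is a hypothetical shape. Snapshots and the read model are left out to keep the focus on how each event type changes state.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: int = 0
    frozen: bool = False
    closed: bool = False


def apply(state, event):
    """Mutate the in-memory state for one event; the event itself is never changed."""
    kind = event["type"]
    if kind == "AccountOpened":
        state.balance = event["initial_balance"]
    elif kind == "MoneyDeposited":
        state.balance += event["amount"]
    elif kind == "MoneyWithdrawn":
        state.balance -= event["amount"]
    elif kind == "AccountFrozen":
        state.frozen = True      # balance unchanged, withdrawals blocked
    elif kind == "AccountClosed":
        state.closed = True
    else:
        raise ValueError(f"unknown event type: {kind}")
    return state


def load(events):
    state = Account()
    for event in events:
        apply(state, event)
    return state


def can_withdraw(state, amount):
    return not state.frozen and not state.closed and amount <= state.balance


history = [
    {"type": "AccountOpened", "initial_balance": 100},
    {"type": "MoneyDeposited", "amount": 50},
    {"type": "AccountFrozen", "reason": "suspicious activity"},
]
account = load(history)
print(account.balance, can_withdraw(account, 10))  # 150 False
```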

Q2: You are using event sourcing. A bug in your code caused incorrect events to be written for the last 2 hours. How do you fix this?

Strong answer: Events are immutable - you cannot delete or modify them. Options: append correcting events (if the domain supports it - e.g., BalanceCorrected event), rebuild the read model from scratch ignoring the buggy events (if you can identify them by timestamp or a flag), or use event versioning to mark the buggy events as invalid and skip them during replay. The cleanest approach: append correcting events that explicitly undo the incorrect state. This preserves the full history (including the bug and the correction) which is valuable for auditing. Rebuild all affected projections from the corrected event stream. For the future: add validation in the command handler to prevent invalid events from being written.

Q3: How does CQRS help with scalability?

Strong answer: CQRS separates the write model (optimized for consistency and business rules) from the read model (optimized for query performance). This allows independent scaling. The write side handles commands sequentially (or with optimistic concurrency) - it does not need to scale as aggressively because writes are less frequent than reads. The read side can be scaled horizontally: multiple read replicas, multiple projections optimized for different query patterns, caching at the read model level. You can have a PostgreSQL write store (ACID, consistent) and an Elasticsearch read model (fast full-text search) and a Redis read model (fast key-value lookups) - all built from the same event stream. Each read model is optimized for its specific query pattern. This is much more flexible than trying to optimize one database for all access patterns.