Consistency Models: The Spectrum Between Perfect and Fast


You post a photo on Instagram. You refresh your profile. The photo is gone. You refresh again. It is back. You show your friend on their phone. They don’t see it yet. Ten seconds later, everyone sees it.

That experience - where different observers see different states of the world at the same time - is not a bug. It is a deliberate consistency model choice. Instagram’s feed is eventually consistent. Your bank account is not. Understanding why, and what the options are between those two extremes, is what consistency models are about.

The core problem

In a distributed system, data lives on multiple nodes. When you write to one node, the other nodes need to be updated. The question is: what guarantees do you make about what readers see during and after that update process?

The answer is your consistency model. It is a contract between the system and its users about the ordering and visibility of operations.

The consistency spectrum

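From strongest to weakest. The ordering is approximate at the lower end: read-your-writes and monotonic reads are independent session guarantees, both implied by causal consistency, and neither implies the other.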

graph LR
LI["Linearizability<br/>(Strict)"] --> SE["Sequential<br/>Consistency"] --> CA["Causal<br/>Consistency"] --> RE["Read Your<br/>Writes"] --> MO["Monotonic<br/>Reads"] --> EV["Eventual<br/>Consistency"]

style LI fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style SE fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style CA fill:#F1EFE8,stroke:#888780,color:#444441
style RE fill:#F1EFE8,stroke:#888780,color:#444441
style MO fill:#E1F5EE,stroke:#0F6E56,color:#085041
style EV fill:#E1F5EE,stroke:#0F6E56,color:#085041

Linearizability (strict consistency)

The strongest model. Every operation appears to take effect instantaneously at some point between its invocation and completion. All observers see operations in the same order, and that order is consistent with real time.

If you write X=1 and the write completes, any subsequent read from any node must return 1. There is no window where a read can return the old value after the write has completed.

This is what you get from a single-node database. In a distributed system, achieving it requires coordination - typically a consensus protocol like Raft or Paxos. The cost is latency: every write must be confirmed by a quorum before it is visible.

Systems: etcd, ZooKeeper (writes), Google Spanner, CockroachDB (single-key operations)

Sequential consistency

Weaker than linearizability. All operations appear to execute in some sequential order, and each process’s operations appear in that sequence in the order they were issued. But the sequence doesn’t have to match real time.

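If process A writes X=1 and then X=2, no observer will see X=2 followed by X=1. Concurrent writes from process A and process B can be interleaved in any way, but every observer must agree on the same interleaving. What sequential consistency gives up is real time: a read that starts after a write has completed may still return the old value, as long as the result fits some single order that respects each process’s own program order.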

Systems: Some multi-processor memory models, older distributed databases
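
The definition can be checked by brute force on tiny histories: search for one interleaving that keeps each process’s operations in program order and explains every read. A minimal sketch follows; the history format and function names are assumptions made for illustration.

```python
from itertools import permutations

# A history maps each client to its operations in program order.
# An operation is (kind, key, value): "w" wrote value, "r" observed value.
# Every key starts at 0.

def respects_program_order(order):
    """Each client's operations must appear in the order it issued them."""
    next_index = {}
    for client, i in order:
        if i != next_index.get(client, 0):
            return False
        next_index[client] = i + 1
    return True

def reads_are_legal(order, history):
    """Replay the interleaving; every read must return the latest write."""
    store = {}
    for client, i in order:
        kind, key, value = history[client][i]
        if kind == "w":
            store[key] = value
        elif store.get(key, 0) != value:
            return False
    return True

def is_sequentially_consistent(history):
    """Brute force over all interleavings; only suitable for tiny histories."""
    ops = [(client, i) for client, client_ops in history.items()
           for i in range(len(client_ops))]
    return any(respects_program_order(order) and reads_are_legal(order, history)
               for order in permutations(ops))

# B reads the old value, even if A's write finished first in real time.
stale_read = {"A": [("w", "X", 1)], "B": [("r", "X", 0)]}
# B observes A's two writes in the wrong order.
reordered = {"A": [("w", "X", 1), ("w", "X", 2)],
             "B": [("r", "X", 2), ("r", "X", 1)]}

print(is_sequentially_consistent(stale_read))  # True
print(is_sequentially_consistent(reordered))   # False
```

The first history passes because sequential consistency ignores real time. The second fails because no single order can place X=2 before X=1 without reordering process A’s own writes.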

Causal consistency

Weaker still. Operations that are causally related must be seen in causal order by all nodes. Concurrent operations (with no causal relationship) can be seen in any order.

If you post a comment and then reply to it, all observers must see the original comment before the reply. But two unrelated posts from different users can appear in any order.

This is a practical sweet spot for many social applications. It preserves the “makes sense” ordering without requiring global coordination.

Systems: MongoDB (causal sessions), COPS, Bolt-on causal consistency
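
One common way to tell causally related operations from concurrent ones is a vector clock: each operation carries a per-node counter map, and one operation precedes another only if its clock is less than or equal in every entry. A minimal sketch follows; the `compare` function and the example clocks are assumptions for illustration, not any particular system’s API.

```python
def compare(a, b):
    """Classify two vector clocks (dicts of node -> counter).

    Missing entries count as zero.
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a causally precedes b
    if b_le_a:
        return "after"       # b causally precedes a
    return "concurrent"      # no causal relationship

comment = {"alice": 1}                 # Alice posts a comment
reply = {"alice": 1, "bob": 1}         # Bob replies after seeing it
unrelated = {"carol": 1}               # Carol posts without seeing either

print(compare(comment, reply))      # before
print(compare(comment, unrelated))  # concurrent
```

A causally consistent store must deliver `comment` before `reply` everywhere, but is free to show `unrelated` at any position relative to them.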

Read-your-writes (read-your-own-writes)

A session guarantee. After you write a value, you will always see that value or a newer one in subsequent reads - within the same session. Other users might still see the old value.

This is the minimum you need to avoid the “I just posted something and now it’s gone” experience. It is weaker than causal consistency because it only applies to your own writes, not to writes you’ve observed.

Systems: Most databases with sticky sessions or session tokens

Monotonic reads

Once you’ve read a value, you will never read an older value. Reads are non-decreasing in terms of the data version you see.

Without this, you can see a post, refresh, and see it disappear - because your second read hit a replica that hadn’t received the write yet. Monotonic reads prevents this.

Systems: Achieved by routing reads from the same client to the same replica
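
Sticky routing is one option. Another, sketched here as an assumption rather than a description of any specific product, is for the client to remember the highest version it has seen and refuse answers from replicas that are behind it. The `Replica` and `MonotonicClient` classes are hypothetical.

```python
class Replica:
    """A replica holding one value and the version it has applied."""
    def __init__(self, version, value):
        self.version = version
        self.value = value

class MonotonicClient:
    """Tracks the newest version seen and skips replicas that lag behind it."""
    def __init__(self, replicas):
        self.replicas = replicas
        self.last_seen = 0

    def read(self):
        for replica in self.replicas:
            if replica.version >= self.last_seen:
                self.last_seen = replica.version
                return replica.value
        raise RuntimeError("no replica has caught up to the last version read")

fresh = Replica(version=2, value="post visible")
stale = Replica(version=1, value="no post yet")

client = MonotonicClient([fresh, stale])
first = client.read()              # served by the fresh replica

client.replicas = [stale, fresh]   # a load balancer now prefers the stale one
second = client.read()             # the stale replica is skipped

print(first, "|", second)          # post visible | post visible
```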

Eventual consistency

The weakest model. Given no new writes, all replicas will eventually converge to the same value. No guarantees about when, and no guarantees about what you see in the meantime.

In practice, “eventually” is usually milliseconds to seconds under normal conditions. But under partition or heavy load, it can be much longer.

Systems: Cassandra (default), DynamoDB (default reads), DNS

graph TB
subgraph linear["Linearizable Write"]
  LC1["Client 1"] -->|"Write X=1"| LN1["Node 1"]
  LN1 -->|"Sync replicate"| LN2["Node 2"]
  LN2 -->|"ACK"| LN1
  LN1 -->|"Write complete"| LC1
  LC2["Client 2"] -->|"Read X"| LN2
  LN2 -->|"X=1 guaranteed"| LC2
end

subgraph eventual["Eventually Consistent Write"]
  EC1["Client 1"] -->|"Write X=1"| EN1["Node 1"]
  EN1 -->|"Write complete"| EC1
  EC2["Client 2"] -->|"Read X immediately"| EN2["Node 2 stale"]
  EN2 -->|"X=0 stale"| EC2
  EN1 -.->|"Async replicate later"| EN2
  EC2b["Client 2 later"] -->|"Read X"| EN2b["Node 2 synced"]
  EN2b -->|"X=1 converged"| EC2b
end

style LN1 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style LN2 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style EN1 fill:#E1F5EE,stroke:#0F6E56,color:#085041
style EN2 fill:#FAEEDA,stroke:#854F0B,color:#633806
style EN2b fill:#E1F5EE,stroke:#0F6E56,color:#085041

Where it breaks or gets interesting

The “C” in CAP is linearizability, not ACID consistency

This is one of the most common confusions in distributed systems. CAP’s consistency means linearizability - a specific, strong model. ACID’s consistency means the database maintains its invariants (constraints, foreign keys). They are completely different things. A database can be ACID consistent but not linearizable.

Eventual consistency doesn’t mean “eventually correct”

If your conflict resolution is wrong, replicas can converge to an incorrect value. Last-write-wins (LWW) with unsynchronized clocks can silently discard writes: a write stamped by a node whose clock runs fast beats any write stamped with a lower time, even one that happened later in real time.
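
The failure is easy to reproduce. In this sketch the timestamps, the five-unit skew, and the `lww_merge` helper are all made up for illustration.

```python
def lww_merge(a, b):
    """Keep whichever (timestamp, value) pair carries the larger timestamp."""
    return a if a[0] >= b[0] else b

# Real time 100: node 1, whose clock runs 5 units fast, stamps its write 105.
first_write = (105, "address=old street")
# Real time 103: node 2, with an accurate clock, stamps a later write 103.
second_write = (103, "address=new street")

winner = lww_merge(first_write, second_write)
print(winner[1])  # address=old street
```

Every replica converges on the same value, so the system is eventually consistent, yet the user’s most recent update is gone and nothing reports an error.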

The “anomaly zoo”

Different consistency models allow different anomalies. The best-known catalogue comes from transaction isolation levels:

  • Dirty read - Reading uncommitted data from another transaction
  • Non-repeatable read - The same row returns different values within one transaction
  • Phantom read - A query returns different rows within one transaction because another transaction inserted/deleted rows
  • Write skew - Two transactions read overlapping data, make decisions, write to non-overlapping data, violating a constraint
  • Lost update - Two transactions read a value, both modify it, one overwrites the other’s change

Stronger consistency models prevent more anomalies. Weaker models allow more anomalies in exchange for performance.
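
Lost update is the easiest of these to reproduce, and a version check (compare-and-set) is one common way to prevent it. A minimal single-process sketch, where `VersionedRegister` is a hypothetical stand-in for a row with a version column:

```python
class VersionedRegister:
    """A value plus a version number that increments on every write."""
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def blind_write(self, value):
        self.value = value
        self.version += 1

    def compare_and_set(self, value, expected_version):
        """Write only if nobody else has written since we read."""
        if self.version != expected_version:
            return False
        self.value = value
        self.version += 1
        return True

# Lost update: both transactions read 100, then overwrite each other.
balance = VersionedRegister(100)
a_val, _ = balance.read()
b_val, _ = balance.read()
balance.blind_write(a_val + 10)
balance.blind_write(b_val + 20)
lost_result = balance.value            # 120: the +10 has vanished

# Same interleaving with compare-and-set: the second writer must retry.
balance = VersionedRegister(100)
a_val, a_ver = balance.read()
b_val, b_ver = balance.read()
balance.compare_and_set(a_val + 10, a_ver)
if not balance.compare_and_set(b_val + 20, b_ver):
    b_val, b_ver = balance.read()      # re-read the fresh value
    balance.compare_and_set(b_val + 20, b_ver)
safe_result = balance.value            # 130: both updates applied

print(lost_result, safe_result)        # 120 130
```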

Consistency is per-operation, not per-system

DynamoDB lets you choose per read: eventually consistent (cheaper, faster) or strongly consistent (more expensive, slower). Cassandra lets you set consistency level per operation. A single system can offer multiple consistency models depending on what you need for each operation.
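
In quorum-replicated stores such as Cassandra, the per-operation level comes down to how many replicas must answer. With N replicas, W write acknowledgements, and R replicas consulted per read, a read is sure to touch at least one replica holding the latest acknowledged write only when R + W > N. The sketch below checks that by enumeration; `quorums_always_overlap` is a name invented for this example.

```python
from itertools import combinations

def quorums_always_overlap(n, w, r):
    """True if every possible write set shares a replica with every read set."""
    replicas = range(n)
    return all(set(write_set) & set(read_set)
               for write_set in combinations(replicas, w)
               for read_set in combinations(replicas, r))

# Three replicas, ONE for both writes and reads: a read can miss the write.
print(quorums_always_overlap(n=3, w=1, r=1))  # False
# Three replicas, QUORUM (2 of 3) for both: every read meets the write.
print(quorums_always_overlap(n=3, w=2, r=2))  # True
```

Overlap is necessary for a read to observe the latest acknowledged write, but it does not by itself make a system linearizable; concurrent writes and partially applied writes still need handling.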

Real-world systems and their models

Linearizable:

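  • etcd - Uses Raft. Reads are linearizable by default and are confirmed with the leader; clients can opt into faster serializable reads that may return stale data.
  • ZooKeeper - Linearizable writes. Reads are served by whichever server the client is connected to and can be stale; calling sync() before a read brings that server up to date first.
  • Google Spanner - External consistency (linearizability extended to whole transactions, also called strict serializability) using TrueTime.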

Causal:

  • MongoDB causal sessions - Within a session, reads reflect all previous writes in that session.
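  • Amazon DynamoDB global tables - Eventually consistent across Regions by default, with last-writer-wins conflict resolution. Condition expressions are evaluated against the local Region’s copy, so they help order writes within a Region; causal ordering across Regions needs application-level versioning.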

Eventual:

  • Cassandra - Default consistency is ONE (one replica must acknowledge). Eventual across replicas.
  • DynamoDB - Default reads are eventually consistent. Strongly consistent reads available.
  • DNS - TTL-based caching. Changes propagate over minutes to hours.
  • CDN edge caches - Content is eventually consistent with origin. Cache invalidation is the hard part.

How to apply it in practice

The right consistency model depends on what anomalies your application can tolerate.

Questions to ask:

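  1. Can two users see different states simultaneously? If yes, eventual consistency is viable. If they may diverge briefly but must see related updates in the same order (e.g., a shared document), you need at least causal consistency. If they must never diverge, you need linearizability.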

  2. Can a user see their own write disappear? If no, you need read-your-writes at minimum.

  3. Can a user see events out of order? If no (e.g., a chat thread), you need causal consistency.

  4. Can two concurrent operations produce an incorrect combined result? If no (e.g., two users both spending from the same account), you need linearizability or serializable transactions.

graph TB
subgraph strong["Strong Consistency Needed"]
  FIN["Financial transactions"]
  INV["Inventory management"]
  LOCK["Distributed locks"]
  COORD["Leader election"]
end

subgraph causal["Causal Consistency Sufficient"]
  CHAT["Chat messages"]
  COMM["Comment threads"]
  FEED["Social feeds"]
  COLLAB["Collaborative docs"]
end

subgraph eventual["Eventual Consistency Sufficient"]
  DNS2["DNS records"]
  CDN2["CDN content"]
  ANAL["Analytics counters"]
  SEARCH["Search indexes"]
end

style strong fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style causal fill:#F1EFE8,stroke:#888780,color:#444441
style eventual fill:#E1F5EE,stroke:#0F6E56,color:#085041

Practical patterns

Session tokens for read-your-writes: After a write, return a token representing the write’s position in the replication log. On subsequent reads, pass the token. The read replica checks whether it has caught up to that position before serving the read. If not, it either waits or routes to a replica that has. MongoDB’s causally consistent sessions work along similar lines, with the session tracking a cluster time that later reads must wait for.
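
A minimal in-memory sketch of the token check. The `Replica` class and the `write` and `read` helpers are hypothetical, and this version routes to another replica rather than waiting.

```python
class Replica:
    """Holds the writes it has applied, in replication order."""
    def __init__(self):
        self.log = []

    def apply(self, key, value):
        self.log.append((key, value))

    @property
    def position(self):
        return len(self.log)

    def get(self, key):
        for k, v in reversed(self.log):
            if k == key:
                return v
        return None

def write(primary, key, value):
    """Apply the write and return the log position as the session token."""
    primary.apply(key, value)
    return primary.position

def read(replicas, key, token):
    """Serve from the first replica that has caught up to the token."""
    for replica in replicas:
        if replica.position >= token:
            return replica.get(key)
    raise RuntimeError("no replica has caught up to the session token")

primary, follower = Replica(), Replica()
token = write(primary, "photo", "beach.jpg")

# The follower has not replicated yet, so the read falls through to the primary.
before_sync = read([follower, primary], "photo", token)

follower.apply("photo", "beach.jpg")   # asynchronous replication arrives
after_sync = read([follower], "photo", token)

print(before_sync, after_sync)         # beach.jpg beach.jpg
```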

Fencing tokens for distributed locks: When a lock is acquired, issue a monotonically increasing token. Any operation using the lock must include the token. If a stale lock holder tries to write with an old token, the storage system rejects it. This prevents split-brain scenarios where two nodes both think they hold the lock.
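
The check on the storage side is small. In this sketch `LockService` and `FencedStorage` are hypothetical classes, and lease expiry is not modeled: the second `acquire` call simply stands for a new holder taking over after the first holder’s lease has lapsed.

```python
class LockService:
    """Hands out a strictly increasing token with every lock grant."""
    def __init__(self):
        self._next_token = 0

    def acquire(self):
        self._next_token += 1
        return self._next_token

class FencedStorage:
    """Rejects writes that carry a token older than the newest one seen."""
    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, token, key, value):
        if token < self.highest_token:
            return False               # stale lock holder: refuse the write
        self.highest_token = token
        self.data[key] = value
        return True

lock = LockService()
storage = FencedStorage()

token_1 = lock.acquire()   # client 1 gets the lock, then stalls (e.g. a long GC pause)
token_2 = lock.acquire()   # the lease lapses and client 2 gets the lock

accepted = storage.write(token_2, "leader", "client-2")
rejected = storage.write(token_1, "leader", "client-1")   # client 1 wakes up late

print(accepted, rejected, storage.data["leader"])  # True False client-2
```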

FAQ

Q: Is linearizability the same as serializability?

No, and this trips up a lot of people. Serializability is a property of transactions - it means concurrent transactions produce results equivalent to some serial execution. Linearizability is a property of individual operations - it means each operation appears to take effect at a single point in time. Strict serializability combines both: transactions are serializable AND the serial order is consistent with real time. This is what “ACID with Serializable isolation” gives you on a single node.

Q: How does eventual consistency work in practice for a shopping cart?

Amazon’s Dynamo paper (2007) describes exactly this. The shopping cart is modeled as a set with add and remove operations. During a partition, both sides can accept adds and removes. When the partition heals, the divergent versions are merged by taking the union of their contents, so an add is never lost. The price is that a removed item can reappear (if the remove happened on one side while the other side’s version still held the item). Amazon decided this was acceptable - a cart with an extra item is better than a cart that rejects operations. The customer can remove the item again.
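
The outcome can be shown with plain Python sets. Dynamo itself tracks versions with vector clocks and leaves reconciliation to the application, so this sketch illustrates the result of a union merge rather than the implementation; `merge_carts` and the item names are made up.

```python
def merge_carts(version_a, version_b):
    """Reconcile two divergent carts by keeping every item either side holds."""
    return version_a | version_b

cart = {"book", "lamp"}

# During the partition, each side accepts a different operation.
side_a = cart - {"lamp"}    # the customer removes the lamp on one side
side_b = cart | {"mug"}     # and adds a mug on the other

merged = merge_carts(side_a, side_b)
print(sorted(merged))       # ['book', 'lamp', 'mug']
```

The mug is kept, which is the property Amazon wanted. The lamp comes back because the union cannot distinguish “removed on side A” from “never seen by side A”.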

Q: Can you mix consistency models in one application?

Yes, and you should. Use strong consistency for operations where correctness is critical (payments, inventory). Use eventual consistency for operations where performance matters more than perfect accuracy (view counts, recommendation scores, activity feeds). The key is being explicit about which model each piece of data uses, and designing the application to handle the anomalies that each model allows.

Interview questions

Q1: You’re building a collaborative document editor (like Google Docs). What consistency model do you need and why?

Strong answer: You need at minimum causal consistency. If user A types “Hello” and then user B types “ World” in response, all observers must see “Hello” before “ World”. Pure eventual consistency would allow observers to see “ World” before “Hello”, which is confusing. However, you don’t need linearizability - it’s acceptable for two users to see slightly different states of the document momentarily, as long as they converge. The practical implementation uses Operational Transformation (OT) or CRDTs to merge concurrent edits. Google Docs uses OT. CRDTs (like the YATA algorithm used by Yjs) are increasingly popular because they don’t require a central server to coordinate.

Q2: Your distributed cache is eventually consistent. A user updates their profile picture. They immediately refresh and see the old picture. How do you fix this without making the cache strongly consistent?

Strong answer: Implement read-your-writes using a write token. When the profile picture update is saved, return a version token (could be a timestamp or a sequence number). Store this token in the user’s session. On the next profile read, pass the token to the cache. The cache checks if it has a version at least as recent as the token. If yes, serve from cache. If no, bypass the cache and read from the primary database, then update the cache. This gives the user read-your-writes consistency without requiring all reads to be strongly consistent. The overhead is only for the user who just made the write, not for all users reading that profile.

Q3: Explain the difference between a linearizable system and a sequentially consistent system with a concrete example.

Strong answer: Consider two clients, A and B, and a variable X that starts at 0. Client A writes X=1 and the write completes. A then tells B over some other channel (a message, a phone call) that the write is done, and B reads X. In a linearizable system, B must read 1, because the read started after the write completed in real time. In a sequentially consistent system, B could still read 0. The order “B reads X=0, then A writes X=1” is a valid sequential order (each client’s own operations stay in program order), and sequential consistency does not require that order to match real time. What sequential consistency does rule out is seeing one client’s writes out of order: if A writes X=1 then Y=1 and B reads Y=1, B’s next read of X must return 1 under both models. The real-time gap matters for things like distributed locks: if a client releases a lock by writing a flag and then notifies a waiting client out of band, a sequentially consistent system can still show the waiting client the old flag value, even though the release finished first in real time.