Lamport and Vector Clocks: Ordering Events Without a Global Clock


Two users edit the same document simultaneously. User A’s change arrives at server 1 at 10:00:01. User B’s change arrives at server 2 at 10:00:01. Same timestamp. Which change happened first? You cannot tell from wall clock time alone - the clocks on the two servers might be slightly out of sync.

In a distributed system, there is no global clock. Physical clocks drift. Network delays are unpredictable. You cannot rely on timestamps to determine the order of events. Lamport clocks and vector clocks solve this problem.

The problem: no global time

In a single-process system, events have a natural order: the order they execute. In a distributed system, events on different nodes happen concurrently. There is no shared clock that all nodes agree on.

Physical clocks (NTP-synchronized) are close but not perfect. Two events with the same timestamp might have happened in any order. Two events with different timestamps might have happened in the opposite order if the clocks are slightly out of sync.

What we need: a way to determine the causal order of events. If event A caused event B (A happened before B), we need to know that. If A and B are concurrent (neither caused the other), we need to know that too.

Lamport timestamps

Leslie Lamport (1978) defined the “happened-before” relation and a logical clock to capture it.

The happened-before relation (->):

  • If A and B are events in the same process and A occurs before B, then A -> B
  • If A is the sending of a message and B is the receipt of that message, then A -> B
  • If A -> B and B -> C, then A -> C (transitivity)

If neither A -> B nor B -> A, then A and B are concurrent.

Lamport clock rules:

  1. Each process maintains a counter, initially 0
  2. Before each event, increment the counter
  3. When sending a message, include the current counter value
  4. When receiving a message, set the counter to max(local, received) + 1

Property: If A -> B, then timestamp(A) < timestamp(B). But the converse is not true: timestamp(A) < timestamp(B) does not mean A -> B. Lamport timestamps give a total order but cannot distinguish causality from coincidence.

graph LR
subgraph lamport["Lamport Timestamps"]
  subgraph P1["Process 1"]
    E1["Event A
L=1"]
    E2["Send msg
L=2"]
    E3["Event C
L=4"]
  end
  subgraph P2["Process 2"]
    E4["Receive msg
L=max(0,2)+1=3"]
    E5["Event D
L=4"]
  end
  E1 --> E2
  E2 -->|"msg with L=2"| E4
  E4 --> E5
  E2 --> E3
end

style E1 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style E2 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style E3 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style E4 fill:#E1F5EE,stroke:#0F6E56,color:#085041
style E5 fill:#E1F5EE,stroke:#0F6E56,color:#085041

Vector clocks

Vector clocks (Fidge, Mattern, 1988) extend Lamport clocks to detect concurrent events.

Instead of a single counter, each process maintains a vector of counters - one per process. V[i] is the number of events process i has had that process j knows about.

Vector clock rules:

  1. Each process i maintains a vector V where V[i] starts at 0
  2. Before each event, increment V[i]
  3. When sending a message, include the current vector V
  4. When receiving a message with vector W, set V[j] = max(V[j], W[j]) for all j, then increment V[i]

Comparing vector clocks:

  • V1 < V2 (V1 happened before V2) if V1[i] <= V2[i] for all i, and V1[j] < V2[j] for some j
  • V1 and V2 are concurrent if neither V1 < V2 nor V2 < V1
graph LR
subgraph vector["Vector Clocks - 3 Processes"]
  subgraph PA["Process A"]
    A1["Event
[1,0,0]"]
    A2["Send to B
[2,0,0]"]
    A3["Receive from B
[3,2,0]"]
  end
  subgraph PB["Process B"]
    B1["Receive from A
[2,1,0]"]
    B2["Event
[2,2,0]"]
    B3["Send to A
[2,3,0]"]
  end
  subgraph PC["Process C"]
    C1["Event
[0,0,1]"]
    C2["Event
[0,0,2]"]
  end
  A1 --> A2
  A2 -->|"[2,0,0]"| B1
  B1 --> B2 --> B3
  B3 -->|"[2,3,0]"| A3
end

style A1 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style A2 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style A3 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
style B1 fill:#E1F5EE,stroke:#0F6E56,color:#085041
style B2 fill:#E1F5EE,stroke:#0F6E56,color:#085041
style B3 fill:#E1F5EE,stroke:#0F6E56,color:#085041
style C1 fill:#FAEEDA,stroke:#854F0B,color:#633806
style C2 fill:#FAEEDA,stroke:#854F0B,color:#633806

In this example, C1 [0,0,1] and A2 [2,0,0] are concurrent - neither happened before the other. C1 and B2 [2,2,0] are also concurrent. But A2 [2,0,0] happened before A3 [3,2,0] (A3 has higher values in all positions).

Where it breaks or gets interesting

Vector clock size

Vector clocks have one entry per process. With N processes, each clock is O(N). For large distributed systems with thousands of nodes, this is impractical.

Solutions: dotted version vectors (used by Riak), interval tree clocks, or hybrid logical clocks (HLC).

Hybrid Logical Clocks (HLC)

HLC combines physical time with logical time. The clock value is (physical_time, logical_counter). It advances with physical time when possible, and uses the logical counter to break ties.

Benefits: clock values are close to physical time (useful for debugging), but still capture causality. Used by CockroachDB and YugabyteDB.

Last-write-wins with Lamport timestamps

Cassandra uses Lamport timestamps for conflict resolution. When two writes conflict, the one with the higher timestamp wins. This is simple but can lose data: if two concurrent writes have the same timestamp (or if clocks are slightly out of sync), one write is silently discarded.

Version vectors in distributed databases

Riak and DynamoDB use version vectors (similar to vector clocks) to detect conflicts. When two replicas have conflicting versions, the system surfaces the conflict to the application for resolution. This is safer than last-write-wins but requires the application to handle conflicts.

Real-world systems

Amazon DynamoDB - Uses version vectors internally for conflict detection in multi-region replication.

Riak - Uses dotted version vectors for conflict detection. Surfaces conflicts to the application.

CockroachDB - Uses Hybrid Logical Clocks (HLC) for transaction ordering. Combines physical time with logical counters.

Google Spanner - Uses TrueTime (GPS + atomic clocks) to bound clock uncertainty. Achieves external consistency without vector clocks.

Cassandra - Uses Lamport timestamps (wall clock time) for last-write-wins conflict resolution. Simple but can lose data on clock skew.

Git - Uses a DAG (directed acyclic graph) of commits to represent causality. A merge commit has two parents, representing concurrent changes.

How to apply it in practice

When you need logical clocks

Use logical clocks when:

  • You need to determine the causal order of events across distributed nodes
  • You need to detect concurrent events (for conflict resolution)
  • You cannot rely on physical clock synchronization

Use physical clocks when:

  • You need to correlate events with real time (for debugging, auditing)
  • Clock skew is bounded and acceptable (NTP-synchronized clocks within 1ms)
  • You are using a database that handles ordering for you (Spanner, CockroachDB)

Practical conflict resolution

When vector clocks detect concurrent writes, you need a conflict resolution strategy:

  • Last-write-wins - Simple, loses data. Use when data loss is acceptable.
  • Application-level merge - Surface the conflict to the application. Use when data loss is not acceptable.
  • CRDTs - Data structures that merge automatically. Use for counters, sets, and other commutative operations.

Debugging with logical clocks

Include logical timestamps in log messages. When debugging a distributed system, you can sort log messages by logical timestamp to reconstruct the causal order of events, even if physical timestamps are slightly out of sync.

FAQ

Q: What is the difference between Lamport clocks and vector clocks?

Lamport clocks give a total order: if A -> B, then timestamp(A) < timestamp(B). But you cannot tell from timestamps alone whether A -> B or A and B are concurrent. Vector clocks give a partial order: you can determine whether A -> B, B -> A, or A and B are concurrent. Vector clocks are more powerful but more expensive (O(N) size vs O(1) for Lamport).

Q: Why does Google Spanner not need vector clocks?

Spanner uses TrueTime, which provides a bounded uncertainty interval for physical time. Every timestamp is [earliest, latest]. Spanner waits until the uncertainty interval has passed before committing a transaction. This ensures that if transaction A commits before transaction B starts, A’s commit timestamp is strictly less than B’s. This achieves external consistency without vector clocks. The tradeoff: Spanner requires GPS receivers and atomic clocks in every data center to keep clock uncertainty small (typically 1-7ms).

Q: How do CRDTs relate to vector clocks?

CRDTs (Conflict-free Replicated Data Types) are data structures designed to merge automatically without conflicts. Many CRDTs use vector clocks internally to track causality. For example, a grow-only counter (G-Counter) uses a vector where each entry is the count from one node. The total count is the sum of all entries. Merging two G-Counters takes the max of each entry. This is commutative and associative - it always produces the same result regardless of merge order.

Interview questions

Q1: Two users update the same document simultaneously on different servers. How do you detect and resolve the conflict?

Strong answer: Use vector clocks to detect the conflict. Each document version has a vector clock. When user A reads the document, they get version V1 with clock [1,0]. When user B reads the document, they also get V1 with clock [1,0]. User A writes, creating V2 with clock [2,0]. User B writes, creating V3 with clock [1,1]. When the servers sync, they compare clocks: [2,0] and [1,1] are concurrent (neither is greater in all positions). This is a conflict. Resolution options: last-write-wins (discard one version - simple but loses data), merge (combine both changes - requires application logic), or surface the conflict to the user (show both versions and ask them to resolve). For a collaborative document editor, use Operational Transformation or CRDTs to merge concurrent edits automatically.

Q2: Explain why physical timestamps are insufficient for ordering events in a distributed system.

Strong answer: Physical clocks on different machines drift and are never perfectly synchronized. NTP synchronization keeps clocks within 1-100ms of each other, but that is not precise enough for many distributed systems. If event A on server 1 has timestamp 10:00:00.001 and event B on server 2 has timestamp 10:00:00.000, you might conclude A happened after B. But if server 1’s clock is 5ms ahead of server 2’s clock, A actually happened before B. Additionally, physical timestamps do not capture causality: if A caused B (A sent a message that B received), physical timestamps might not reflect this if the message was delayed. Logical clocks (Lamport, vector) capture causality explicitly, independent of physical time.

Q3: How does CockroachDB use Hybrid Logical Clocks to achieve serializable transactions?

Strong answer: CockroachDB uses HLC to assign timestamps to transactions. Each node maintains an HLC that advances with physical time and uses a logical counter to break ties. When a transaction starts, it gets an HLC timestamp. When it commits, it checks that no other transaction with a higher timestamp has already committed to the same data (using MVCC - multi-version concurrency control). If there is a conflict, the transaction retries with a higher timestamp. The HLC ensures that if transaction A commits before transaction B starts (in real time), A’s timestamp is strictly less than B’s. This achieves serializable isolation without a global coordinator. The bounded clock uncertainty (typically 500ms) means transactions might need to wait briefly to ensure their timestamp is in the past before committing, but this is much cheaper than a global lock.