How does Google Docs handle multiple users editing at the same time?

Google Docs uses Operational Transform (OT) to handle concurrent edits. Each keystroke is converted into an operation with a position and content. When two users edit simultaneously, the server receives both operations, transforms them to account for each other's changes, and broadcasts the transformed operations back. This ensures all users see the same final document without losing any edits.

What is Operational Transform (OT)?

Operational Transform is an algorithm that resolves conflicts in collaborative editing. It transforms operations based on other concurrent operations. For example, if Alice inserts at position 5 and Bob deletes at position 3, Bob's operation shifts Alice's insert position to 4. The transformation ensures both edits are applied correctly regardless of the order they arrive.

What is the difference between OT and CRDT for collaborative editing?

OT (Operational Transform) requires a central server to order and transform operations. It's simpler to implement but has a single point of coordination. CRDT (Conflict-Free Replicated Data Types) allows each client to apply edits independently and merge them later without a central coordinator. CRDTs work better for offline-first applications but use more memory. Google Docs uses OT while tools like Figma use CRDTs.

How does Google Docs sync changes in real-time?

Google Docs maintains persistent WebSocket connections between each user's browser and Google's servers. When you type, your browser sends the operation to the server within milliseconds. The server applies the operation, transforms it against any concurrent edits, and broadcasts it to all other connected users. The entire round trip typically takes 50-200 milliseconds.

How does Google Docs handle offline editing?

Google Docs stores a local copy of the document and queues your edits when offline. Operations are saved with timestamps. When you reconnect, the client sends all queued operations to the server. The server applies them using OT to merge your offline changes with any edits others made while you were disconnected.

How does cursor position syncing work in Google Docs?

Each user's cursor position is tracked as an index in the document. When you move your cursor or select text, your browser sends the position to the server, which broadcasts it to other users. When edits happen before your cursor position, the server transforms your cursor position just like it transforms text operations, keeping the cursor in the right place relative to the text.

How does Google Docs store version history?

Google Docs stores document versions using a combination of snapshots and operation logs. Periodically, the system saves a complete snapshot of the document. Between snapshots, it logs all operations. To restore any version, the system loads the nearest snapshot and replays operations. This is more storage-efficient than saving full copies of every version.

What database does Google Docs use?

Google Docs uses Google's internal distributed storage systems. Documents are stored in Spanner (Google's globally distributed database) or Bigtable for high availability. The actual document structure is stored in a format optimized for operational transform, with indexes for fast access to any position in the document.

How Google Docs Works Behind the Scenes

Open a Google Doc. Share it with five colleagues. Start typing. Within milliseconds, everyone sees your cursor moving and characters appearing. Five people editing the same paragraph at the same time, and nothing breaks.

This is one of the most impressive pieces of engineering on the web. Google Docs serves hundreds of millions of users, handling billions of keystrokes every day, and somehow keeps everyone’s document in perfect sync.

How does it work? That’s what we’re going to break down.

The Core Problem: Concurrent Editing

Collaborative editing sounds simple until you try to build it. Here’s why it’s hard.

Imagine two users, Alice and Bob, editing the same document at the same time:

Document starts with: “Hello”
Alice types “X” at position 5 (after “Hello”)
Bob types “Y” at position 0 (before “Hello”)
Alice’s edit: “HelloX”
Bob’s edit: “YHello”

If both edits are applied naively, what’s the final document? It depends on the order:

Apply Alice first, then Bob: “YHelloX”
Apply Bob first, then Alice: “YHelloX”

Okay, that worked out. But what about this:

Document starts with: “Hello”
Alice deletes the “H” at position 0
Bob inserts “X” at position 2 (after “He”)

If we apply Alice’s delete first, the document becomes “ello”. Now Bob’s position 2 refers to a different location than he intended. His “X” ends up in the wrong place.

This is the fundamental challenge of collaborative editing: operations are defined relative to a document state, but that state keeps changing as other users make edits.

Operational Transform: The Solution

Google Docs solves this with an algorithm called Operational Transform (OT). The core idea is simple: when you receive an operation that was created against an older version of the document, you transform it to work correctly against the current version.

How OT Works

Every edit becomes an operation with three pieces of information:

Type: Insert or Delete
Position: Where in the document
Content: What to insert or delete

flowchart LR
    subgraph User Input
        K[Keystroke]
    end
    
    subgraph Operation
        O["type: insert<br/>position: 5<br/>content: 'X'"]
    end
    
    K --> O
    
    style K fill:#e0f2fe,stroke:#0284c7,stroke-width:2px
    style O fill:#fef3c7,stroke:#d97706,stroke-width:2px

When two operations conflict, OT transforms one of them. Here’s the transform logic for a simple case:

Scenario: Alice inserts at position 5. Bob inserts at position 3. Both operations were created against the same document version.

Transform:

Alice’s operation stays at position 5
Bob’s operation stays at position 3

But wait, if Bob’s insert happens first, Alice’s position 5 is now actually position 6 (because Bob added a character before it). So we transform Alice’s operation:

Alice’s transformed operation: insert at position 6

sequenceDiagram
    participant Alice
    participant Server
    participant Bob
    
    Note over Alice,Bob: Document: "Hello"
    
    Alice->>Server: Insert "X" at position 5
    Bob->>Server: Insert "Y" at position 3
    
    Note over Server: Transform operations
    Note over Server: Alice: pos 5 → 6 (Bob added before)
    Note over Server: Bob: pos 3 (unchanged)
    
    Server->>Alice: Apply Bob's "Y" at 3
    Server->>Bob: Apply Alice's "X" at 6
    
    Note over Alice,Bob: Result: "HelYloX"

Transform Rules

The transformation rules depend on the operation types:

Op 1	Op 2	Transform Rule
Insert at X	Insert at Y	If X <= Y, shift Y by 1
Insert at X	Delete at Y	If X <= Y, shift Y by 1
Delete at X	Insert at Y	If X < Y, shift Y by -1
Delete at X	Delete at Y	If X < Y, shift Y by -1

These rules ensure that no matter what order operations arrive, the final document state is the same for all users.

The Server Is the Boss

In Google Docs, there’s always a central server that decides the canonical order of operations. Here’s the flow:

flowchart TD
    subgraph Clients
        A[Alice's Browser]
        B[Bob's Browser]
        C[Carol's Browser]
    end
    
    subgraph Google
        WS[WebSocket Servers]
        CS[Collaboration Server]
        DB[(Document Storage)]
    end
    
    A -->|Operations| WS
    B -->|Operations| WS
    C -->|Operations| WS
    
    WS --> CS
    CS --> DB
    CS -->|Transformed Ops| WS
    
    WS -->|Broadcast| A
    WS -->|Broadcast| B
    WS -->|Broadcast| C
    
    style CS fill:#dcfce7,stroke:#16a34a,stroke-width:2px
    style DB fill:#fef3c7,stroke:#d97706,stroke-width:2px
    style WS fill:#e0f2fe,stroke:#0284c7,stroke-width:2px

Client sends operation to server
Server assigns a global sequence number
Server transforms the operation against any operations it has seen since the client’s last known state
Server applies the operation to its copy of the document
Server broadcasts the transformed operation to all other clients
Clients transform and apply the operation locally

The server’s version is always the source of truth.

System Architecture

Let’s zoom out and look at the complete system design.

High Level Architecture

flowchart TB
    subgraph Clients
        C1[Web Browser]
        C2[Mobile App]
        C3[Desktop App]
    end
    
    subgraph Edge
        LB[Load Balancer]
        CDN[CDN - Static Assets]
    end
    
    subgraph Real-time Layer
        WS1[WebSocket Server 1]
        WS2[WebSocket Server 2]
        WSN[WebSocket Server N]
    end
    
    subgraph Application Layer
        CS[Collaboration Service]
        DS[Document Service]
        US[User Service]
    end
    
    subgraph Data Layer
        MQ[Message Queue]
        CACHE[(Redis Cache)]
        DB[(Document DB)]
        SEARCH[(Search Index)]
    end
    
    C1 --> LB
    C2 --> LB
    C3 --> LB
    C1 --> CDN
    
    LB --> WS1
    LB --> WS2
    LB --> WSN
    
    WS1 --> CS
    WS2 --> CS
    WSN --> CS
    
    CS --> MQ
    CS --> DS
    DS --> CACHE
    DS --> DB
    DS --> SEARCH
    
    style LB fill:#e0f2fe,stroke:#0284c7,stroke-width:2px
    style CS fill:#dcfce7,stroke:#16a34a,stroke-width:2px
    style DB fill:#fef3c7,stroke:#d97706,stroke-width:2px
    style MQ fill:#fee2e2,stroke:#dc2626,stroke-width:2px

Real-Time Connection Layer

Google Docs maintains WebSocket connections with every active user. This is critical because:

Low latency: No HTTP overhead for each keystroke
Server push: Server can send updates immediately without client polling
Bidirectional: Both client and server can initiate messages

Each WebSocket server handles tens of thousands of connections. When a user opens a document, they’re assigned to a WebSocket server through the load balancer. That connection stays open for the entire editing session.

# Simplified WebSocket connection handling
class DocumentConnection:
    def __init__(self, user_id, doc_id, websocket):
        self.user_id = user_id
        self.doc_id = doc_id
        self.ws = websocket
        self.last_version = 0  # Last operation version client has seen
    
    async def handle_message(self, message):
        if message.type == 'operation':
            await self.process_operation(message.op)
        elif message.type == 'cursor':
            await self.broadcast_cursor(message.position)
    
    async def process_operation(self, op):
        # Send to collaboration service for OT processing
        transformed_op = await collaboration_service.apply(
            self.doc_id, 
            op, 
            self.last_version
        )
        # Update our version
        self.last_version = transformed_op.version

The Collaboration Service

This is where the OT magic happens. The collaboration service:

Receives operations from WebSocket servers
Maintains in-memory state of active documents
Applies OT transformations
Persists operations to storage
Broadcasts transformed operations to all connected clients

The collaboration service needs to be fast. Really fast. Every keystroke from every user goes through it. Google likely uses:

In-memory data structures for active documents
Sharding by document ID so each document lives on one server
Leader election so only one server handles a given document

flowchart LR
    subgraph Sharding
        S1[Shard 1<br/>Docs A-M]
        S2[Shard 2<br/>Docs N-Z]
    end
    
    D1[Doc Alpha] --> S1
    D2[Doc Beta] --> S1
    D3[Doc Notes] --> S2
    D4[Doc Report] --> S2
    
    style S1 fill:#e0f2fe,stroke:#0284c7,stroke-width:2px
    style S2 fill:#dcfce7,stroke:#16a34a,stroke-width:2px

Document Storage

Documents aren’t stored as plain text. They’re stored as:

Base snapshot: Complete document state at a point in time
Operation log: All operations since the last snapshot

This design has several advantages:

Version history is free: Just replay operations from any snapshot
Efficient storage: Operations are small, snapshots are periodic
Fast recovery: Load snapshot, replay recent operations

flowchart LR
    subgraph Storage
        S1[Snapshot v100]
        O1[Op 101]
        O2[Op 102]
        O3[Op 103]
        S2[Snapshot v200]
        O4[Op 201]
    end
    
    S1 --> O1 --> O2 --> O3 --> S2 --> O4
    
    style S1 fill:#fef3c7,stroke:#d97706,stroke-width:2px
    style S2 fill:#fef3c7,stroke:#d97706,stroke-width:2px
    style O1 fill:#e0f2fe,stroke:#0284c7,stroke-width:2px
    style O2 fill:#e0f2fe,stroke:#0284c7,stroke-width:2px
    style O3 fill:#e0f2fe,stroke:#0284c7,stroke-width:2px
    style O4 fill:#e0f2fe,stroke:#0284c7,stroke-width:2px

To get the current document:

Load the latest snapshot
Replay all operations after that snapshot
You now have the current state

To get version history:

Load an older snapshot
Replay operations up to the desired point
Show that version to the user

Handling Cursor Positions

When multiple people edit a document, you see their cursors moving around. This is trickier than it sounds.

The Problem

Cursor positions are just numbers (index in the document). But when someone types before your cursor, your cursor position needs to shift.

Alice’s cursor is at position 10. Bob inserts “Hello” (5 characters) at position 3. Alice’s cursor should now be at position 15.

The Solution

Cursor positions are transformed using the same OT logic as operations:

function transformCursor(cursorPosition, operation) {
    if (operation.type === 'insert') {
        if (operation.position <= cursorPosition) {
            return cursorPosition + operation.content.length;
        }
    } else if (operation.type === 'delete') {
        if (operation.position < cursorPosition) {
            return cursorPosition - operation.length;
        }
    }
    return cursorPosition;
}

The server broadcasts cursor updates frequently (every few hundred milliseconds), and clients transform incoming cursor positions against any local pending operations.

Offline Editing

Google Docs works offline. You can edit documents without an internet connection, and when you reconnect, your changes merge with whatever others have done.

How It Works

Local storage: The document is cached in IndexedDB
Operation queue: Edits are saved locally with timestamps
Reconnection: Client sends all queued operations to server
Merge: Server applies OT to merge offline changes

sequenceDiagram
    participant Client
    participant LocalDB
    participant Server
    
    Note over Client,Server: Internet disconnects
    
    Client->>LocalDB: Save edit 1
    Client->>LocalDB: Save edit 2
    Client->>LocalDB: Save edit 3
    
    Note over Client,Server: Internet reconnects
    
    Client->>Server: Send edits 1, 2, 3
    Server->>Server: Transform against edits from other users
    Server->>Client: Transformed operations
    Client->>Client: Apply merged changes
    
    Note over Client,Server: Document is synced

The tricky part is handling conflicts between offline edits. If you deleted a sentence while offline, and someone else edited that sentence, what happens? OT handles this by making delete operations “win” against edits to deleted content. The edited text disappears, but no data corruption occurs.

Scaling Challenges

Google Docs faces several scaling challenges:

1. Hot Documents

A document shared with 1000 simultaneous editors generates massive traffic. Solutions:

Dedicated server instances for popular documents
Rate limiting operations per user
Batching multiple keystrokes into single operations

2. Connection Management

Millions of persistent WebSocket connections require:

Efficient connection pooling
Graceful handling of server failures
Connection migration when servers need to restart

3. Global Latency

Users are worldwide. A user in Tokyo editing with someone in London has 200+ milliseconds round trip. Solutions:

Edge servers closer to users
Optimistic local application (show your edit immediately, confirm later)
Conflict resolution happens server-side regardless of latency

OT vs CRDT: Two Approaches

Google Docs uses OT, but there’s an alternative: Conflict-Free Replicated Data Types (CRDTs). Let’s compare:

Aspect	Operational Transform (OT)	CRDT
Central server	Required	Not required
Offline support	Complex	Natural
Memory usage	Lower	Higher
Implementation	Simpler	More complex
Used by	Google Docs, Google Sheets	Figma, Apple Notes, Notion

How CRDTs Work

Instead of transforming operations, CRDTs assign unique IDs to every character. These IDs are designed so edits can be merged in any order and always converge to the same result.

flowchart TB
    subgraph OT["Operational Transform"]
        OT1[Client 1] --> SERVER[Central Server]
        OT2[Client 2] --> SERVER
        SERVER --> OT1
        SERVER --> OT2
    end
    
    subgraph CRDT["CRDT"]
        C1[Client 1] <--> C2[Client 2]
        C1 <--> C3[Client 3]
        C2 <--> C3
    end
    
    style SERVER fill:#fef3c7,stroke:#d97706,stroke-width:2px
    style C1 fill:#dcfce7,stroke:#16a34a,stroke-width:2px
    style C2 fill:#dcfce7,stroke:#16a34a,stroke-width:2px
    style C3 fill:#dcfce7,stroke:#16a34a,stroke-width:2px

CRDTs don’t need a central server because every operation is designed to be commutative and idempotent. Insert “X” at position between “A” and “B” will always put “X” between “A” and “B”, regardless of what other operations happen.

The downside: every character carries metadata (unique ID, timestamp, tombstone flags for deletions), which uses more memory.

What Developers Can Learn

Building Google Docs teaches several patterns that apply broadly:

1. Optimistic Updates

Don’t wait for the server. Apply changes locally, then reconcile with the server response. This makes the UI feel instant.

// Bad: Wait for server
async function handleKeypress(char) {
    await server.insert(char);
    updateUI();
}

// Good: Optimistic update
async function handleKeypress(char) {
    updateUILocally(char);  // Instant feedback
    const result = await server.insert(char);
    if (result.transformed) {
        reconcileUI(result);  // Fix if needed
    }
}

2. Event Sourcing

Store operations, not just final state. This gives you:

Complete history
Easy undo/redo
Debugging capability
Time travel

This pattern is used in many systems beyond docs: banking, order management, and any system where audit trails matter.

3. Conflict Resolution Strategies

When building any collaborative system, you need a conflict resolution strategy:

Last write wins: Simple but loses data
Merge: Try to keep both changes (OT/CRDT)
Lock: Prevent concurrent edits
Branch: Let users resolve conflicts manually (like Git)

Google Docs uses merge (OT). Git uses branch. Different tradeoffs for different use cases.

4. Connection Management

Any real-time application needs robust connection handling:

Automatic reconnection with exponential backoff
Heartbeats to detect dead connections
Message queuing during disconnection
State synchronization after reconnection

For a deep dive into real-time connections, check out WebSockets Explained.

Building Your Own Collaborative Editor

If you want to build something similar, here are your options:

Existing Libraries

Yjs (CRDT): Popular JavaScript library, works with many editors
Automerge (CRDT): JSON-like CRDT, good for structured data
ShareDB (OT): Operational transform library from the creators of Google Wave

Roll Your Own

If you must build from scratch:

Start with a simple text model (array of characters)
Implement insert and delete operations
Build transform functions for each operation pair
Add a central server for ordering
Add WebSocket connections for real-time sync
Add local persistence for offline support
Add cursor synchronization
Add undo/redo using operation history

This is weeks of work minimum. Use a library if you can.

Key Takeaways

OT is the core algorithm: Operational Transform allows concurrent edits by transforming operations to account for concurrent changes.
Central server provides ordering: All operations pass through a central server that assigns sequence numbers and resolves conflicts.
WebSockets enable real-time: Persistent connections eliminate HTTP overhead for each keystroke.
Optimistic updates feel fast: Apply changes locally first, then reconcile with the server.
Operations, not state: Storing operations instead of document snapshots enables version history and efficient sync.
Cursors need transformation too: Cursor positions must be transformed when edits happen before them.
Offline is hard but doable: Queue operations locally, replay and transform on reconnection.
CRDTs are an alternative: For peer-to-peer or offline-first apps, CRDTs may be better than OT.

Further Reading:

How Stock Brokers Push 1 Million Price Updates Per Second - Another real-time system design
WebSockets Explained - Deep dive into the underlying connection protocol
How Kafka Works - Message queues for high-throughput systems
Jupiter Collaboration System - Academic paper on OT from Xerox PARC
Yjs Documentation - Popular CRDT library for collaborative editing

Building real-time collaborative features? The same patterns that power Google Docs work for chat applications, multiplayer games, and any system where multiple users need to see changes instantly. Start simple, optimize later, and always handle the offline case.

How Google Docs Works Behind the Scenes

How Google Docs syncs edits from millions of users typing at the same time without losing a single character

The Core Problem: Concurrent Editing

Operational Transform: The Solution

How OT Works

Transform Rules

The Server Is the Boss

System Architecture

High Level Architecture

Real-Time Connection Layer

The Collaboration Service

Document Storage

Handling Cursor Positions

The Problem

The Solution

Offline Editing

How It Works

Scaling Challenges

1. Hot Documents

2. Connection Management

3. Global Latency

OT vs CRDT: Two Approaches

How CRDTs Work

What Developers Can Learn

1. Optimistic Updates

2. Event Sourcing

3. Conflict Resolution Strategies

4. Connection Management

Building Your Own Collaborative Editor

Existing Libraries

Roll Your Own

Key Takeaways

Subscribe via RSS Feed

More from System design

The Core Problem: Concurrent Editing

Operational Transform: The Solution

How OT Works

Transform Rules

The Server Is the Boss

System Architecture

High Level Architecture

Real-Time Connection Layer

The Collaboration Service

Document Storage

Handling Cursor Positions

The Problem

The Solution

Offline Editing

How It Works

Scaling Challenges

1. Hot Documents

2. Connection Management

3. Global Latency

OT vs CRDT: Two Approaches

How CRDTs Work

What Developers Can Learn

1. Optimistic Updates

2. Event Sourcing

3. Conflict Resolution Strategies

4. Connection Management

Building Your Own Collaborative Editor

Existing Libraries

Roll Your Own

Key Takeaways

Subscribe via RSS Feed

Share this:

Related Posts

More from System design