WhatsApp served 900 million users with just 50 engineers. Read that again. Most companies would throw thousands of developers at that problem. WhatsApp did it with a team that could fit in a small office.

This is not luck. It is the result of deliberate technical choices that most developers overlook. Erlang instead of Java. FreeBSD instead of Linux. A focus on simplicity over features.

This post breaks down exactly how they did it. No fluff. Just the architecture decisions that made massive scale possible with minimal resources.

WhatsApp By The Numbers

Before diving into the architecture, here is what WhatsApp handles today:

Metric Value
Daily active users 2+ billion
Messages per day 100+ billion
Engineers (at acquisition) ~50
Servers (at acquisition) ~550
Users per engineer 18 million
Connections per server 2+ million

That last number is key. Two million concurrent connections on a single server. Most systems struggle with 10,000. How did they pull this off?

The Technology Stack

WhatsApp’s stack is unconventional. They picked tools that most developers have never used, and it paid off massively.

Component Technology Why
Language Erlang/OTP Built for telecom-scale concurrency
VM BEAM Lightweight processes, not OS threads
Database Mnesia + MySQL In-memory speed + persistent storage
OS FreeBSD Superior networking performance
Protocol Modified XMPP Binary encoding for mobile efficiency
Encryption Signal Protocol End-to-end security

Let us break down each choice.

Erlang: The Secret Weapon

Most developers have never written Erlang. It is not trendy. It does not have a massive ecosystem. But it was designed for exactly what WhatsApp needed: systems that never go down.

Erlang was created by Ericsson in the 1980s for telephone switches. Those systems needed 99.999% uptime (about 5 minutes of downtime per year). The language was built from the ground up for:

  • Massive concurrency
  • Fault tolerance
  • Hot code updates
  • Distributed computing

The Actor Model

Erlang uses the actor model for concurrency. Instead of threads sharing memory (and fighting over locks), you have isolated processes that communicate through messages.

Here is what a simple Erlang process looks like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-module(connection_handler).
-export([start/1, loop/1]).

start(UserId) ->
    spawn(?MODULE, loop, [UserId]).

loop(UserId) ->
    receive
        {send_message, To, Content} ->
            % Forward message to recipient's process
            To ! {incoming_message, UserId, Content},
            loop(UserId);
        {incoming_message, From, Content} ->
            % Deliver to connected client
            deliver_to_client(From, Content),
            loop(UserId);
        disconnect ->
            ok  % Process terminates
    end.

Each WhatsApp user connection gets its own process like this. The process:

  • Lives in memory while the user is connected
  • Handles all incoming and outgoing messages for that user
  • Maintains the message queue for offline delivery
  • Dies when the user disconnects

Why Erlang Processes Beat OS Threads

The magic is in how lightweight these processes are:

Feature OS Thread Erlang Process
Memory ~1 MB ~2 KB
Creation time Milliseconds Microseconds
Context switch Expensive Cheap
Max per machine Thousands Millions
Crash isolation Takes down app Process restarts

A single WhatsApp server can run 2+ million Erlang processes. Try that with Java threads.

Supervisors: Let It Crash

Erlang has a philosophy that sounds crazy until you understand it: “let it crash.”

Instead of writing defensive code with try-catch blocks everywhere, Erlang processes are monitored by supervisors. When a process crashes, the supervisor restarts it with a clean state.


graph TB
    subgraph supervision[Supervision Tree]
        S[fa:fa-eye Supervisor]
        W1[fa:fa-user Worker: User A]
        W2[fa:fa-user Worker: User B]
        W3[fa:fa-user Worker: User C]
    end
    
    S --> W1
    S --> W2
    S --> W3
    
    W2 -.->|Crashes| S
    S -.->|Restarts| W2

If User B’s process crashes due to a bug, the supervisor restarts it. Users A and C never notice. The crashed user might need to reconnect, but the system keeps running.

This is how WhatsApp achieves near-zero downtime. Failures are isolated and automatically recovered.

Hot Code Swapping

Here is something most languages cannot do: update running code without stopping the application.

Erlang’s BEAM virtual machine supports loading new code while the old code is still running. Active processes continue with the old code until they reach a checkpoint, then switch to the new version.

WhatsApp could deploy bug fixes and new features without disconnecting a single user. For a global messaging app, this is invaluable.

Architecture Overview

Here is how the pieces fit together:


graph TB
    subgraph clients[Client Devices]
        C1[fa:fa-mobile Phone A]
        C2[fa:fa-mobile Phone B]
        C3[fa:fa-mobile Phone C]
    end
    
    subgraph edge[Edge Layer]
        LB[fa:fa-sitemap Load Balancer]
    end
    
    subgraph servers[Erlang Server Cluster]
        S1[fa:fa-server Server 1]
        S2[fa:fa-server Server 2]
        S3[fa:fa-server Server N]
    end
    
    subgraph data[Data Layer]
        MN[fa:fa-database Mnesia Cluster]
        MY[fa:fa-database MySQL Shards]
        RK[fa:fa-database RocksDB]
    end
    
    subgraph media[Media Layer]
        CDN[fa:fa-cloud CDN]
        OBJ[fa:fa-hdd Object Storage]
    end
    
    C1 --> LB
    C2 --> LB
    C3 --> LB
    
    LB --> S1
    LB --> S2
    LB --> S3
    
    S1 <--> MN
    S2 <--> MN
    S3 <--> MN
    
    MN --> MY
    MN --> RK
    
    S1 --> CDN
    CDN --> OBJ

Each server runs thousands of Erlang processes, one per connected user. The Mnesia cluster stores session data and routing information. MySQL handles persistent user data. The CDN serves media files.

How Messages Flow

Let us trace what happens when you send a message:


sequenceDiagram
    participant A as Phone A
    participant PA as Process A
    participant MN as Mnesia
    participant PB as Process B
    participant B as Phone B
    
    A->>PA: Send message to B
    PA->>PA: Queue message
    PA->>A: First tick
    PA->>MN: Lookup B's process
    MN-->>PA: Process B location
    
    alt B is online
        PA->>PB: Forward message
        PB->>B: Deliver message
        B->>PB: ACK received
        PB->>PA: Delivery confirmed
        PA->>A: Second tick
        PA->>PA: Delete from queue
    else B is offline
        PA->>PA: Keep in queue
        Note over PA: Wait for B to connect
        B->>PB: B comes online
        PA->>PB: Forward queued message
        PB->>B: Deliver message
    end

Key points:

  1. First tick appears when the server receives your message
  2. Second tick appears when the recipient’s device acknowledges delivery
  3. Blue ticks are sent by the recipient’s app when the message is read
  4. Messages are deleted from servers after delivery

This is why WhatsApp does not store your chat history on their servers. The source of truth is your device. The server is just a relay.

Mnesia: The In-Memory Database

WhatsApp uses Mnesia, an Erlang-native database, for data that needs to be fast:

  • Session information (which server handles which user)
  • Routing tables (where to find a user’s process)
  • Presence data (online/offline status)
  • Group membership

Mnesia is not like PostgreSQL or MySQL. It is tightly integrated with Erlang:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
% Define a table schema
-record(user_session, {
    user_id,
    server_node,
    process_id,
    connected_at
}).

% Lookup is a simple pattern match
find_user_process(UserId) ->
    mnesia:transaction(fun() ->
        case mnesia:read(user_session, UserId) of
            [Session] -> {ok, Session#user_session.process_id};
            [] -> {error, not_found}
        end
    end).

Why Mnesia Works for WhatsApp

Feature Benefit
RAM-first storage Sub-millisecond lookups
Native Erlang types No ORM overhead
Distributed replication Data survives node failures
Transactions Consistent reads and writes

For persistent data like user profiles and message history, WhatsApp uses MySQL shards. Each shard holds data for a subset of users, allowing horizontal scaling.

FreeBSD Over Linux

This choice surprises most developers. Linux dominates the server world. Why pick FreeBSD?

The WhatsApp team found FreeBSD’s networking stack handled their workload better:

Aspect Why FreeBSD Won
kqueue More efficient than epoll for their use case
TCP tuning Fine-grained kernel parameters
Stability Fewer surprises under extreme load
Connection limits Pushed past 2 million per server

The lesson here is not “use FreeBSD.” It is “test your assumptions.” The WhatsApp team benchmarked both operating systems under realistic load and picked what worked best for their specific use case.

The Modified XMPP Protocol

WhatsApp started with XMPP (Extensible Messaging and Presence Protocol), an open standard for chat. But standard XMPP has problems for mobile:

  • XML is verbose (wastes bandwidth)
  • Parsing XML burns CPU and battery
  • The handshake is slow on flaky connections

WhatsApp’s modifications:

  1. Binary encoding instead of XML (smaller messages, faster parsing)
  2. Compressed payloads for text messages
  3. Streamlined handshake for faster reconnection
  4. Optimized for intermittent connectivity (mobile networks drop constantly)

The result: faster connections, less data usage, longer battery life.

End-to-End Encryption

In 2016, WhatsApp rolled out end-to-end encryption for all messages using the Signal Protocol. This was a massive undertaking for an app with over a billion users.

How Signal Protocol Works


sequenceDiagram
    participant A as Alice
    participant S as WhatsApp Server
    participant B as Bob
    
    Note over A,B: Initial Key Exchange
    A->>S: Alice's public keys
    B->>S: Bob's public keys
    A->>S: Request Bob's keys
    S-->>A: Bob's public keys
    
    Note over A,B: Sending a Message
    A->>A: Generate message key
    A->>A: Encrypt message
    A->>S: Encrypted message
    S->>B: Encrypted message
    B->>B: Decrypt with key
    
    Note over A,B: Double Ratchet
    A->>A: Derive new keys
    B->>B: Derive new keys

The Double Ratchet algorithm means each message gets a unique key. Even if an attacker gets one key, they cannot decrypt past or future messages.

Key properties:

  • Forward secrecy: Compromised keys do not reveal old messages
  • Future secrecy: Compromised keys do not reveal future messages
  • Deniability: No cryptographic proof of who sent a message

WhatsApp servers only see encrypted blobs. They cannot read your messages.

Media Handling

Text messages are tiny. Images and videos are not. WhatsApp handles media differently:


sequenceDiagram
    participant A as Sender
    participant S as WhatsApp Server
    participant CDN as CDN
    participant B as Recipient
    
    A->>A: Encrypt media file
    A->>CDN: Upload encrypted file
    CDN-->>A: File URL
    A->>S: Send message with URL + key
    S->>B: Forward message
    B->>CDN: Download encrypted file
    B->>B: Decrypt with key
  1. Media is encrypted on your device with a random key
  2. The encrypted blob is uploaded to a CDN
  3. Only the URL and decryption key go through WhatsApp’s servers
  4. Recipients download from the CDN and decrypt locally

This keeps the message servers lightweight. They only handle small text payloads, not gigabytes of videos.

Multi-Device Support

For years, WhatsApp only worked on one phone. Your messages lived on that device. Adding multi-device support required rethinking the architecture.

The challenge: end-to-end encryption means the server cannot relay decrypted messages between devices.

The solution: each device has its own encryption keys. When you send a message to someone with multiple devices, your app encrypts it separately for each device.


graph LR
    subgraph sender[Sender]
        A[fa:fa-mobile Alice Phone]
    end
    
    subgraph recipient[Bob's Devices]
        B1[fa:fa-mobile Phone]
        B2[fa:fa-tablet Tablet]
        B3[fa:fa-desktop Web]
    end
    
    A -->|Encrypted for Phone| B1
    A -->|Encrypted for Tablet| B2
    A -->|Encrypted for Web| B3

This is more complex and uses more bandwidth, but maintains end-to-end encryption.

Lessons for Developers

WhatsApp’s success offers concrete takeaways you can apply today:

WhatsApp chose Erlang when everyone else used Java or Python. They chose FreeBSD when everyone used Linux. These unconventional choices matched their specific requirements.

Do not pick technologies because they are popular. Pick them because they solve your actual problems.

2. Embrace Simplicity

WhatsApp’s philosophy: do one thing extremely well. They resisted adding features that would complicate the architecture. The core product (messaging) stayed simple and fast.

Every feature you add is complexity you must maintain. Be ruthless about what you include.

3. Design for Failure

The “let it crash” philosophy is powerful. Instead of trying to handle every possible error, design systems that recover automatically. Use supervision trees, health checks, and automatic restarts.

Systems will fail. The question is whether they recover gracefully.

4. Stateless Where Possible

WhatsApp servers do not store message history. Once delivered, messages are deleted. This keeps the infrastructure simple and reduces storage costs dramatically.

Ask yourself: what data do you actually need to keep? Often the answer is less than you think.

5. Measure Before Optimizing

The WhatsApp team benchmarked FreeBSD vs Linux under realistic load. They did not guess. They tested.

Before optimizing anything, measure. Your intuition about bottlenecks is often wrong.

6. Vertical Scaling Still Works

Before adding more servers, WhatsApp pushed each server to handle 2+ million connections. Vertical scaling (making each server do more) is often simpler than horizontal scaling (adding more servers).

Add complexity only when you have exhausted simpler options.

7. The Right Abstraction Matters

Erlang’s process model matches WhatsApp’s problem domain perfectly. One process per user. Message passing between processes. This made the code straightforward.

When your abstractions match your problem, the code almost writes itself.

8. Security Cannot Be Bolted On

WhatsApp implemented end-to-end encryption at scale. This required rethinking message delivery, key management, and multi-device support. It was hard.

If security matters for your application, design for it from the start.

What You Can Apply Today

You probably will not rewrite your app in Erlang. But you can apply these principles:

  1. Consider the actor model: Languages like Elixir (Erlang with nicer syntax), Akka (for Java/Scala), and even Go channels give you similar patterns.

  2. Question your defaults: Why are you using that database? That operating system? That framework? Have you tested alternatives?

  3. Simplify aggressively: What features can you remove? What data can you stop storing? What complexity is not earning its keep?

  4. Automate recovery: Can your system restart failed components automatically? What happens when a database connection drops?

  5. Benchmark your stack: Run load tests. Find your actual bottlenecks. They are probably not where you expect.

Further Reading


WhatsApp proved that a small team with the right technology choices can outperform armies of engineers with conventional stacks. The lesson is not to copy their exact stack. It is to think critically about your tools, embrace simplicity, and design for the problems you actually have.

Building a messaging system or preparing for system design interviews? Check out our other system design posts for more architecture deep dives.