WhatsApp served 900 million users with just 50 engineers. Read that again. Most companies would throw thousands of developers at that problem. WhatsApp did it with a team that could fit in a small office.
This is not luck. It is the result of deliberate technical choices that most developers overlook. Erlang instead of Java. FreeBSD instead of Linux. A focus on simplicity over features.
This post breaks down exactly how they did it. No fluff. Just the architecture decisions that made massive scale possible with minimal resources.
WhatsApp By The Numbers
Before diving into the architecture, here is what WhatsApp handles today:
| Metric | Value |
|---|---|
| Daily active users | 2+ billion |
| Messages per day | 100+ billion |
| Engineers (at acquisition) | ~50 |
| Servers (at acquisition) | ~550 |
| Users per engineer | 18 million |
| Connections per server | 2+ million |
That last number is key. Two million concurrent connections on a single server. Most systems struggle with 10,000. How did they pull this off?
The Technology Stack
WhatsApp’s stack is unconventional. They picked tools that most developers have never used, and it paid off massively.
| Component | Technology | Why |
|---|---|---|
| Language | Erlang/OTP | Built for telecom-scale concurrency |
| VM | BEAM | Lightweight processes, not OS threads |
| Database | Mnesia + MySQL | In-memory speed + persistent storage |
| OS | FreeBSD | Superior networking performance |
| Protocol | Modified XMPP | Binary encoding for mobile efficiency |
| Encryption | Signal Protocol | End-to-end security |
Let us break down each choice.
Erlang: The Secret Weapon
Most developers have never written Erlang. It is not trendy. It does not have a massive ecosystem. But it was designed for exactly what WhatsApp needed: systems that never go down.
Erlang was created by Ericsson in the 1980s for telephone switches. Those systems needed 99.999% uptime (about 5 minutes of downtime per year). The language was built from the ground up for:
- Massive concurrency
- Fault tolerance
- Hot code updates
- Distributed computing
The Actor Model
Erlang uses the actor model for concurrency. Instead of threads sharing memory (and fighting over locks), you have isolated processes that communicate through messages.
Here is what a simple Erlang process looks like:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-module(connection_handler).
-export([start/1, loop/1]).
start(UserId) ->
spawn(?MODULE, loop, [UserId]).
loop(UserId) ->
receive
{send_message, To, Content} ->
% Forward message to recipient's process
To ! {incoming_message, UserId, Content},
loop(UserId);
{incoming_message, From, Content} ->
% Deliver to connected client
deliver_to_client(From, Content),
loop(UserId);
disconnect ->
ok % Process terminates
end.
Each WhatsApp user connection gets its own process like this. The process:
- Lives in memory while the user is connected
- Handles all incoming and outgoing messages for that user
- Maintains the message queue for offline delivery
- Dies when the user disconnects
Why Erlang Processes Beat OS Threads
The magic is in how lightweight these processes are:
| Feature | OS Thread | Erlang Process |
|---|---|---|
| Memory | ~1 MB | ~2 KB |
| Creation time | Milliseconds | Microseconds |
| Context switch | Expensive | Cheap |
| Max per machine | Thousands | Millions |
| Crash isolation | Takes down app | Process restarts |
A single WhatsApp server can run 2+ million Erlang processes. Try that with Java threads.
Supervisors: Let It Crash
Erlang has a philosophy that sounds crazy until you understand it: “let it crash.”
Instead of writing defensive code with try-catch blocks everywhere, Erlang processes are monitored by supervisors. When a process crashes, the supervisor restarts it with a clean state.
graph TB
subgraph supervision[Supervision Tree]
S[fa:fa-eye Supervisor]
W1[fa:fa-user Worker: User A]
W2[fa:fa-user Worker: User B]
W3[fa:fa-user Worker: User C]
end
S --> W1
S --> W2
S --> W3
W2 -.->|Crashes| S
S -.->|Restarts| W2
If User B’s process crashes due to a bug, the supervisor restarts it. Users A and C never notice. The crashed user might need to reconnect, but the system keeps running.
This is how WhatsApp achieves near-zero downtime. Failures are isolated and automatically recovered.
Hot Code Swapping
Here is something most languages cannot do: update running code without stopping the application.
Erlang’s BEAM virtual machine supports loading new code while the old code is still running. Active processes continue with the old code until they reach a checkpoint, then switch to the new version.
WhatsApp could deploy bug fixes and new features without disconnecting a single user. For a global messaging app, this is invaluable.
Architecture Overview
Here is how the pieces fit together:
graph TB
subgraph clients[Client Devices]
C1[fa:fa-mobile Phone A]
C2[fa:fa-mobile Phone B]
C3[fa:fa-mobile Phone C]
end
subgraph edge[Edge Layer]
LB[fa:fa-sitemap Load Balancer]
end
subgraph servers[Erlang Server Cluster]
S1[fa:fa-server Server 1]
S2[fa:fa-server Server 2]
S3[fa:fa-server Server N]
end
subgraph data[Data Layer]
MN[fa:fa-database Mnesia Cluster]
MY[fa:fa-database MySQL Shards]
RK[fa:fa-database RocksDB]
end
subgraph media[Media Layer]
CDN[fa:fa-cloud CDN]
OBJ[fa:fa-hdd Object Storage]
end
C1 --> LB
C2 --> LB
C3 --> LB
LB --> S1
LB --> S2
LB --> S3
S1 <--> MN
S2 <--> MN
S3 <--> MN
MN --> MY
MN --> RK
S1 --> CDN
CDN --> OBJ
Each server runs thousands of Erlang processes, one per connected user. The Mnesia cluster stores session data and routing information. MySQL handles persistent user data. The CDN serves media files.
How Messages Flow
Let us trace what happens when you send a message:
sequenceDiagram
participant A as Phone A
participant PA as Process A
participant MN as Mnesia
participant PB as Process B
participant B as Phone B
A->>PA: Send message to B
PA->>PA: Queue message
PA->>A: First tick
PA->>MN: Lookup B's process
MN-->>PA: Process B location
alt B is online
PA->>PB: Forward message
PB->>B: Deliver message
B->>PB: ACK received
PB->>PA: Delivery confirmed
PA->>A: Second tick
PA->>PA: Delete from queue
else B is offline
PA->>PA: Keep in queue
Note over PA: Wait for B to connect
B->>PB: B comes online
PA->>PB: Forward queued message
PB->>B: Deliver message
end
Key points:
- First tick appears when the server receives your message
- Second tick appears when the recipient’s device acknowledges delivery
- Blue ticks are sent by the recipient’s app when the message is read
- Messages are deleted from servers after delivery
This is why WhatsApp does not store your chat history on their servers. The source of truth is your device. The server is just a relay.
Mnesia: The In-Memory Database
WhatsApp uses Mnesia, an Erlang-native database, for data that needs to be fast:
- Session information (which server handles which user)
- Routing tables (where to find a user’s process)
- Presence data (online/offline status)
- Group membership
Mnesia is not like PostgreSQL or MySQL. It is tightly integrated with Erlang:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
% Define a table schema
-record(user_session, {
user_id,
server_node,
process_id,
connected_at
}).
% Lookup is a simple pattern match
find_user_process(UserId) ->
mnesia:transaction(fun() ->
case mnesia:read(user_session, UserId) of
[Session] -> {ok, Session#user_session.process_id};
[] -> {error, not_found}
end
end).
Why Mnesia Works for WhatsApp
| Feature | Benefit |
|---|---|
| RAM-first storage | Sub-millisecond lookups |
| Native Erlang types | No ORM overhead |
| Distributed replication | Data survives node failures |
| Transactions | Consistent reads and writes |
For persistent data like user profiles and message history, WhatsApp uses MySQL shards. Each shard holds data for a subset of users, allowing horizontal scaling.
FreeBSD Over Linux
This choice surprises most developers. Linux dominates the server world. Why pick FreeBSD?
The WhatsApp team found FreeBSD’s networking stack handled their workload better:
| Aspect | Why FreeBSD Won |
|---|---|
| kqueue | More efficient than epoll for their use case |
| TCP tuning | Fine-grained kernel parameters |
| Stability | Fewer surprises under extreme load |
| Connection limits | Pushed past 2 million per server |
The lesson here is not “use FreeBSD.” It is “test your assumptions.” The WhatsApp team benchmarked both operating systems under realistic load and picked what worked best for their specific use case.
The Modified XMPP Protocol
WhatsApp started with XMPP (Extensible Messaging and Presence Protocol), an open standard for chat. But standard XMPP has problems for mobile:
- XML is verbose (wastes bandwidth)
- Parsing XML burns CPU and battery
- The handshake is slow on flaky connections
WhatsApp’s modifications:
- Binary encoding instead of XML (smaller messages, faster parsing)
- Compressed payloads for text messages
- Streamlined handshake for faster reconnection
- Optimized for intermittent connectivity (mobile networks drop constantly)
The result: faster connections, less data usage, longer battery life.
End-to-End Encryption
In 2016, WhatsApp rolled out end-to-end encryption for all messages using the Signal Protocol. This was a massive undertaking for an app with over a billion users.
How Signal Protocol Works
sequenceDiagram
participant A as Alice
participant S as WhatsApp Server
participant B as Bob
Note over A,B: Initial Key Exchange
A->>S: Alice's public keys
B->>S: Bob's public keys
A->>S: Request Bob's keys
S-->>A: Bob's public keys
Note over A,B: Sending a Message
A->>A: Generate message key
A->>A: Encrypt message
A->>S: Encrypted message
S->>B: Encrypted message
B->>B: Decrypt with key
Note over A,B: Double Ratchet
A->>A: Derive new keys
B->>B: Derive new keys
The Double Ratchet algorithm means each message gets a unique key. Even if an attacker gets one key, they cannot decrypt past or future messages.
Key properties:
- Forward secrecy: Compromised keys do not reveal old messages
- Future secrecy: Compromised keys do not reveal future messages
- Deniability: No cryptographic proof of who sent a message
WhatsApp servers only see encrypted blobs. They cannot read your messages.
Media Handling
Text messages are tiny. Images and videos are not. WhatsApp handles media differently:
sequenceDiagram
participant A as Sender
participant S as WhatsApp Server
participant CDN as CDN
participant B as Recipient
A->>A: Encrypt media file
A->>CDN: Upload encrypted file
CDN-->>A: File URL
A->>S: Send message with URL + key
S->>B: Forward message
B->>CDN: Download encrypted file
B->>B: Decrypt with key
- Media is encrypted on your device with a random key
- The encrypted blob is uploaded to a CDN
- Only the URL and decryption key go through WhatsApp’s servers
- Recipients download from the CDN and decrypt locally
This keeps the message servers lightweight. They only handle small text payloads, not gigabytes of videos.
Multi-Device Support
For years, WhatsApp only worked on one phone. Your messages lived on that device. Adding multi-device support required rethinking the architecture.
The challenge: end-to-end encryption means the server cannot relay decrypted messages between devices.
The solution: each device has its own encryption keys. When you send a message to someone with multiple devices, your app encrypts it separately for each device.
graph LR
subgraph sender[Sender]
A[fa:fa-mobile Alice Phone]
end
subgraph recipient[Bob's Devices]
B1[fa:fa-mobile Phone]
B2[fa:fa-tablet Tablet]
B3[fa:fa-desktop Web]
end
A -->|Encrypted for Phone| B1
A -->|Encrypted for Tablet| B2
A -->|Encrypted for Web| B3
This is more complex and uses more bandwidth, but maintains end-to-end encryption.
Lessons for Developers
WhatsApp’s success offers concrete takeaways you can apply today:
1. Pick the Right Tool, Not the Popular One
WhatsApp chose Erlang when everyone else used Java or Python. They chose FreeBSD when everyone used Linux. These unconventional choices matched their specific requirements.
Do not pick technologies because they are popular. Pick them because they solve your actual problems.
2. Embrace Simplicity
WhatsApp’s philosophy: do one thing extremely well. They resisted adding features that would complicate the architecture. The core product (messaging) stayed simple and fast.
Every feature you add is complexity you must maintain. Be ruthless about what you include.
3. Design for Failure
The “let it crash” philosophy is powerful. Instead of trying to handle every possible error, design systems that recover automatically. Use supervision trees, health checks, and automatic restarts.
Systems will fail. The question is whether they recover gracefully.
4. Stateless Where Possible
WhatsApp servers do not store message history. Once delivered, messages are deleted. This keeps the infrastructure simple and reduces storage costs dramatically.
Ask yourself: what data do you actually need to keep? Often the answer is less than you think.
5. Measure Before Optimizing
The WhatsApp team benchmarked FreeBSD vs Linux under realistic load. They did not guess. They tested.
Before optimizing anything, measure. Your intuition about bottlenecks is often wrong.
6. Vertical Scaling Still Works
Before adding more servers, WhatsApp pushed each server to handle 2+ million connections. Vertical scaling (making each server do more) is often simpler than horizontal scaling (adding more servers).
Add complexity only when you have exhausted simpler options.
7. The Right Abstraction Matters
Erlang’s process model matches WhatsApp’s problem domain perfectly. One process per user. Message passing between processes. This made the code straightforward.
When your abstractions match your problem, the code almost writes itself.
8. Security Cannot Be Bolted On
WhatsApp implemented end-to-end encryption at scale. This required rethinking message delivery, key management, and multi-device support. It was hard.
If security matters for your application, design for it from the start.
What You Can Apply Today
You probably will not rewrite your app in Erlang. But you can apply these principles:
-
Consider the actor model: Languages like Elixir (Erlang with nicer syntax), Akka (for Java/Scala), and even Go channels give you similar patterns.
-
Question your defaults: Why are you using that database? That operating system? That framework? Have you tested alternatives?
-
Simplify aggressively: What features can you remove? What data can you stop storing? What complexity is not earning its keep?
-
Automate recovery: Can your system restart failed components automatically? What happens when a database connection drops?
-
Benchmark your stack: Run load tests. Find your actual bottlenecks. They are probably not where you expect.
Further Reading
- Slack System Design - Different approach to messaging at scale
- Shopify System Design - How they handle massive traffic spikes
- Role of Queues in System Design - Deep dive into message queues
- WebSockets Explained - The protocol behind real-time messaging
WhatsApp proved that a small team with the right technology choices can outperform armies of engineers with conventional stacks. The lesson is not to copy their exact stack. It is to think critically about your tools, embrace simplicity, and design for the problems you actually have.
Building a messaging system or preparing for system design interviews? Check out our other system design posts for more architecture deep dives.