What are Snowflake IDs?
Snowflake IDs are 64-bit unique identifiers designed for distributed systems. Instead of asking a central database "what's the next ID?", each server generates its own IDs independently - and they stay unique as long as every server has a distinct machine ID and its clock never runs backward. Twitter introduced this approach in 2010 to handle its scale.
Example ID: 1541815603606036480 - looks like a random number, but it contains a timestamp, machine ID, and sequence number packed together.
The Problem: IDs at Scale
Auto-Increment Doesn't Scale
Traditional database IDs:
Multiple servers need IDs
↓
All ask the same database
↓
Database becomes bottleneck
↓
Single point of failure!
Problem: ID generation can't scale horizontally. If the database goes down, no server can create new IDs.
Snowflake IDs Scale
Distributed ID generation:
Multiple servers need IDs
↓
Each server generates its own
↓
No coordination needed
↓
IDs still unique!
Result: No bottleneck. No single point of failure. Capacity grows with every server you add, up to the 1024 machine IDs the format allows.
Snowflake ID Structure
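A Snowflake ID is 64 bits. The top bit is unused (always 0, so the value stays positive as a signed integer) and the remaining 63 bits are split into three parts: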
- Timestamp (41 bits): Milliseconds since a custom epoch. Gives you ~69 years of IDs.
- Machine ID (10 bits): Which server generated this. Supports up to 1024 machines.
- Sequence (12 bits): Counter for IDs in the same millisecond. Up to 4096 IDs per ms per machine.
💡 Math: 1024 machines × 4096 IDs/ms = 4,194,304 (about 4.2 million) IDs per millisecond across your cluster.
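The bit layout and the capacity math above can be sketched with plain shifts. This is a minimal illustration, not Twitter's actual code: the names `pack_snowflake`, `TIMESTAMP_BITS`, `MACHINE_BITS`, and `SEQUENCE_BITS` are hypothetical, and the timestamp is assumed to already be relative to the custom epoch.

```python
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1    # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1     # 4095


def pack_snowflake(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Pack the three fields into one integer.

    timestamp_ms is milliseconds since the custom epoch, not Unix time.
    """
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= machine_id <= MAX_MACHINE_ID:
        raise ValueError("machine_id does not fit in 10 bits")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence does not fit in 12 bits")
    # Timestamp goes in the high bits, then machine ID, then sequence.
    return (
        (timestamp_ms << (MACHINE_BITS + SEQUENCE_BITS))
        | (machine_id << SEQUENCE_BITS)
        | sequence
    )


# Capacity figures derived from the field widths.
ids_per_ms_cluster = (MAX_MACHINE_ID + 1) * (MAX_SEQUENCE + 1)
ms_per_year = 1000 * 60 * 60 * 24 * 365.25
years_of_timestamps = (1 << TIMESTAMP_BITS) / ms_per_year

print(ids_per_ms_cluster)              # 4194304
print(round(years_of_timestamps, 1))   # 69.7
```

The range checks matter: a field that overflows its width would silently bleed into the neighboring field and could collide with another server's IDs.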
How IDs Are Generated
sequenceDiagram
participant App
participant Generator as ID Generator
participant Clock as System Clock
App->>Generator: Need a new ID
Generator->>Clock: What time is it?
Clock-->>Generator: 1702684800000 ms
Note over Generator: Same millisecond as last ID?
alt Same millisecond
Generator->>Generator: Increment sequence (0→1→2...)
else New millisecond
Generator->>Generator: Reset sequence to 0
end
Note over Generator: Combine: timestamp + machine_id + sequence
Generator-->>App: 1541815603606036480
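The flow in the diagram can be sketched as a small class. Treat this as an illustration under assumptions rather than a reference implementation: `SnowflakeGenerator` and its parameters are hypothetical names, the default epoch is Twitter's published epoch used only as an example, and the handling of a full sequence (wait for the next millisecond) and of a backward clock (raise an error) are common choices that the diagram itself does not show.

```python
import threading
import time


class SnowflakeGenerator:
    """Sketch of a single-process Snowflake-style ID generator."""

    def __init__(self, machine_id, epoch_ms=1288834974657, clock=None):
        if not 0 <= machine_id < 1024:
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self.epoch_ms = epoch_ms  # custom epoch; the default is only an example
        # The clock is injectable so the logic can be tested with fake time.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()  # one ID at a time per generator
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Reusing an old timestamp could duplicate earlier IDs.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the 12-bit sequence.
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # All 4096 values used: spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                # New millisecond: start the sequence over.
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - self.epoch_ms) << 22)
                | (self.machine_id << 12)
                | self._sequence
            )
```

A caller would create one generator per process, for example `gen = SnowflakeGenerator(machine_id=7)`, and call `gen.next_id()` whenever it needs an ID.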
Why Snowflake IDs?
Time-Sortable
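Because the timestamp occupies the most significant bits, IDs sort in roughly chronological order: an ID from a later millisecond is always bigger than one from an earlier millisecond. IDs created in the same millisecond on different machines sort by machine ID rather than exact creation time, and the ordering is only as accurate as the servers' clocks. Great for "show me recent items" queries.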
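A quick check of the ordering claim, as a sketch: `make_id` is a hypothetical helper that packs the fields in the 41/10/12 layout described earlier, with made-up timestamps.

```python
def make_id(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Pack fields with the timestamp in the highest bits (hypothetical helper)."""
    return (timestamp_ms << 22) | (machine_id << 12) | sequence


# A later millisecond wins even against the largest machine ID and sequence.
older = make_id(1000, machine_id=1023, sequence=4095)
newer = make_id(1001, machine_id=0, sequence=0)
assert older < newer

# Within one millisecond, order follows machine ID, then sequence,
# so it says nothing about which ID was really created first.
from_machine_2 = make_id(1000, machine_id=2, sequence=0)
from_machine_1 = make_id(1000, machine_id=1, sequence=5)
assert sorted([from_machine_2, from_machine_1]) == [from_machine_1, from_machine_2]
```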
No Coordination
Servers don't need to talk to each other or a central service. Each one generates IDs on its own. No network calls, no waiting.
Compact
64 bits fit in a single long integer. That's half the size of a UUID (128 bits), and it works as a primary key in any database with a 64-bit integer column type (for example BIGINT).
Embedded Timestamp
You can extract when an ID was created just by looking at it, as long as you know the generator's epoch. For many use cases that removes the need for a separate created_at field.
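As a sketch of that extraction, the function below splits an ID back into its fields. `decode_snowflake` is a hypothetical name, and the code assumes the 41/10/12 layout plus Twitter's epoch (1288834974657 ms after the Unix epoch); IDs from other systems need that system's own epoch and layout.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Twitter's epoch. Other systems choose their own.
TWITTER_EPOCH_MS = 1288834974657
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_snowflake(snowflake_id: int, epoch_ms: int = TWITTER_EPOCH_MS) -> dict:
    """Split a Snowflake ID into creation time, machine ID, and sequence."""
    # The top bits hold milliseconds since the custom epoch.
    unix_ms = (snowflake_id >> 22) + epoch_ms
    return {
        "created_at": UNIX_EPOCH + timedelta(milliseconds=unix_ms),
        "unix_ms": unix_ms,
        "machine_id": (snowflake_id >> 12) & 0x3FF,  # middle 10 bits
        "sequence": snowflake_id & 0xFFF,            # low 12 bits
    }


parts = decode_snowflake(1541815603606036480)
print(parts["created_at"].isoformat())
print(parts["machine_id"], parts["sequence"])
```

Under those assumptions, the example ID from the introduction decodes to 28 June 2022, 16:07:40.105 UTC, machine ID 378, sequence 0.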
Distributed Generation
graph TB
subgraph Datacenter
S1[Server 1 - Machine ID: 001] --> ID1[ID: timestamp-001-seq]
S2[Server 2 - Machine ID: 002] --> ID2[ID: timestamp-002-seq]
S3[Server 3 - Machine ID: 003] --> ID3[ID: timestamp-003-seq]
end
ID1 --> DB[(Database)]
ID2 --> DB
ID3 --> DB
style S1 fill:#e0f2fe,stroke:#0369a1,stroke-width:2px
style S2 fill:#e0f2fe,stroke:#0369a1,stroke-width:2px
style S3 fill:#e0f2fe,stroke:#0369a1,stroke-width:2px
style DB fill:#fef3c7,stroke:#f59e0b,stroke-width:2px
Each server has a unique machine ID. Even if they generate IDs at the exact same millisecond, the machine ID makes them different.
Who Uses Snowflake IDs?
- Twitter: Created the original Snowflake. Tweet IDs created since the 2010 rollout are Snowflakes.
- Discord: Uses Snowflakes for message IDs, user IDs, server IDs - everything.
- Instagram: Modified version for photo IDs with sharding built in.
- Sony: Maintains Sonyflake, an open-source ID generator inspired by Snowflake.
Snowflake vs UUID
| | Snowflake ID | UUID |
|---|---|---|
| Size | 64 bits (8 bytes) | 128 bits (16 bytes) |
| Sortable | ✅ Yes, by time | ❌ Random order (v4) |
| Setup | Need to assign machine IDs | Zero config |
| Index perf | Great (sequential) | Poor (random, v4) |
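⚠️ Clock drift: Snowflakes depend on system clocks. If a server's clock goes backward (for example after a time-sync correction), the generator can reuse a timestamp it has already issued and produce duplicate IDs. Most implementations handle this by waiting until the clock catches up or by throwing an error.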
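One way to combine the two strategies, sketched under assumptions: wait out a small backward jump, and raise an error for a large one. `wait_for_clock`, `max_wait_ms`, and the 10 ms default are hypothetical choices for illustration, not values taken from any particular implementation.

```python
import time


def wait_for_clock(last_ms, max_wait_ms=10, clock=None, sleep=time.sleep):
    """Return a timestamp >= last_ms, sleeping briefly if the clock went backward.

    last_ms is the timestamp of the most recently issued ID.
    """
    clock = clock or (lambda: time.time_ns() // 1_000_000)
    now = clock()
    if now >= last_ms:
        return now  # clock is fine, nothing to do
    drift_ms = last_ms - now
    if drift_ms > max_wait_ms:
        # Too far behind to wait out: fail loudly instead of risking duplicates.
        raise RuntimeError(f"clock moved backwards by {drift_ms} ms")
    sleep(drift_ms / 1000)  # give the clock time to catch up
    now = clock()
    if now < last_ms:
        raise RuntimeError("clock is still behind after waiting")
    return now
```

A generator would call this with the timestamp of its last ID before building the next one. Waiting keeps the service available through small corrections, while the error path stops ID generation when the drift is too large to trust.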