What is a Snowflake ID?

A Snowflake ID is a 64-bit unique identifier used in distributed systems. It combines an unused sign bit (1 bit), a timestamp (41 bits), a machine/datacenter ID (10 bits), and a sequence number (12 bits). This structure allows multiple servers to generate unique IDs independently without coordination, while keeping IDs roughly time-ordered.

Why use Snowflake IDs instead of UUIDs?

Snowflake IDs are smaller (64 bits vs 128 bits for a UUID), time-sortable (they sort by creation time), and more efficient as database primary keys. Random UUIDs (version 4) scatter inserts across the index, which hurts index performance. Snowflake IDs maintain locality of reference and work better with B-tree indexes.

How do Snowflake IDs stay unique across servers?

Each server gets a unique machine ID (10 bits = 1024 possible IDs). Combined with a millisecond timestamp and a 12-bit sequence (4096 IDs per millisecond), each server can generate about 4 million (4,096,000) unique IDs per second without any coordination with other servers.

Can you extract the timestamp from a Snowflake ID?

Yes, the 41 bits after the sign bit contain milliseconds since a custom epoch. Right-shift the ID by 22 bits, add the epoch timestamp, and you get the creation time in Unix milliseconds. This is useful for time-based queries and debugging. Twitter and Discord IDs can both be decoded this way, each with its own epoch.
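The shift-and-add step can be sketched in a few lines of Python. This is a minimal illustration, assuming the ID was minted with Twitter's epoch (1288834974657 ms); the function and constant names are made up for this example and do not come from any particular library.

```python
from datetime import datetime, timezone

# Assumption: Twitter's custom epoch in Unix milliseconds.
# Other systems pick their own (Discord uses 1420070400000).
TWITTER_EPOCH_MS = 1288834974657
TIMESTAMP_SHIFT = 22  # 10 machine bits + 12 sequence bits sit below the timestamp


def snowflake_to_datetime(snowflake_id: int, epoch_ms: int = TWITTER_EPOCH_MS) -> datetime:
    """Return the UTC creation time embedded in a Snowflake ID."""
    unix_ms = (snowflake_id >> TIMESTAMP_SHIFT) + epoch_ms
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)


created = snowflake_to_datetime(1541815603606036480)
print(created.isoformat())
```

Decoding with the wrong epoch still returns a date, just the wrong one, so the epoch has to match the system that generated the ID.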

What happens when the Snowflake ID sequence overflows?

If a server generates more than 4096 IDs in the same millisecond (sequence exhausted), it waits until the next millisecond to continue. This is rare in practice - 4096 IDs per millisecond per server is extremely high throughput. The wait ensures uniqueness is never compromised.
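One way to picture the wait is a small spin loop that polls the clock until it moves past the exhausted millisecond. The helper names below (`current_millis`, `wait_for_next_millis`) are hypothetical, and a real generator might sleep briefly instead of spinning.

```python
import time


def current_millis() -> int:
    """Wall-clock time in Unix milliseconds."""
    return time.time_ns() // 1_000_000


def wait_for_next_millis(last_ms: int, clock=current_millis) -> int:
    """Poll the clock until it reports a millisecond later than last_ms."""
    now = clock()
    while now <= last_ms:
        now = clock()
    return now


# Usage: the sequence for `exhausted_ms` is used up, so block until the next tick.
exhausted_ms = current_millis()
next_ms = wait_for_next_millis(exhausted_ms)
print(next_ms - exhausted_ms)  # at least 1
```

The wait is bounded by one millisecond as long as the clock keeps moving forward.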

@Ajit5ingh

Snowflake IDs

Generating unique IDs in distributed systems

What are Snowflake IDs?

Snowflake IDs are 64-bit unique identifiers designed for distributed systems. Instead of asking a central database "what's the next ID?", each server can generate its own IDs independently - and they're guaranteed to be unique. Twitter created this approach in 2010 to handle their massive scale.

Example ID: 1541815603606036480 - looks like a random number, but it contains a timestamp, machine ID, and sequence number packed together.


The Problem: IDs at Scale

Auto-Increment Doesn't Scale

Traditional database IDs:

Multiple servers need IDs

↓

All ask the same database

↓

Database becomes bottleneck

↓

Single point of failure!

Problem: Can't scale horizontally. One DB goes down, no more IDs.

Snowflake IDs Scale

Distributed ID generation:

Multiple servers need IDs

↓

Each server generates its own

↓

No coordination needed

↓

IDs still unique!

Result: No bottleneck. No single point of failure. Scales out to as many servers as there are machine IDs (1024 with the standard layout).

Snowflake ID Structure

A Snowflake ID is 64 bits split into parts:

0 (unused sign bit): 1 bit

Timestamp: 41 bits

Machine: 10 bits

Sequence: 12 bits

Timestamp (41 bits): Milliseconds since a custom epoch. Gives you ~69 years of IDs.
Machine ID (10 bits): Which server generated this. Supports up to 1024 machines.
Sequence (12 bits): Counter for IDs in the same millisecond. Up to 4096 IDs per ms per machine.

💡 Math: 1024 machines × 4096 IDs/ms = about 4.2 million (4,194,304) IDs per millisecond across your cluster.
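The layout and the capacity figures above can be checked with a short sketch. The `pack` and `unpack` helpers and the constant names are illustrative only; the bit widths come from the structure described here.

```python
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1        # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1         # 4095
MACHINE_SHIFT = SEQUENCE_BITS                   # 12
TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS  # 22


def pack(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Combine the fields; timestamp_ms is relative to the custom epoch."""
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= machine_id <= MAX_MACHINE_ID:
        raise ValueError("machine_id does not fit in 10 bits")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence does not fit in 12 bits")
    return (timestamp_ms << TIMESTAMP_SHIFT) | (machine_id << MACHINE_SHIFT) | sequence


def unpack(snowflake_id: int) -> tuple[int, int, int]:
    """Split an ID back into (timestamp_ms, machine_id, sequence)."""
    return (
        snowflake_id >> TIMESTAMP_SHIFT,
        (snowflake_id >> MACHINE_SHIFT) & MAX_MACHINE_ID,
        snowflake_id & MAX_SEQUENCE,
    )


# Capacity implied by the bit widths.
years_of_ids = (1 << TIMESTAMP_BITS) / (1000 * 60 * 60 * 24 * 365.25)
ids_per_ms_cluster = (MAX_MACHINE_ID + 1) * (MAX_SEQUENCE + 1)
print(round(years_of_ids, 1), ids_per_ms_cluster)
```

Because the top bit is never set, even the largest packed value stays below 2^63 and fits a signed 64-bit integer column.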

How IDs Are Generated


sequenceDiagram
    participant App
    participant Generator as ID Generator
    participant Clock as System Clock
    
    App->>Generator: Need a new ID
    Generator->>Clock: What time is it?
    Clock-->>Generator: 1702684800000 ms
    
    Note over Generator: Same millisecond as last ID?
    
    alt Same millisecond
        Generator->>Generator: Increment sequence (0→1→2...)
    else New millisecond
        Generator->>Generator: Reset sequence to 0
    end
    
    Note over Generator: Combine: timestamp + machine_id + sequence
    Generator-->>App: 1541815603606036480
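
The flow in the diagram could look roughly like this in Python. This is a sketch under stated assumptions, not a reference implementation: the class name, the injectable `clock` parameter, and the choice of Twitter's epoch as the default are all illustrative, and the generator simply raises if the clock moves backward.

```python
import threading
import time

EPOCH_MS = 1288834974657  # assumption: Twitter's epoch; pick your own in practice


class SnowflakeGenerator:
    def __init__(self, machine_id: int, epoch_ms: int = EPOCH_MS, clock=None):
        if not 0 <= machine_id <= 1023:
            raise ValueError("machine_id must fit in 10 bits")
        self._machine_id = machine_id
        self._epoch_ms = epoch_ms
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()  # one ID at a time per generator
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backward; refusing to generate IDs")
            if now == self._last_ms:
                # Same millisecond: increment the 12-bit sequence.
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # Sequence exhausted: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                # New millisecond: reset the sequence.
                self._sequence = 0
            self._last_ms = now
            # Combine: timestamp + machine_id + sequence.
            return ((now - self._epoch_ms) << 22) | (self._machine_id << 12) | self._sequence


gen = SnowflakeGenerator(machine_id=1)
print(gen.next_id())
```

The lock matters because the sequence counter and last-seen timestamp are shared state; without it, two threads on the same machine could emit the same ID.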

Why Snowflake IDs?

Time-Sortable

Since the timestamp occupies the most significant bits, IDs naturally sort in roughly chronological order (exact to the millisecond across machines). Newer records have bigger IDs. Great for "show me recent items" queries.
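
A "recent items" query can then filter on the ID column alone by turning a cutoff time into the smallest ID that could have been created at that moment. The sketch below assumes Twitter's epoch; `min_id_at` is a hypothetical helper, and the table in the trailing comment is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH_MS = 1288834974657  # assumption: Twitter's epoch
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def min_id_at(moment: datetime, epoch_ms: int = EPOCH_MS) -> int:
    """Smallest Snowflake ID that could be created at `moment` (timezone-aware)."""
    unix_ms = (moment - UNIX_EPOCH) // timedelta(milliseconds=1)
    # Machine and sequence bits are left at zero, so this is a lower bound.
    return (unix_ms - epoch_ms) << 22


cutoff = datetime(2022, 1, 1, tzinfo=timezone.utc)
lower_bound = min_id_at(cutoff)
print(lower_bound)
# Hypothetical query: SELECT * FROM items WHERE id >= :lower_bound ORDER BY id DESC
```

Since the primary key index is already ordered by ID, such a range scan needs no separate index on a timestamp column.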

No Coordination

Servers don't need to talk to each other or a central service. Each one generates IDs on its own. No network calls, no waiting.

Compact

64 bits fits in a single long integer. Way smaller than UUIDs (128 bits) and works as a primary key in any database.

Embedded Timestamp

You can extract when an ID was created just by looking at it. No need to store a separate created_at field.

Distributed Generation


graph TB
    subgraph Datacenter
        S1[Server 1 - Machine ID: 001] --> ID1[ID: timestamp-001-seq]
        S2[Server 2 - Machine ID: 002] --> ID2[ID: timestamp-002-seq]
        S3[Server 3 - Machine ID: 003] --> ID3[ID: timestamp-003-seq]
    end
    
    ID1 --> DB[(Database)]
    ID2 --> DB
    ID3 --> DB
    
    style S1 fill:#e0f2fe,stroke:#0369a1,stroke-width:2px
    style S2 fill:#e0f2fe,stroke:#0369a1,stroke-width:2px
    style S3 fill:#e0f2fe,stroke:#0369a1,stroke-width:2px
    style DB fill:#fef3c7,stroke:#f59e0b,stroke-width:2px

Each server has a unique machine ID. Even if they generate IDs at the exact same millisecond, the machine ID makes them different.
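
A quick way to convince yourself: hold the timestamp and sequence fixed and vary only the machine ID. The values below are arbitrary, and `make_id` is an illustrative helper rather than part of any library.

```python
def make_id(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Pack the three fields using the 41/10/12 layout."""
    return (timestamp_ms << 22) | (machine_id << 12) | sequence


same_millisecond = 123_456_789  # arbitrary offset from the custom epoch
same_sequence = 0

# Every possible machine generating "at the same instant".
ids = {make_id(same_millisecond, machine_id, same_sequence) for machine_id in range(1024)}
print(len(ids))  # 1024 distinct IDs
```

The guarantee only holds if no two running servers share a machine ID, which is why assigning those IDs is the one piece of setup the scheme requires.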

Who Uses Snowflake IDs?

Twitter: Created the original Snowflake. Tweet IDs have been Snowflakes since late 2010.
Discord: Uses Snowflakes for message IDs, user IDs, server IDs - everything.
Instagram: Modified version for photo IDs with sharding built in.
Sony: Uses Snowflake-style IDs for PlayStation Network.

Snowflake vs UUID

Property	Snowflake ID	UUID (random, v4)
Size	64 bits (8 bytes)	128 bits (16 bytes)
Sortable	✅ Yes, by time	❌ Random order
Setup	Need to assign machine IDs	Zero config
Index performance	Great (sequential)	Poor (random)

⚠️ Clock drift: Snowflakes depend on system clocks. If a server's clock goes backward, you could get duplicate IDs. Most implementations handle this by waiting or throwing an error.
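
The wait-or-error policy might be sketched as follows: tolerate a small backward step by sleeping it off, and fail loudly on a large one. The 5 ms tolerance, the exception class, and the function names are all hypothetical choices for this example.

```python
import time

MAX_BACKWARD_MS = 5  # hypothetical tolerance; tune for your environment


class ClockMovedBackwardError(RuntimeError):
    """Raised when the clock is too far behind the last issued timestamp."""


def current_millis() -> int:
    return time.time_ns() // 1_000_000


def safe_timestamp(last_ms: int, clock=current_millis, sleep=time.sleep) -> int:
    """Return a timestamp that is never earlier than last_ms, or raise."""
    now = clock()
    if now >= last_ms:
        return now
    drift = last_ms - now
    if drift > MAX_BACKWARD_MS:
        raise ClockMovedBackwardError(f"clock is {drift} ms behind last issued ID")
    # Small drift: wait it out, then check once more.
    sleep(drift / 1000)
    now = clock()
    if now < last_ms:
        raise ClockMovedBackwardError("clock still behind after waiting")
    return now


print(safe_timestamp(last_ms=0))
```

Raising on large jumps trades availability for safety: the server stops issuing IDs until an operator or the clock itself fixes the problem, rather than risking duplicates.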