What is a Snowflake ID?

A Snowflake ID is a 64-bit unique identifier used in distributed systems. It combines a 41-bit timestamp, 10-bit machine ID, and 12-bit sequence number. This structure allows multiple servers to generate unique IDs independently without coordination, while keeping IDs time-sortable. Twitter created this approach in 2010.

What is the Discord Snowflake ID length in digits?

Discord Snowflake IDs are 64-bit integers, which means they can be up to 19 digits long when represented as decimal numbers. For example, a Discord user ID like 123456789012345678 is 18 digits. The exact length varies based on when the ID was generated since older IDs have smaller timestamps.

How do I extract the timestamp from a Twitter Snowflake ID?

To extract the timestamp from a Twitter Snowflake ID, right-shift the ID by 22 bits to get the milliseconds since Twitter's epoch (1288834974657). Then add Twitter's epoch to get the Unix timestamp. In JavaScript: new Date((snowflakeId >> 22n) + 1288834974657n). Discord uses epoch 1420070400000 (January 1, 2015).

Why use Snowflake IDs instead of UUIDs?

Snowflake IDs are smaller (64-bit vs 128-bit), time-sortable for chronological queries, and more efficient as database primary keys. UUIDs are random, causing poor B-tree index performance and page splits. Snowflake IDs maintain locality of reference, making database writes faster.

How do I generate Snowflake IDs in Java?

For Java, use libraries like callicoder/java-snowflake or phxql/snowflake-id. These handle timestamp generation, machine ID assignment, and sequence number management. Example: SnowflakeIdGenerator generator = new SnowflakeIdGenerator(machineId); long id = generator.nextId();

How Snowflake IDs Work

Every tweet you post, every Discord message you send, every Instagram photo you upload gets a unique ID. At Twitter’s scale, that’s over 400 million tweets per day. Each needs an ID that’s guaranteed unique across thousands of servers. They can’t just ask a central database “what’s the next number?” because that database would melt under the load.

So in 2010, Twitter’s engineering team built Snowflake. It’s a system that lets any server generate unique IDs on its own, with no coordination, no locking, no single point of failure. And the IDs are sortable by time.

Let’s break down how this works and how you can use it.

The Problem With Traditional ID Generation

Before we get into Snowflake, let’s understand why generating IDs at scale is hard.

Option 1: Auto-increment in your database

Simple and works great for small apps. But at scale:

Your database becomes a bottleneck
Every insert requires a round trip to get the next ID
If that database goes down, you can’t create anything
Sharding becomes a nightmare

Option 2: Random UUIDs

Generate a 128-bit random number. Problem solved, right? Not quite:

UUIDs are big (36 characters as strings, 16 bytes binary)
They’re not sortable by time
Random distribution kills B-tree index performance
Your database does more page splits and uses more storage

Option 3: Snowflake IDs

The best of both worlds. Unique like UUIDs, but smaller and time-sortable.

Snowflake ID Structure: The 64-Bit Breakdown

A Snowflake ID packs three pieces of information into 64 bits:

Sign

1 bit

Timestamp

41 bits

Machine ID

10 bits

Sequence

12 bits

Here’s what each part does:

Component	Bits	Purpose	Capacity
Sign bit	1	Always 0 (keeps ID positive)	-
Timestamp	41	Milliseconds since custom epoch	~69 years
Machine ID	10	Which server generated this	1024 machines
Sequence	12	Counter for same-millisecond IDs	4096 per ms

The math works out nicely:

1024 machines x 4096 IDs per millisecond = 4+ million IDs per millisecond across your cluster
That’s over 4 billion IDs per second if you need them

How Snowflake ID Generation Works

When a server needs to generate an ID, it follows this process:


sequenceDiagram
    participant App as Application
    participant Gen as ID Generator
    participant Clock as System Clock
    
    App->>Gen: Generate new ID
    Gen->>Clock: Get current time
    Clock-->>Gen: 1705234567890 ms
    
    Gen->>Gen: Compare with last timestamp
    
    alt Same millisecond as last ID
        Gen->>Gen: Increment sequence (0, 1, 2...)
        Note over Gen: If sequence hits 4096,
wait for next millisecond
    else New millisecond
        Gen->>Gen: Reset sequence to 0
    end
    
    Gen->>Gen: Combine bits:
timestamp | machine_id | sequence
    Gen-->>App: Return 7155791234567890123

The key insight is that no coordination is needed between servers. Each server:

Knows its own machine ID (assigned at startup)
Can read its own clock
Tracks its own sequence number

As long as machine IDs are unique across your cluster, the generated IDs are guaranteed unique.

Twitter vs Discord: Different Epochs, Same Idea

Both Twitter and Discord use Snowflake IDs, but they chose different starting points (epochs):

Platform	Epoch	Epoch Date
Twitter	1288834974657	Nov 4, 2010
Discord	1420070400000	Jan 1, 2015

Why does the epoch matter?

The 41-bit timestamp stores milliseconds since the epoch. Starting from a later date means:

Smaller initial IDs (fits in fewer digits)
More years before the timestamp bits run out

Twitter’s epoch is their launch date. Discord picked the start of 2015 when they were building their system.

Discord Snowflake ID length in digits:

Discord Snowflake IDs typically range from 17 to 19 digits when displayed as decimal numbers. For example:

An early Discord ID: 81384788765712384 (17 digits)
A recent Discord ID: 1234567890123456789 (19 digits)

The exact length depends on when the ID was created. Older accounts have shorter IDs.

Extracting the Timestamp: Decoding Snowflake IDs

One of the most useful features of Snowflake IDs is that you can extract the creation time just by looking at the ID. No database lookup required.

Snowflake ID Decoder — Decode Discord, Twitter & Instagram IDs instantly Try it →

For Discord IDs (JavaScript):

function getDiscordTimestamp(snowflakeId) {
    // Discord epoch: January 1, 2015
    const DISCORD_EPOCH = 1420070400000n;
    
    // Convert to BigInt if needed (IDs can exceed JS number precision)
    const id = BigInt(snowflakeId);
    
    // Right-shift by 22 bits to get timestamp
    const timestamp = (id >> 22n) + DISCORD_EPOCH;
    
    return new Date(Number(timestamp));
}

// Example: Get when a Discord user account was created
const userId = "123456789012345678";
console.log(getDiscordTimestamp(userId));
// Output: 2017-03-15T12:34:56.789Z

For Twitter IDs (Python):

def get_twitter_timestamp(snowflake_id):
    # Twitter epoch: November 4, 2010
    TWITTER_EPOCH = 1288834974657
    
    # Right-shift by 22 bits to get timestamp offset
    timestamp_ms = (snowflake_id >> 22) + TWITTER_EPOCH
    
    from datetime import datetime
    return datetime.fromtimestamp(timestamp_ms / 1000)

# Example: Get when a tweet was posted
tweet_id = 1541815603606036480
print(get_twitter_timestamp(tweet_id))
# Output: 2022-06-28 15:23:45.123000

This is incredibly useful for:

Sorting content by creation time without a separate timestamp column
Debugging (when was this record created?)
Time-based sharding and partitioning
Rate limiting by creation time

Quick tip: You can also use Snowflake ID Decoder tool to decode any Discord, Twitter, or Instagram ID instantly without writing code.

Snowflake ID Generation in Java

For production Java applications, use a battle-tested library. Here’s a simple implementation to understand the concepts:

public class SnowflakeIdGenerator {
    
    private static final long EPOCH = 1609459200000L; // Jan 1, 2021
    private static final long MACHINE_ID_BITS = 10L;
    private static final long SEQUENCE_BITS = 12L;
    
    private static final long MAX_MACHINE_ID = ~(-1L << MACHINE_ID_BITS);
    private static final long MAX_SEQUENCE = ~(-1L << SEQUENCE_BITS);
    
    private static final long MACHINE_ID_SHIFT = SEQUENCE_BITS;
    private static final long TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_ID_BITS;
    
    private final long machineId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;
    
    public SnowflakeIdGenerator(long machineId) {
        if (machineId < 0 || machineId > MAX_MACHINE_ID) {
            throw new IllegalArgumentException(
                "Machine ID must be between 0 and " + MAX_MACHINE_ID);
        }
        this.machineId = machineId;
    }
    
    public synchronized long nextId() {
        long currentTimestamp = System.currentTimeMillis();
        
        if (currentTimestamp < lastTimestamp) {
            throw new RuntimeException("Clock moved backwards!");
        }
        
        if (currentTimestamp == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted, wait for next millisecond
                currentTimestamp = waitNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L;
        }
        
        lastTimestamp = currentTimestamp;
        
        return ((currentTimestamp - EPOCH) << TIMESTAMP_SHIFT)
                | (machineId << MACHINE_ID_SHIFT)
                | sequence;
    }
    
    private long waitNextMillis(long lastTimestamp) {
        long timestamp = System.currentTimeMillis();
        while (timestamp <= lastTimestamp) {
            timestamp = System.currentTimeMillis();
        }
        return timestamp;
    }
}

Using it:

// Each server gets a unique machine ID (0-1023)
SnowflakeIdGenerator generator = new SnowflakeIdGenerator(42);

// Generate IDs
long orderId = generator.nextId();
long userId = generator.nextId();

System.out.println("Order ID: " + orderId);  // 7155791234567890123
System.out.println("User ID: " + userId);    // 7155791234567890124

Production-ready libraries for Java:

callicoder/java-snowflake - Clean implementation with good documentation
phxql/snowflake-id - Customizable bit allocation
HMWCS/java-snowflake-id-generator - Optimized with CAS for high throughput

Snowflake ID vs UUID: When to Use Which

Here’s a direct comparison:

Feature	Snowflake ID	UUID v4
Size	64 bits (8 bytes)	128 bits (16 bytes)
As string	19 digits max	36 characters
Time-sortable	Yes	No
Index performance	Excellent	Poor
Coordination needed	Machine ID only	None
Collision risk	Zero (with proper setup)	Near zero
Extract creation time	Yes	No
JavaScript safe	Need BigInt	String representation

Choose Snowflake IDs when:

You need IDs sorted by creation time
Database index performance matters
You’re building a distributed system with known nodes
Storage size is a concern
You want to extract timestamps from IDs

Choose UUIDs when:

You need true decentralization (no machine ID coordination)
Working with systems that expect UUIDs
Simpler setup is more important than performance
You’re using UUID-native databases like PostgreSQL

Need UUIDs? Use UUID Generator to generate UUID v4 (random) or UUID v7 (timestamp-based, time-sortable like Snowflake IDs but in UUID format).

The Clock Skew Problem

Snowflake IDs depend on accurate system clocks. What happens if a server’s clock goes backward?

The problem:

Server time: 1705234567890
Generate ID with timestamp: 1705234567890

Clock drifts backward (NTP sync, manual change, etc.)

Server time: 1705234567880
Generate ID with timestamp: 1705234567880  <- This could duplicate an old ID!

How to handle it:

Refuse to generate - Most implementations throw an error if the clock goes backward. Your monitoring should catch this.
Wait it out - If the clock jumped back by a small amount (< 5ms), just wait until the clock catches up.
Use logical clocks - Track the last timestamp and never go backward, even if the system clock does.

Here’s a defensive implementation:

public synchronized long nextId() {
    long currentTimestamp = System.currentTimeMillis();
    
    if (currentTimestamp < lastTimestamp) {
        long offset = lastTimestamp - currentTimestamp;
        if (offset < 5) {
            // Small drift, wait it out
            try {
                Thread.sleep(offset);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            currentTimestamp = System.currentTimeMillis();
        } else {
            // Large drift, refuse to generate
            throw new ClockMovedBackException(
                "Clock moved backward by " + offset + "ms");
        }
    }
    
    // Continue with normal generation...
}

Best practices:

Use NTP with slew mode (gradual adjustment) instead of step mode
Monitor clock drift across your servers
Set up alerts for clock skew > 1 second
Consider using a distributed time service like Spanner’s TrueTime

Machine ID Assignment in the Cloud

In containerized environments, assigning unique machine IDs gets tricky. Containers come and go. Auto-scaling adds new instances.

Option 1: Use a coordination service

Store machine ID assignments in ZooKeeper, etcd, or Consul:

public class ZookeeperMachineId {
    private static final String MACHINE_ID_PATH = "/snowflake/machine-ids";
    
    public int acquireMachineId(CuratorFramework client) throws Exception {
        // Create ephemeral sequential node
        String path = client.create()
            .withMode(CreateMode.EPHEMERAL_SEQUENTIAL)
            .forPath(MACHINE_ID_PATH + "/machine-");
        
        // Extract sequence number from path
        String sequenceStr = path.substring(path.lastIndexOf("-") + 1);
        int machineId = Integer.parseInt(sequenceStr) % 1024;
        
        return machineId;
    }
}

When the container dies, the ephemeral node disappears, freeing the ID.

Option 2: Hash the container identity

Derive machine ID from something unique about the container:

public int getMachineIdFromEnvironment() {
    // Use pod name in Kubernetes
    String podName = System.getenv("HOSTNAME");
    if (podName != null) {
        return Math.abs(podName.hashCode()) % 1024;
    }
    
    // Fallback: use IP address
    try {
        InetAddress addr = InetAddress.getLocalHost();
        byte[] bytes = addr.getAddress();
        return ((bytes[2] & 0xFF) << 2) | (bytes[3] & 0x03);
    } catch (UnknownHostException e) {
        throw new RuntimeException("Cannot determine machine ID", e);
    }
}

Option 3: Use a central ID service

Run a dedicated service that generates Snowflake IDs and exposes them via API. The service handles all the complexity; your apps just call it.


graph TB
    subgraph "Application Servers"
        A1[App Server 1]
        A2[App Server 2]
        A3[App Server 3]
    end
    
    subgraph "ID Service Cluster"
        ID1[ID Service
Machine 1]
        ID2[ID Service
Machine 2]
        ID3[ID Service
Machine 3]
    end
    
    A1 --> ID1
    A2 --> ID2
    A3 --> ID3
    
    style ID1 fill:#dcfce7,stroke:#16a34a,stroke-width:2px
    style ID2 fill:#dcfce7,stroke:#16a34a,stroke-width:2px
    style ID3 fill:#dcfce7,stroke:#16a34a,stroke-width:2px

This adds a network hop but centralizes the machine ID management problem.

Real-World Implementations

Here’s how major companies use Snowflake-style IDs:

Twitter (Original Snowflake)

41 bits timestamp, 10 bits machine ID, 12 bits sequence
Open-sourced in 2010, later deprecated their implementation
Still used for tweet IDs

Discord

Same structure as Twitter but different epoch (2015)
Used for everything: users, servers, channels, messages
Exposes timestamp extraction in their API docs

Instagram

Modified structure: 41 bits timestamp, 13 bits shard ID, 10 bits sequence
Shard ID helps with database partitioning
Implemented as PostgreSQL stored procedures

Sony (Sonyflake)

Optimized for fewer machines, more sequence bits
39 bits timestamp, 8 bits sequence, 16 bits machine ID
Useful when you have more machines than sequence demand

Database Performance Impact

Using Snowflake IDs as primary keys gives you significant performance benefits.

Why random UUIDs hurt performance:


graph TB
    subgraph "UUID v4 Insert Pattern"
        U1["Insert UUID abc..."] --> P1["Page 42"]
        U2["Insert UUID xyz..."] --> P2["Page 891"]
        U3["Insert UUID def..."] --> P3["Page 7"]
        U4["Insert UUID mno..."] --> P4["Page 523"]
    end
    
    subgraph "Snowflake Insert Pattern"
        S1["Insert ID 7001"] --> Q1["Page 100"]
        S2["Insert ID 7002"] --> Q1
        S3["Insert ID 7003"] --> Q1
        S4["Insert ID 7004"] --> Q1
    end
    
    style P1 fill:#fee2e2,stroke:#dc2626
    style P2 fill:#fee2e2,stroke:#dc2626
    style P3 fill:#fee2e2,stroke:#dc2626
    style P4 fill:#fee2e2,stroke:#dc2626
    style Q1 fill:#dcfce7,stroke:#16a34a

Random UUIDs scatter inserts across the entire index, causing:

More disk I/O (different pages for each insert)
Poor cache utilization
Frequent page splits
Index fragmentation

Snowflake IDs insert sequentially (roughly), keeping recent data together:

Writes hit the same pages repeatedly
Better cache hit rates
Fewer page splits
Natural time-based partitioning

Benchmark results (your mileage may vary):

Metric	UUID v4	Snowflake ID
Insert throughput	~15,000/sec	~45,000/sec
Index size	100%	~65%
Point lookup	~2ms	~1ms
Range query (recent)	~15ms	~3ms

Key Takeaways

1. Snowflake IDs solve the distributed ID problem elegantly. No coordination needed between servers, no single point of failure, and IDs are guaranteed unique.

2. The 64-bit structure is carefully designed. 41 bits for timestamp (~69 years), 10 bits for machine ID (1024 nodes), 12 bits for sequence (4096 IDs per millisecond per node).

3. Time-sortability is a major advantage. Sort by ID to sort by creation time. Extract timestamps without database lookups. Enable efficient range queries on recent data.

4. Discord Snowflake IDs are 17-19 digits. The exact length depends on when the ID was created. Use BigInt in JavaScript to handle them safely.

5. Clock synchronization matters. Use NTP, monitor for drift, and handle backward clock jumps gracefully.

6. Machine ID assignment needs thought in containers. Use coordination services, hash container identity, or run a central ID service.

7. Database performance improves significantly. Sequential inserts, better cache utilization, smaller indexes compared to UUIDs.

Snowflake IDs aren’t the answer for every system. But when you’re building something that needs to scale, handle billions of records, and maintain performance, they’re worth understanding.

Want to learn more about distributed systems? Check out How Stripe Prevents Double Payment for idempotency patterns, Long Polling Explained for real-time communication, and Kubernetes Architecture Explained for container orchestration at scale. Try our Snowflake ID Decoder and UUID Generator tools.

References: Twitter Snowflake (GitHub Archive), Discord Developer Docs, Instagram Engineering Blog