Change Data Capture (CDC)
Keeping your cache in sync with your database
What is CDC?
Change Data Capture (CDC) is a way to track changes in your database and automatically propagate them to other systems - like your cache, search index, or analytics warehouse. Instead of those systems constantly asking "did anything change?", CDC tells them "hey, this just changed!" shortly after it happens.
Think of it like: Your database sending push notifications whenever data changes, so your cache always knows what to update.
The Problem: Stale Cache
Without CDC
Your cache gets out of sync with the database:
User updates email in database
↓
Cache still has old email
↓
App shows wrong data!
Solution? Manually invalidate the cache entry or wait for it to expire. Both are slow and error-prone.
With CDC
Changes automatically sync to your cache:
User updates email in database
↓
CDC picks up the change from the transaction log
↓
Cache updates automatically!
Result: The cache catches up with the database within moments of each write. No manual invalidation needed.
How CDC Works
sequenceDiagram
participant App
participant Database
participant CDC
participant Cache
App->>Database: UPDATE user email
Database->>Database: Write to transaction log
Database-->>App: Success
Note over CDC: Continuously reads<br/>transaction log
Database->>CDC: New change detected
CDC->>Cache: Update user email
Cache-->>CDC: Updated
Note over Cache: Cache now has<br/>fresh data
App->>Cache: GET user
Cache-->>App: Returns updated email
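The flow in the diagram can be sketched in a few lines of Python. This is a toy model, not how any particular CDC tool is implemented: the transaction log is a plain list, the cache is a dict, and `LogEntry` and `LogTailer` are hypothetical names invented for the example. The point it illustrates is that the reader keeps an offset into the log and applies entries in commit order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogEntry:
    """One committed change, as it might appear in a transaction log."""
    op: str               # "insert", "update", or "delete"
    key: str              # primary key of the changed row
    row: Optional[dict]   # new row contents; None for deletes


class LogTailer:
    """Reads the log from a saved offset and applies each entry to a cache."""

    def __init__(self, log: list, cache: dict) -> None:
        self.log = log
        self.cache = cache
        self.offset = 0   # position of the next unread entry

    def poll(self) -> int:
        """Apply all unread entries in commit order; return how many were applied."""
        applied = 0
        while self.offset < len(self.log):
            entry = self.log[self.offset]
            if entry.op == "delete":
                self.cache.pop(entry.key, None)
            else:
                self.cache[entry.key] = entry.row
            self.offset += 1
            applied += 1
        return applied


log = []      # stands in for the database's transaction log
cache = {}    # stands in for Redis or Memcached
tailer = LogTailer(log, cache)

# The app writes to the database; the database appends to its log.
log.append(LogEntry("insert", "user:1", {"email": "old@example.com"}))
log.append(LogEntry("update", "user:1", {"email": "new@example.com"}))

tailer.poll()
print(cache["user:1"]["email"])  # new@example.com
```

Because the offset only moves forward after an entry is applied, a reader that crashes and restarts from its saved offset can resume without skipping changes, which is roughly why real tools persist their log position.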
Key Benefits
Real-Time Updates
Changes propagate in near real time. Your cache, search indexes, and other systems typically update within moments of the data changing, without waiting for a polling interval or a cache expiry.
Low Database Impact
Log-based CDC reads from the transaction log, not your actual tables. Apart from an optional initial snapshot, it adds no extra queries against your tables - it just quietly follows the log.
No Code Changes
Your app doesn't need to know CDC exists. It writes to the database normally, and CDC handles the rest behind the scenes.
Common CDC Methods
Transaction Log
Reads the database's built-in transaction log (WAL, binlog, etc.). Most efficient - minimal impact on database performance, because the database writes that log anyway.
Best for: Production systems
Triggers
Database triggers fire on INSERT/UPDATE/DELETE and write changes to a separate table. Simple but adds overhead to every write.
Use when: You can't access the transaction log
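A minimal sketch of the trigger approach, using SQLite through Python's built-in `sqlite3` module as a stand-in for a production database. The `users` and `user_changes` tables and the trigger names are hypothetical; trigger syntax differs between databases, so treat this as an illustration of the idea rather than something to paste into PostgreSQL or MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);

-- Hypothetical change table that the triggers write to.
CREATE TABLE user_changes (
    change_id INTEGER PRIMARY KEY AUTOINCREMENT,
    op        TEXT NOT NULL,
    user_id   INTEGER NOT NULL,
    email     TEXT
);

CREATE TRIGGER users_ai AFTER INSERT ON users BEGIN
    INSERT INTO user_changes (op, user_id, email)
    VALUES ('insert', NEW.id, NEW.email);
END;

CREATE TRIGGER users_au AFTER UPDATE ON users BEGIN
    INSERT INTO user_changes (op, user_id, email)
    VALUES ('update', NEW.id, NEW.email);
END;

CREATE TRIGGER users_ad AFTER DELETE ON users BEGIN
    INSERT INTO user_changes (op, user_id, email)
    VALUES ('delete', OLD.id, NULL);
END;
""")

# Ordinary application writes; the app never touches user_changes itself.
conn.execute("INSERT INTO users (id, email) VALUES (1, 'old@example.com')")
conn.execute("UPDATE users SET email = 'new@example.com' WHERE id = 1")
conn.execute("DELETE FROM users WHERE id = 1")

changes = conn.execute(
    "SELECT op, user_id, email FROM user_changes ORDER BY change_id"
).fetchall()
print(changes)
# [('insert', 1, 'old@example.com'), ('update', 1, 'new@example.com'), ('delete', 1, None)]
```

Each application write now performs a second write inside the same transaction, which is the overhead mentioned above. A separate process would read `user_changes` in `change_id` order and push each row to the cache.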
Timestamp Polling
Periodically query for rows with updated_at > last_check. Easy to set up, but it is not real-time, adds database load, and cannot see deleted rows or intermediate updates between polls.
Avoid if: You need real-time data
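Timestamp polling can be sketched the same way, again with SQLite standing in for the real database. The `poll_changes` helper and the integer `updated_at` values are assumptions made to keep the example deterministic; a real table would usually store an actual timestamp. The example also shows a known blind spot: a deleted row leaves nothing behind for the query to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, updated_at INTEGER)"
)


def poll_changes(conn, last_check):
    """Return rows modified after last_check, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, email, updated_at FROM users "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_check,),
    ).fetchall()
    new_mark = rows[-1][2] if rows else last_check
    return rows, new_mark


conn.execute("INSERT INTO users VALUES (1, 'a@example.com', 100)")
conn.execute("INSERT INTO users VALUES (2, 'b@example.com', 105)")

rows, mark = poll_changes(conn, 0)   # first poll sees both rows; mark is now 105

conn.execute(
    "UPDATE users SET email = 'a2@example.com', updated_at = 110 WHERE id = 1"
)
conn.execute("DELETE FROM users WHERE id = 2")   # leaves no row to detect

rows2, mark2 = poll_changes(conn, mark)
print(rows2)  # [(1, 'a2@example.com', 110)]
```

The second poll reports the update to user 1 but says nothing about user 2, so a cache fed this way would keep serving the deleted user until the entry expires. Soft deletes (a `deleted_at` column) are a common workaround.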
Common Use Cases
- Cache Sync: Keep Redis or Memcached in sync with your database. User updates profile? Cache updates instantly.
- Search Indexing: Auto-update Elasticsearch when products change. No manual reindexing needed.
- Data Warehouse: Stream changes to your analytics database (Snowflake, BigQuery) for near real-time reporting.
- Microservices: Keep data synced across services. Orders service updates inventory? Warehouse service knows immediately.
- Event Streaming: Feed changes into Kafka for event-driven architectures and real-time processing.
CDC in Action
graph TD
A[PostgreSQL Database] --> B[CDC Tool<br/>Debezium/Maxwell/etc]
B --> C[Redis Cache]
B --> D[Elasticsearch]
B --> E[Data Warehouse]
B --> F[Kafka Stream]
G[Your Application] --> A
G --> C
style A fill:#e0f2fe,stroke:#0369a1,stroke-width:2px
style B fill:#fef3c7,stroke:#f59e0b,stroke-width:3px
style C fill:#dcfce7,stroke:#16a34a,stroke-width:2px
style D fill:#dcfce7,stroke:#16a34a,stroke-width:2px
style E fill:#dcfce7,stroke:#16a34a,stroke-width:2px
style F fill:#dcfce7,stroke:#16a34a,stroke-width:2px
style G fill:#f3e8ff,stroke:#a855f7,stroke-width:2px
CDC sits between your database and the downstream systems, automatically keeping them in sync.
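The fan-out in the diagram, one change delivered to several destinations, might look roughly like this. `FanOut` and the sink functions are hypothetical names for illustration; in practice a durable stream such as Kafka usually sits between the CDC tool and the consumers so that each one can retry at its own pace.

```python
from typing import Callable, Dict, List, Tuple

Change = Dict[str, object]
Sink = Callable[[Change], None]


class FanOut:
    """Delivers each change to every registered sink, isolating failures."""

    def __init__(self) -> None:
        self.sinks: Dict[str, Sink] = {}
        self.failed: List[Tuple[str, Change]] = []   # (sink name, change) to retry

    def register(self, name: str, sink: Sink) -> None:
        self.sinks[name] = sink

    def publish(self, change: Change) -> None:
        for name, sink in self.sinks.items():
            try:
                sink(change)
            except Exception:
                # One broken sink must not block delivery to the others.
                self.failed.append((name, change))


cache: Dict[object, Change] = {}     # stands in for Redis
search_docs: List[Change] = []       # stands in for Elasticsearch


def broken_warehouse(change: Change) -> None:
    raise ConnectionError("warehouse unreachable")


fan_out = FanOut()
fan_out.register("cache", lambda c: cache.__setitem__(c["id"], c))
fan_out.register("search", search_docs.append)
fan_out.register("warehouse", broken_warehouse)

fan_out.publish({"id": 1, "email": "new@example.com"})
print(len(cache), len(search_docs), len(fan_out.failed))  # 1 1 1
```

Recording failures per sink instead of aborting keeps the cache and search index current even while the warehouse is down; the failed deliveries still have to be retried later, or that sink falls out of sync.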
Popular CDC Tools
Debezium
Open-source, works with MySQL, PostgreSQL, MongoDB, SQL Server, and others. Typically streams changes to Kafka. Widely adopted.
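To give a feel for consuming such a stream: Debezium change events carry an `op` code (`c` for create, `u` for update, `d` for delete, `r` for a snapshot read) along with `before` and `after` images of the row. The sketch below assumes the event arrives as a bare JSON envelope with no schema wrapper around it; the `apply_change` helper and the `user:<id>` key format are made up for the example, and a dict stands in for the cache.

```python
import json


def apply_change(cache: dict, raw_event: str) -> None:
    """Apply one Debezium-style change event to a dict standing in for the cache."""
    event = json.loads(raw_event)
    op = event["op"]
    if op in ("c", "u", "r"):        # create, update, snapshot read
        row = event["after"]
        cache[f"user:{row['id']}"] = row
    elif op == "d":                  # delete: only the "before" image is set
        row = event["before"]
        cache.pop(f"user:{row['id']}", None)
    # Any other op codes are ignored in this sketch.


cache = {}
update = json.dumps({
    "op": "u",
    "before": {"id": 1, "email": "old@example.com"},
    "after": {"id": 1, "email": "new@example.com"},
})
apply_change(cache, update)
print(cache)  # {'user:1': {'id': 1, 'email': 'new@example.com'}}
```

Writing the full `after` image rather than patching individual fields makes the handler idempotent, which matters because change streams are commonly delivered at least once and the same event may arrive twice.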
Maxwell's Daemon
Simple CDC for MySQL. Reads the binlog and outputs changes as JSON. Great for getting started quickly.
AWS DMS / Fivetran
Managed CDC services. Less setup and maintenance, but you pay for the service. Good when you'd rather not run the pipeline yourself.
When to Use CDC
Use CDC When
- You need real-time data sync
- Multiple systems need the same data
- Your cache keeps getting stale
- Building event-driven architecture
- Database handles lots of writes
- Manual sync is too error-prone
Skip CDC When
- Data rarely changes
- A few minutes of staleness is fine
- Only one system uses the data
- Simple app with no cache
- Can't access database logs
- Team is too small to run the extra infrastructure