CDC and eventing
Outbox, change-data-capture, derived views, dual-write avoidance, and replay safety.
Change-data-capture is how you keep read models fresh without dual writes. The DB's log is already your most reliable event stream — use it.
Read this if your last attempt…
- You have a DB and a search index / cache / analytics warehouse that need to stay in sync
- You wrote "we'll dual-write to the DB and to Kafka" and moved on
- You don't know what Debezium is
- You confuse CDC with event sourcing
The concept
Change-data-capture (CDC) means tailing the database's transaction log (WAL in Postgres, binlog in MySQL) and emitting a stream of change events to downstream consumers. Every committed change becomes an event; nothing else does. This solves the dual-write problem (DB + Kafka in separate writes can drift).
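To make the drift concrete, here is a minimal sketch of a dual write failing half-way. The `db` dict, the `broker` list, and the `broker_up` flag are hypothetical stand-ins for a real database and a real Kafka producer, not anything from an actual client library.

```python
class BrokerDown(Exception):
    """Raised by the hypothetical broker when a publish fails."""


def dual_write(db: dict, broker: list, order_id: str, row: dict, broker_up: bool) -> None:
    # Write 1: the database commit succeeds.
    db[order_id] = row
    # Write 2: a separate operation that can fail independently of write 1.
    if not broker_up:
        raise BrokerDown("publish failed after the DB commit")
    broker.append({"key": order_id, "after": row})


db, broker = {}, []
dual_write(db, broker, "o-1", {"status": "placed"}, broker_up=True)
try:
    dual_write(db, broker, "o-2", {"status": "placed"}, broker_up=False)
except BrokerDown:
    pass  # the request fails, but the DB row for o-2 is already committed

# Rows that exist in the DB but never reached the stream: this is the drift.
missing = sorted(set(db) - {event["key"] for event in broker})
print(missing)  # ['o-2']
```

Swapping the order of the two writes does not help: the event would then exist without the row. Only a single atomic write (outbox) or reading the log after commit (CDC) removes the gap.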
The DB's transaction log is the source of truth for "what changed". Everything downstream builds on it.
Three correct ways to get events out of a DB, compared with the dual-write anti-pattern:
| Pattern | Correctness | Cost |
|---|---|---|
| Dual write | Broken (race condition) | Low until it fails |
| Outbox (DIY) | Correct | One extra table + a polling/tailing worker |
| CDC via Debezium | Correct | CDC infrastructure + Kafka |
| Event sourcing | Correct, different shape | High — rearchitects the write path entirely |
How interviewers grade this
- You never describe a "dual write to DB and Kafka". Ever.
- You name outbox or CDC as the mechanism.
- You distinguish CDC (DB first) from event sourcing (events first).
- Your downstream consumers (search, cache, warehouse) build from the same event stream.
- Consumers are idempotent; offsets enable replay after bugs.
Variants
Outbox pattern
DB-side: extra `outbox` table written in the same tx. Tailed by a worker.
No CDC product needed. The service owns publishing. Simple, robust, and a good fit for single-service event emission. The trade-off: the app is responsible for writing correct event payloads.
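A minimal sketch of the pattern, using an in-memory SQLite database so it runs anywhere. The table layout, the column names (`msg_key`, `status`), and the `publish` callback are assumptions made for illustration; a real worker would call a Kafka producer instead.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        topic TEXT NOT NULL,
        msg_key TEXT NOT NULL,
        payload TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending'
    );
""")


def place_order(order_id: str) -> None:
    # One transaction: the business row and the outbox row commit or roll back together.
    with conn:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,))
        payload = json.dumps({"type": "OrderPlaced", "order_id": order_id})
        conn.execute(
            "INSERT INTO outbox (topic, msg_key, payload) VALUES (?, ?, ?)",
            ("orders.change", order_id, payload),
        )


def relay_once(publish) -> int:
    # Worker pass: read pending rows in insertion order, publish, then mark sent.
    # A crash between publish and the UPDATE re-sends the row on the next pass,
    # so delivery is at-least-once and consumers must be idempotent.
    rows = conn.execute(
        "SELECT id, topic, msg_key, payload FROM outbox WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for row_id, topic, msg_key, payload in rows:
        publish(topic, msg_key, payload)
        with conn:
            conn.execute("UPDATE outbox SET status = 'sent' WHERE id = ?", (row_id,))
    return len(rows)


published = []
place_order("o-1")
place_order("o-2")
sent = relay_once(lambda topic, key, payload: published.append((topic, key, json.loads(payload))))
print(sent, [key for _, key, _ in published])  # 2 ['o-1', 'o-2']
```

Marking rows as sent only after a successful publish is the design choice that matters: it trades possible duplicates for never losing an event.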
Pros
- No new infrastructure
- App controls event schema
- Trivial to reason about
Cons
- Each service implements it separately
- Requires a tailer per service
- Polling has a latency floor set by the poll interval (e.g. ~1 s)
Choose this variant when
- Single service emitting events
- Existing Kafka or similar
- Modest number of event types
Debezium (log-tailing CDC)
Reads DB WAL; emits every committed change as a Kafka event. Schema-mirroring.
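The industrial-strength answer. Every DB change is delivered in order per key and, by default, at least once: duplicates are possible after a connector restart, which is why consumers must be idempotent. Downsides: infrastructure to run, and an event schema that mirrors the DB schema (sometimes too low-level).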
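For a sense of what "running Debezium" means in practice, here is a sketch of a connector registration payload built in Python. The property names follow the Debezium 2.x PostgreSQL connector as I understand it and may differ in other versions; the connector name, host, credentials, and database name are all hypothetical.

```python
import json

# Hypothetical values throughout; property names assume Debezium 2.x (PostgreSQL).
connector = {
    "name": "orders-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",           # Postgres built-in logical decoding plugin
        "database.hostname": "db.internal.example",
        "database.port": "5432",
        "database.user": "debezium",
        "database.password": "change-me",
        "database.dbname": "shop",
        "topic.prefix": "shop",              # namespace for the emitted topics
        "table.include.list": "public.orders",
        "snapshot.mode": "initial",          # snapshot existing rows, then tail the WAL
    },
}

# Body for a POST to the Kafka Connect REST API's /connectors endpoint.
request_body = json.dumps(connector, indent=2)
print(request_body)
```

With these settings the default topic name would be derived from prefix, schema, and table (`shop.public.orders`); renaming it to something like `orders.change` would need a routing transform on top.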
Pros
- Faithful to every DB change
- No app-side outbox work
- Works across many DBs (Postgres, MySQL, MongoDB)
Cons
- New infrastructure (Kafka Connect, ZK/KRaft)
- Events mirror tables; you might want business-level events
- Schema evolution across DB migrations requires care
Choose this variant when
- Many services emitting events
- Existing Kafka infra
- Warehouse / search index rebuild flows
Event sourcing
Events are the source of truth; DB state is a projection.
Heaviest commitment. Every mutation is an event persisted first; state is rebuilt from the event log. Powerful for audit and replay; expensive in developer cognitive load.
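"State is rebuilt from the event log" can be sketched as a fold over events. The event types and fields below are hypothetical examples, and a real system would add persistence, versioning, and snapshots.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Event:
    type: str
    data: dict


def apply(state: dict, event: Event) -> dict:
    # Pure transition function: returns new state, never mutates the old one.
    if event.type == "OrderPlaced":
        return {"status": "placed", "total": event.data["total"]}
    if event.type == "PriceChanged":
        return {**state, "total": event.data["total"]}
    if event.type == "OrderCancelled":
        return {**state, "status": "cancelled"}
    return state  # unknown event types are ignored


log = [
    Event("OrderPlaced", {"total": 100}),
    Event("PriceChanged", {"total": 90}),
    Event("OrderCancelled", {}),
]

current = reduce(apply, log, {})                 # current state = fold over the full log
as_of_second_event = reduce(apply, log[:2], {})  # "time travel" = fold over a prefix
print(current)             # {'status': 'cancelled', 'total': 90}
print(as_of_second_event)  # {'status': 'placed', 'total': 90}
```

The cognitive tax is visible even here: "what is the order's status?" is no longer a row lookup but the result of replaying history.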
Pros
- Full history is native
- Replay / time-travel debugging
- Natural audit log
Cons
- Schema evolution is a whole discipline
- Every developer on the team pays the cognitive tax
- Harder to reason about "current state"
Choose this variant when
- Core domain with regulatory history requirements
- When the audit trail is the product (finance, health)
Worked example
Scenario: an e-commerce platform. The `orders` table is the primary store. The search index, analytics warehouse, and cache all need to reflect changes.
Pipeline:
- 1. App writes to the Postgres `orders` table (one write, one transaction).
- 2. Debezium tails the Postgres WAL and publishes `orders.change` events to Kafka (partitioned by order_id).
- 3. Consumer groups:
  - Search-index consumer: upserts into Elasticsearch.
  - Warehouse consumer: appends to a ClickHouse fact table.
  - Cache consumer: invalidates or refreshes Redis entries by order_id.
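A sketch of what the search-index consumer's handler might look like. The event shape is a simplified version of Debezium's change envelope (`op`, `before`, `after`, `source.lsn`); the `index` dict stands in for Elasticsearch, and the LSN-based guard is one possible way to get idempotency, not the only one.

```python
def apply_change(index: dict, event: dict) -> bool:
    """Apply one simplified change event; return False if it was a stale duplicate."""
    op = event["op"]  # c=create, u=update, d=delete, r=snapshot read
    row = event["before"] if op == "d" else event["after"]
    order_id, lsn = row["order_id"], event["source"]["lsn"]

    # Idempotency guard: skip anything at or below the log position already applied.
    applied = index.get(order_id)
    if applied is not None and applied["lsn"] >= lsn:
        return False

    # Deletes leave a tombstone so the guard still works if old events are redelivered.
    index[order_id] = {"lsn": lsn, "doc": None if op == "d" else row}
    return True


events = [
    {"op": "c", "before": None,
     "after": {"order_id": "o-1", "status": "placed"}, "source": {"lsn": 101}},
    {"op": "u", "before": {"order_id": "o-1", "status": "placed"},
     "after": {"order_id": "o-1", "status": "shipped"}, "source": {"lsn": 105}},
]

index = {}
# Feeding the same events twice simulates at-least-once redelivery.
results = [apply_change(index, event) for event in events + events]
print(results)                      # [True, True, False, False]
print(index["o-1"]["doc"]["status"])  # shipped
```

Because the second pass is a no-op, redelivery and replay cannot move the index backwards to an older version of a row.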
Handling a bug in the search-index consumer:
- Fix the consumer code.
- Reset the consumer-group offset to the last-known-good (e.g. yesterday).
- Replay. Search index rebuilds. No DB touched; no other consumer affected.
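The recovery steps above can be sketched with an in-memory log. The list stands in for one Kafka partition and the list position for the offset; `buggy_upsert` and `fixed_upsert` are hypothetical handlers. In a real deployment the reset would be done with Kafka's consumer-group tooling while the group is stopped.

```python
log = [  # stand-in for one partition; list position plays the role of the offset
    {"order_id": "o-1", "status": "placed"},
    {"order_id": "o-2", "status": "placed"},
    {"order_id": "o-1", "status": "shipped"},
    {"order_id": "o-2", "status": "cancelled"},
]


def buggy_upsert(index: dict, event: dict) -> None:
    index[event["order_id"]] = event["status"].upper()  # the bug: wrong casing


def fixed_upsert(index: dict, event: dict) -> None:
    index[event["order_id"]] = event["status"]  # keyed overwrite, safe to re-apply


def consume(index: dict, handler, start: int, end: int) -> int:
    """Apply log[start:end] and return the new committed offset."""
    for event in log[start:end]:
        handler(index, event)
    return end


index = {}
offset = consume(index, fixed_upsert, 0, 2)       # healthy up to offset 2
offset = consume(index, buggy_upsert, offset, 4)  # bad deploy corrupts the index
corrupted = dict(index)

offset = 2                                        # reset to last-known-good
offset = consume(index, fixed_upsert, offset, len(log))
print(corrupted)  # {'o-1': 'SHIPPED', 'o-2': 'CANCELLED'}
print(index)      # {'o-1': 'shipped', 'o-2': 'cancelled'}
```

The replay only works because the handler overwrites by key; an append-style handler would need de-duplication before the same trick is safe.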
Cold-starting a new downstream (new warehouse table):
- Snapshot + incremental: Debezium takes a snapshot of the table, emits it as events, then starts tailing the WAL from the snapshot LSN. New consumer replays from the snapshot offset.
What we do NOT do:
- App does not publish to Kafka directly. App writes one transaction to one DB. That's it.
Good vs bad answer
Interviewer probe
“Your DB is the source of truth but you need the search index and cache to stay fresh. How?”
Weak answer
"After each DB write, the app also publishes to Kafka."
Strong answer
"Never dual-write — that's a drift bug waiting to happen. Instead, the app writes one transaction to Postgres. Debezium tails the WAL and publishes each committed change to Kafka. The search indexer and cache invalidator are two separate consumer groups on that topic. Consumers are idempotent. If a consumer bug corrupts the index, we fix the code, reset its offset, and replay — the DB is untouched. If we don't want the infra cost of Debezium, the outbox pattern — outbox table written in the same tx, tailed by a worker — gets us 80% of the value."
Why it wins: Rejects the dual-write anti-pattern, names the two viable options (CDC / outbox), and highlights replay as the operational win.
When it comes up
- A search index, cache, or warehouse must stay in sync with the DB
- The interviewer asks how derived read models stay fresh
- You are integrating microservices via events
- Anyone proposes "write to the DB and publish to Kafka"
Order of reveal
- 1. Reject dual-write. I never write to the DB and the broker in two separate operations; a crash between them drifts the two stores. That is the bug to design out from the start.
- 2. Outbox or CDC. Either an outbox table written in the same transaction and tailed by a worker, or log-tailing CDC (Debezium) reading the WAL. Both publish iff the commit happened.
- 3. DB stays the truth. Search, cache, and warehouse are derived views built from the event stream, never the source of truth.
- 4. Idempotent + replayable. Consumers are idempotent, and I can rebuild a broken downstream by resetting the consumer offset and replaying.
- 5. Project to business events. If raw row-level CDC leaks schema, I project to business events (OrderPlaced) in a stream processor so consumers do not couple to columns.
Signature phrases
- “Never dual-write to the DB and the broker — that is a drift bug waiting to happen.” — Names the anti-pattern interviewers want you to avoid.
- “The DB's transaction log is already my most reliable event stream.” — Captures the core insight of CDC in one line.
- “Downstreams are derived views I can rebuild by resetting the offset.” — Highlights replay as the operational superpower.
- “Outbox if I want zero new infra; Debezium if many services emit events.” — Shows you match the mechanism to the team’s scale.
Likely follow-ups
“Outbox or Debezium: how do you choose?”
Outbox when a single service emits events and you do not want new infrastructure: an extra table written in the same transaction, tailed by a worker. Debezium (CDC) when many services emit events and you already run Kafka: it tails the WAL with no app-side changes and captures every commit faithfully. Outbox gives you ~80% of the value with a fraction of the operational footprint; CDC wins at fleet scale.
“How is CDC different from event sourcing?”
CDC is DB-first, events-derived: rows are the source of truth and the change log is generated from commits. Event sourcing is events-first, DB-derived: the event log is the source of truth and current state is a projection you rebuild. Most systems should do CDC — event sourcing is a much bigger commitment that taxes every developer with replay-and-projection thinking, justified mainly when the audit trail is the product.
“Your raw CDC events couple every consumer to table columns. How do you decouple?”
Insert a translation layer: a stream processor (Kafka Streams / Flink) consumes the raw row-change events and emits business-level events — OrderPlaced, PriceChanged — with a stable schema. Consumers subscribe to those, not to the table shape. Now a DB migration that renames a column does not ripple into every downstream, because the projection absorbs it.
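A sketch of that translation layer as a plain function. It assumes a simplified row-change shape with both `before` and `after` images present (in Postgres, a full `before` image depends on the table's replica identity setting); the column names `id` and `total_cents` and the output field names are hypothetical.

```python
from typing import Optional


def to_business_event(change: dict) -> Optional[dict]:
    """Map a simplified row-level change to a stable business event, or None."""
    before, after = change.get("before"), change.get("after")
    if change["op"] == "c":
        return {"type": "OrderPlaced", "order_id": after["id"], "total_cents": after["total_cents"]}
    if change["op"] == "u" and before["total_cents"] != after["total_cents"]:
        return {
            "type": "PriceChanged",
            "order_id": after["id"],
            "old_total_cents": before["total_cents"],
            "new_total_cents": after["total_cents"],
        }
    return None  # row changes with no business meaning are dropped


created = to_business_event(
    {"op": "c", "before": None, "after": {"id": "o-1", "total_cents": 1000, "note": ""}}
)
repriced = to_business_event(
    {"op": "u",
     "before": {"id": "o-1", "total_cents": 1000, "note": ""},
     "after": {"id": "o-1", "total_cents": 900, "note": ""}}
)
ignored = to_business_event(
    {"op": "u",
     "before": {"id": "o-1", "total_cents": 900, "note": ""},
     "after": {"id": "o-1", "total_cents": 900, "note": "gift"}}
)
print(created["type"], repriced["type"], ignored)  # OrderPlaced PriceChanged None
```

If a migration renames `total_cents`, only this function changes; the `OrderPlaced` and `PriceChanged` schemas that consumers depend on stay put.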
Common mistakes
- Dual-writing to the DB and the broker: a classic bug with two writes, two failure modes, and eventual drift. Use outbox or CDC.
- Exposing raw row-level events: every downstream couples to every column change. Project to business-level events (OrderPlaced, PriceChanged) at the CDC layer or in a stream processor (Kafka Streams) to decouple schema.
- Retention too short to replay: if you can't replay the event stream, you can't rebuild a broken downstream. Keep Kafka retention long enough (days to weeks) to cover your worst-case recovery.
- Reaching for event sourcing when CDC would do: event sourcing is a commitment. Most teams want CDC (DB-first, events-derived), not event sourcing (events-first, DB-derived).
Practice drills
Explain outbox in 30 seconds.
In the same DB transaction as your business write, insert a row into an outbox table describing the event. A separate worker polls (or tails) outbox rows with status=pending, publishes them to Kafka, marks them sent. Atomicity: the business row and the outbox row commit together, so the event is published iff the business change committed.
Interviewer: "Your search index is out of sync after a deploy. What now?"
Confirm with a spot-check. If the cause is a consumer bug deployed recently: roll back consumer, reset its Kafka offset to before the bug, replay. If the cause is a missed event (dual-write drift, for example): rebuild from the DB — take a snapshot, emit it as events, replay. Long-term: move to CDC so snapshots and incrementals are a built-in replay mechanism.
When would you pick event sourcing over CDC?
When the event log IS the product — regulated audit trails in finance, medical records, banking. When you need time-travel and cross-event analytical queries as a first-class thing. Otherwise, CDC gives you 90% of the operational benefit (replay, materialised views) at a fraction of the developer-cognitive cost.
Cheat sheet
- Never dual-write. Outbox or CDC.
- Outbox = app writes extra row in same tx; worker tails.
- CDC (Debezium) = log tailing; every commit becomes an event.
- Event sourcing ≠ CDC. Much bigger commitment.
- Consumers are idempotent; replay via offset reset is the superpower.
- Retain Kafka long enough to cover rebuild scenarios.