Reading path · 7 stops · ~110 min

Data-intensive systems

Name: SystemRound

Storage choice, partitioning, replication, indexing. Everything it takes to make data-layer decisions that survive review.

For: Engineers prepping for data-heavy prompts (search, analytics, feeds, storage products)

After this path

Name the right store, the right partition key, the right indexes, and the right replication mode — and defend each.

1
Skill
Data model design
Entities, relationships, keys, normalization vs denormalization.
Why this, here: The entity model decides every downstream choice.
2
Skill
Storage choice justification
Picking SQL vs KV vs document vs blob vs time-series based on access patterns.
Why this, here: The framework for picking among SQL, KV, wide-column, and search stores.
3
Skill
Sharding & partitioning
Partition key selection, hot spots, rebalancing, consistent hashing.
Why this, here: The single most consequential call in distributed data.
Checkpoint
Defend a partition key for a multi-tenant SaaS’s events table. Now describe the hot partition that eventually appears and how you’d reshard without downtime. If the reshard story is blank, the call wasn’t load-bearing yet.
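One way to make the reshard half of this checkpoint concrete is a consistent-hash ring. The sketch below is illustrative only: the `HashRing` class, the shard names, the virtual-node count, and the composite `tenant:event` key format are all assumptions, not anything this path prescribes. It shows the property that makes online resharding tractable: when a shard is added, the only keys that change owner are the ones that move to the new shard.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, shards, vnodes=64):
        self._vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> shard name
        for shard in shards:
            self.add(shard)

    def add(self, shard: str) -> None:
        # Each shard claims several positions so load spreads more evenly.
        for i in range(self._vnodes):
            point = _hash(f"{shard}#{i}")
            self._owner[point] = shard
            bisect.insort(self._points, point)

    def lookup(self, key: str) -> str:
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


# Assumed key shape: tenant id plus event id, so one tenant spreads over shards.
ring = HashRing(["shard-a", "shard-b", "shard-c"])
keys = [f"tenant-{t}:event-{e}" for t in range(50) for e in range(20)]
before = {k: ring.lookup(k) for k in keys}

ring.add("shard-d")
after = {k: ring.lookup(k) for k in keys}

# Keys whose owner changed; each of them now lives on the new shard.
moved = [k for k in keys if before[k] != after[k]]
```

Keying on the tenant alone would keep a tenant's events together but recreates the hot partition when one tenant dominates traffic; the composite key here trades that locality for spread. Which side of that trade is right depends on the read pattern you are defending.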
4
Deep dive
Indexing strategies
B-tree, LSM, inverted, compound, and geo indexes tied back to access patterns.
Why this, here: Choosing between B-tree, LSM, and inverted indexes from first principles.
5
Skill
Replication & durability
Leader/follower, sync vs async replication, write quorum, RPO/RTO.
Why this, here: Quorum size and sync vs async replication determine the RPO and RTO you can commit to.
Checkpoint
For a payments store: sync replication to N replicas or async to 1? State the RPO you accept and the failure that forces the trade-off. Staff-plus candidates name the number.
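The trade-off in this checkpoint can be written down as simple quorum arithmetic. The sketch below is a simplified model, not a description of any particular database: `QuorumConfig` and its method names are hypothetical, and it assumes a replica acknowledges only after the write is durable on that replica. It compares "sync to all", "majority", and "async to one" for three replicas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    n: int  # total replicas
    w: int  # acknowledgements required before a write is confirmed
    r: int  # replicas consulted on a read

    def reads_see_latest_write(self) -> bool:
        # Read and write sets must share at least one replica: W + R > N.
        return self.w + self.r > self.n

    def write_failures_tolerated(self) -> int:
        # Replicas that can be down while writes still collect W acks.
        return self.n - self.w

    def copies_at_ack(self) -> int:
        # Durable copies that exist when the client is told "success"
        # (assumes each ack follows a durable write on that replica).
        return self.w


sync_all = QuorumConfig(n=3, w=3, r=1)   # every replica acks each write
majority = QuorumConfig(n=3, w=2, r=2)   # any two of three
async_one = QuorumConfig(n=3, w=1, r=1)  # leader acks, followers catch up

for name, cfg in [("sync_all", sync_all), ("majority", majority), ("async_one", async_one)]:
    print(
        name,
        cfg.reads_see_latest_write(),
        cfg.write_failures_tolerated(),
        cfg.copies_at_ack(),
    )
```

Under these assumptions, `async_one` confirms a write with a single durable copy, so losing that node before followers catch up loses acknowledged data and the RPO is greater than zero. `sync_all` has three copies at acknowledgement but stops accepting writes as soon as one replica is down. Naming the RPO means picking a point between those two and stating which failure you are pricing in.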
6
Deep dive
CDC and eventing
Outbox, change-data-capture, derived views, dual-write avoidance, and replay safety.
Why this, here: Keep derived stores in sync without dual writes.
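A minimal sketch of the outbox idea, using an in-memory SQLite database as a stand-in for the primary store. The table names, the `place_order` and `relay` functions, and the event shape are all hypothetical. The point is that the state change and the event row commit in one transaction, so there is no dual write, and that the consumer deduplicates on the outbox sequence number so a replay is safe.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
    CREATE TABLE outbox (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")


def place_order(order_id: int) -> None:
    # One transaction covers the state change and the event row,
    # so neither can commit without the other.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,)
        )
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "order_placed", "order_id": order_id}),),
        )


def relay(publish) -> int:
    """Publish pending rows in order; mark each only after publish succeeds."""
    rows = conn.execute(
        "SELECT seq, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(seq, json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)


seen = {}  # seq -> event; keyed by seq so a redelivered event is a no-op


def consumer(seq, event):
    seen[seq] = event


place_order(1)
place_order(2)
first = relay(consumer)   # delivers both pending events
second = relay(consumer)  # nothing left to deliver
```

Because a row is marked published only after `publish` returns, a crash between the two steps causes redelivery rather than loss, which is why the consumer side has to be idempotent. A log-based CDC pipeline replaces the polling `relay` with a reader of the database's change log, but the delivery guarantee to plan for is the same at-least-once one.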
7
Pattern
Search over content
Inverted index + ranking service. The hard part isn't indexing — it's relevance, freshness, and a rebuild path.
Why this, here: The canonical derived-view pattern.
