Technology·Databases

DynamoDB

A fully-managed, horizontally-scaling key-value and document store with single-digit-millisecond latency at any scale — when your access patterns are known and key-based, it removes sharding and capacity planning entirely.

Also worth naming: Amazon DynamoDB · DynamoDB Global Tables (multi-region) · DAX (in-memory accelerator)

~25 min read·15 sections

DynamoDB trades the flexibility of SQL for one superpower: predictable performance at unlimited scale with no servers to manage. The catch is that you must design the table around your queries up front — you model access patterns, not entities.

What it is

DynamoDB is AWS's managed NoSQL database — a key-value and document store that scales horizontally and automatically, with no nodes, shards, or capacity planning for you to operate. You get single-digit-millisecond reads and writes whether the table holds a megabyte or a petabyte, because DynamoDB transparently spreads data across internal partitions and adds more as you grow.

Every item lives under a primary key: either a simple partition key, or a composite of a partition key + a sort key. The partition key is hashed to decide which partition stores the item; the sort key orders items within a partition. That structure is the whole game — it makes "get this exact item" and "get all items for this partition key, ordered/ranged by sort key" extremely fast, and makes anything else (arbitrary filters, joins, ad-hoc queries) slow or impossible without extra indexes.

The interview framing: DynamoDB is the right call when the access patterns are known, key-based, and high-scale — a URL shortener's lookups, a user's session store, a shopping cart, a feed keyed by user. It is the wrong call when you need rich ad-hoc queries, joins, or strong multi-row transactions across a large relational model — that is Postgres. You don't choose DynamoDB to avoid SQL; you choose it because your queries are simple and your scale is large.

When to reach for it

Reach for this when…

Access patterns are known up front and are key-based (get/put by id, query by partition key)
You need massive, elastic scale with predictable low latency and zero ops
Write-heavy or spiky workloads where managing your own sharding would be painful
Simple item shapes: sessions, carts, user profiles, feeds, event records, idempotency keys

Not really this pattern when…

You need ad-hoc queries, joins, or aggregations across the data (that is SQL / Postgres)
Access patterns are unknown or will change a lot — you cannot pre-design the keys
You need rich multi-row ACID transactions across a complex relational model
Full-text search or analytics — pair with Elasticsearch / a warehouse instead

How it works

Three ideas explain how to use DynamoDB well:

1. The partition key decides everything. DynamoDB hashes the partition key to place an item, so items with the same partition key live together and are retrieved together. Choose a partition key that (a) matches your dominant query and (b) spreads load evenly. A low-cardinality or skewed key creates a hot partition — one partition taking disproportionate traffic — which throttles regardless of total table capacity.

Architecture diagram· The partition key hashes to a partition; the sort key orders within it

DynamoDB hashes the partition key to place an item on one of many internal partitions. Items that share a partition key are stored together, ordered by sort key — which is what makes "all items for this user, newest first" a single fast query.

2. You model access patterns, not entities. In SQL you normalise and join at read time. In DynamoDB you do the opposite: figure out every query you need, then design keys (and often a single table with overloaded keys) so each query is a single GetItem or Query. Denormalisation and duplication are expected — storage is cheap, and the goal is one fast lookup per access pattern.

3. Secondary indexes buy you more access patterns — at a cost. A Global Secondary Index (GSI) re-partitions the same items under a different key so you can query a second pattern (e.g. look up a user by email when the table is keyed by user_id). A GSI is a separate, asynchronously-maintained copy with its own capacity and eventual consistency — powerful, but not free.

Architecture diagram· A Global Secondary Index is a separate table on a different key

A GSI re-partitions the same items under a new partition key so you can query a second access pattern. It is maintained asynchronously (eventually consistent) and has its own provisioned capacity — it is a copy, not a free index.

Consistency and capacity round it out: reads are eventually consistent by default (cheaper, lower latency) or strongly consistent on request (within a region). Capacity is on-demand (pay per request, auto-scales — the usual interview default) or provisioned (you set RCU/WCU, cheaper at steady predictable load).

Performance envelope

DynamoDB performance envelope — the numbers to quote.

Dimension	Number	Why it matters
Latency	Single-digit ms (sub-ms with DAX)	Consistent regardless of table size — the headline feature
Scale	Effectively unlimited, automatic	No manual sharding; partitions split as data/throughput grows
Item size	Max 400 KB per item	Large blobs go to S3 with a pointer in the item
Per-partition limit	~3,000 RCU / 1,000 WCU	The hot-partition ceiling — why key distribution matters
Consistency	Eventual (default) or strong (per read, in-region)	Strong costs more capacity and slightly more latency
Capacity model	On-demand (auto) or provisioned (RCU/WCU)	On-demand for spiky/unknown; provisioned for steady cheap

Capabilities in interviews

Key-value & item lookups

Get or put a single item by primary key in single-digit milliseconds at any scale.

The bread-and-butter pattern. A URL shortener stores { short_code (PK), long_url, created_at } and every redirect is one GetItem on short_code — sub-10ms, whether you have a thousand or ten billion links:

text

PutItem  { PK: "abc123", long_url: "https://…" }
GetItem  { PK: "abc123" }   → one partition, one hop, ~5ms

No index tuning, no sharding, no capacity ceiling to plan. This is exactly where DynamoDB shines and a relational DB would need read replicas and a cache to match.

Choose this variant when

Lookups by a known id (short codes, session ids, user ids)
Idempotency-key storage
Anything that is fundamentally a hash-map at scale

Query by partition + sort key

Fetch a ranged, ordered slice of items that share a partition key in one call.

With a composite key, items under one partition key are ordered by sort key, so "latest N for this user" is one Query:

text

PK = user#42, SK = ts#<timestamp>
Query(PK="user#42", SK begins/between …, ScanIndexForward=false, Limit=20)

This serves feeds, message threads, time-series per device, and order history — any "items belonging to X, ordered by time/rank" pattern. Pagination uses the last evaluated key as a cursor. Model the sort key to encode the ordering and ranges you need.

Choose this variant when

Feeds / timelines keyed by user
Message or event history per entity
Time-series per device or per tenant

Secondary access patterns (GSI)

Add a Global Secondary Index to query the same items by a different key.

When you need a second way in — look up a user by email when the table is keyed by user_id — you add a GSI on email. DynamoDB maintains it asynchronously:

text

Base:  PK = user_id
GSI:   PK = email   → Query(email = "a@b.com")

Budget for it: a GSI is a separate copy with its own capacity, it is eventually consistent, and over-indexing multiplies write cost. Add indexes for real access patterns, not "just in case."

Choose this variant when

A second known query path on the same data
Looking items up by an alternate attribute
Sparse indexes (only items with the attribute are indexed)

Change streams & event-driven reactions

DynamoDB Streams emit every item change for downstream processing.

Every write can emit a change record to a DynamoDB Stream, which a Lambda or consumer reads to trigger side effects — update a search index, fan out a notification, maintain an aggregate, or replicate cross-region (Global Tables are built on this):

text

write item → Stream → Lambda → (index in Elasticsearch / send notification / update counter)

This turns the database into an event source without a separate CDC pipeline, and is how you keep derived stores in sync with a DynamoDB system of record.

Choose this variant when

Reacting to writes (search indexing, notifications, aggregates)
Cross-region replication via Global Tables
Outbox-style event emission from the data layer

Operating knobs

Partition key design

The single most important decision. Pick a high-cardinality key that spreads writes evenly and matches your dominant query. For naturally skewed keys (a celebrity, a hot tenant), write-shard by appending a suffix (user#42#3) and scatter-read across the suffixes. Get this wrong and you get throttled hot partitions no amount of table capacity fixes.

On-demand vs provisioned capacity

On-demand bills per request and absorbs spikes automatically — the right default for unknown or bursty load and the simplest interview answer. Provisioned (RCU/WCU with autoscaling) is cheaper for steady, predictable traffic. Provisioned without autoscaling risks throttling on spikes.

Single-table vs multi-table design

Advanced DynamoDB packs multiple entity types into one table with overloaded, generic keys (PK/SK) so related items co-locate and one query returns a whole object graph. It is powerful and minimises round-trips, but it is harder to evolve. For interviews, knowing it exists and why (co-location, fewer requests) is usually enough.

Consistency choice

Eventually consistent reads are the default — cheaper and lower latency, fine for most reads. Request strongly consistent reads only where you must read your own write immediately within a region; they cost more capacity and do not work on GSIs.

Versus the alternatives

DynamoDB vs the alternatives.

Dimension	DynamoDB	PostgreSQL	Cassandra
Model	Managed key-value / document	Relational, rich SQL	Self-managed wide-column
Scaling	Automatic, unlimited, zero-ops	Vertical + replicas; sharding is manual	Linear by adding nodes (you operate it)
Queries	Key-based; pre-designed access patterns	Ad-hoc joins, aggregations, filters	Key-based, partition-aware
Ops burden	None (fully managed)	Moderate (you run it / RDS)	High (you run the cluster)
Best for	Known key-access at scale, ops-light	Default for relational + transactions	Write-heavy at scale, multi-region, self-hosted

Failure modes & gotchas

Hot partition from a skewed key

A low-cardinality or skewed partition key concentrates traffic on one internal partition, which throttles at ~3,000 RCU / 1,000 WCU no matter how much capacity the table has. Choose a high-cardinality key; write-shard known-hot keys with a suffix and scatter-gather on read.

Designing for entities, then needing a Scan

If you model the table like SQL and later need a query the keys do not support, your only option is a full-table Scan — slow and expensive at scale. Enumerate the access patterns first and design keys (and GSIs) for each; a Scan in production is a design smell.

Assuming GSIs are strongly consistentAdvanced

GSIs are maintained asynchronously and are always eventually consistent — a read of a GSI right after a write may miss it. Do not put a GSI on a read-your-write-critical path; use the base table's strongly-consistent read for that.

The 400 KB item limitAdvanced

Items max out at 400 KB. Large payloads (images, documents, big blobs) belong in S3 with just a pointer stored in DynamoDB. Trying to stuff large values into items leads to throttling and cost blowups.

Over-indexing write costAdvanced

Every GSI is another copy that every write must update, multiplying write capacity consumption. Add indexes only for access patterns you actually serve; "just in case" indexes silently inflate cost.

In production

Amazon

Prime Day on DynamoDB — trillions of calls, single-digit-ms latency

DynamoDB is the database behind much of Amazon.com's own checkout, cart, and catalog, and Prime Day is its signature stress test. In a recent Prime Day, Amazon systems made trillions of API calls to DynamoDB, peaking at over 126 million requests per second, while maintaining single-digit-millisecond latency — with no capacity pre-warming dance, because the table scales transparently.

This is the literal embodiment of the interview pitch: the access patterns are known and key-based (cart by customer, item by id), the scale is enormous and extremely spiky, and the team operates zero sharding or capacity infrastructure. DynamoDB grew up inside Amazon precisely to remove the operational pain of scaling relational databases for these high-throughput, key-access workloads.

Takeaway: For known key-based access at extreme, spiky scale, DynamoDB delivers flat single-digit-ms latency with no sharding to operate — the exact problem it was built to remove.

Disney+

Launch-scale bookmarks and watchlists with no capacity guesswork

Disney+ stores per-user state — bookmarks (resume points), watchlists, viewing history — in DynamoDB, and famously launched to tens of millions of subscribers far faster than projected without a database scaling crisis. The access pattern is textbook single-table key access: partition by user_id, query a user's items by sort key, every operation a fast point lookup or query.

What made DynamoDB the right call was elastic, automatic scaling under an unpredictable launch curve — they could not have capacity-planned a launch that beat forecasts several times over, and on-demand capacity meant they didn't have to. The takeaway engineers cite is that modeling the access patterns up front (not the entities) and letting DynamoDB handle scale let a small team support a hyperscale launch.

Takeaway: Model access patterns up front and let on-demand capacity absorb an unpredictable launch — DynamoDB removes the capacity-planning gamble for key-access workloads.

Good vs bad answer

Interviewer probe

“For a URL shortener doing 10B redirects/month, what database stores the code → URL mapping?”

Weak answer

"A relational database like MySQL with the short code as the primary key. If reads get heavy I'll add read replicas and a cache in front."

Strong answer

"DynamoDB. The access pattern is dead simple and key-based — GetItem by short_code — and the scale is large and read-heavy, which is exactly DynamoDB's sweet spot: single-digit-millisecond lookups at any size with no sharding or capacity planning to operate. Table is { short_code (PK), long_url, created_at, owner }; every redirect is one hash lookup on a well-distributed key (the codes are random, so no hot partition). I'd use on-demand capacity so launch spikes auto-absorb, and put DAX or a CDN in front for the hottest codes to shave latency further. I would not reach for SQL here — there are no joins, no ad-hoc queries, nothing relational; I'd just be signing up to run replicas and shard a system DynamoDB scales for free. If we later needed 'all links by owner', that's a GSI on owner, accepting its eventual consistency."

Why it wins: Matches the engine to the access pattern (key-based, read-heavy, huge scale), names the key and why it distributes, picks the capacity mode with a reason, anticipates the second access pattern as a GSI with its caveat, and explicitly rejects SQL for the right reason rather than dogma.

Interview playbook

Interview playbook2–3 min on the data-store decision; deeper if the workload is the core of the problem

When it comes up

Massive scale with simple, known, key-based access patterns
Write-heavy or spiky workloads where you do not want to operate sharding
Sessions, carts, feeds, idempotency keys, user profiles at scale
The interviewer asks "SQL or NoSQL?" and the queries are genuinely key-based

Order of reveal

1
1. Justify by access pattern. The queries here are key-based and high-scale, so I model access patterns and use DynamoDB — not because I am avoiding SQL.
2
2. Name the keys. Partition key is X (high-cardinality, matches the dominant query); sort key encodes the ordering/ranges I need.
3
3. Call hot-partition risk. If a key can get hot, I write-shard it with a suffix and scatter-read; otherwise the key distributes evenly.
4
4. Capacity + consistency. On-demand capacity for spiky load; eventual reads by default, strong only where I read my own write.
5
5. Extra patterns via GSI / Streams. A second query path is a GSI (eventually consistent); reactions to writes go through DynamoDB Streams.

Signature phrases

“In DynamoDB you model access patterns, not entities.”

“The partition key must be high-cardinality and match the dominant query.”

“Single-digit-millisecond latency at any scale, zero sharding to operate.”

“A GSI is a separate, eventually-consistent copy — not a free index.”

“In DynamoDB you model access patterns, not entities.” — The defining mindset shift from SQL.
“The partition key must be high-cardinality and match the dominant query.” — Shows you know where hot partitions come from.
“Single-digit-millisecond latency at any scale, zero sharding to operate.” — Names the actual reason to pick it.
“A GSI is a separate, eventually-consistent copy — not a free index.” — Demonstrates you understand the cost.

Likely follow-ups

?“How do you handle a celebrity / hot key?”Reveal

Write-sharding: append a small random or bucketed suffix to the partition key (user#42#0 … user#42#9) so the writes spread across ten partitions instead of hammering one. On read you scatter-gather across the suffixes and merge. For hot reads specifically, front the item with DAX or an application cache so most reads never hit the partition at all. The goal is to keep any single partition under the ~3,000 RCU / 1,000 WCU ceiling.

?“You need a query the keys do not support. Options?”Reveal

Add a GSI on the attribute you want to query, which re-partitions the items under that new key — accepting that it is eventually consistent and consumes its own write capacity. If it is a one-off analytical query rather than an access pattern, export to S3 and query with Athena, or stream changes to a system built for ad-hoc queries. What I avoid is a production Scan, which reads the whole table. If I find myself wanting many ad-hoc queries, that is a signal the workload may not be a DynamoDB fit at all.

?“Strong vs eventual consistency here — which and why?”Reveal

Default to eventually consistent reads: cheaper, lower latency, and fine for the vast majority of reads where a few hundred milliseconds of staleness is invisible. Use strongly consistent reads only on the specific path where a user must immediately read their own write within the region (e.g. read-back right after an update). Note strong reads cost more capacity and are not available on GSIs, so I keep them targeted rather than blanket-on.

Worked example

Setup. Design the storage for a shopping cart service: hundreds of millions of users, massive spiky traffic (think a flash sale), every operation is "get/put this user's cart" or "add an item." Single-digit-millisecond latency, and it must not fall over under a 10× spike.

The move. This is the DynamoDB sweet spot — the access pattern is purely key-based and the scale is large and spiky. Table carts with partition key = `user_id`; the whole cart is one item (a list of line items as a document), or, if carts can be large, partition key user_id + sort key item_id so each item is its own row and "add item" is a single PutItem. Either way every operation is one hash lookup on a high-cardinality key (user ids distribute evenly — no hot partition), so latency is ~5 ms regardless of table size.

Capacity + consistency. I use on-demand capacity so the flash-sale spike auto-absorbs without me pre-provisioning for peak — exactly the workload on-demand exists for. Reads are eventually consistent by default (a cart read a few hundred ms stale is invisible), and I request a strongly consistent read only on the read-back right after checkout where the user must see their final cart.

The 400 KB rule + large carts. An item caps at 400 KB. A normal cart is tiny, but to be safe against a pathological cart I use the user_id + item_id row-per-item model so no single item grows unbounded, and "get cart" becomes a Query on the partition.

Reacting to writes. Cart abandonment emails and inventory holds are driven off DynamoDB Streams → Lambda, so the data layer emits change events without a separate CDC pipeline.

What breaks. The risk is a future query the keys don't support — "all carts containing product X" for a recall. That is a Scan (slow/expensive), so I'd add a GSI keyed on product_id (eventually consistent, its own capacity) rather than scan. And I keep an eye on any write-sharding need only if a single user could somehow get hot, which carts don't.

The result. Sub-10 ms cart operations at any scale, automatic spike absorption with zero capacity planning, no sharding or servers to operate, and change-driven side effects via Streams — the workload DynamoDB is purpose-built for.

Cheat sheet

•Managed key-value / document store: single-digit-ms latency at unlimited scale, zero sharding ops.
•Model access patterns, not entities. Enumerate queries first, design keys for each.
•Partition key = hash placement + must be high-cardinality. Sort key = order/range within it.
•Hot partition ceiling ~3K RCU / 1K WCU — write-shard skewed keys, cache hot reads (DAX).
•GSI = separate eventually-consistent copy on a new key, own capacity. Add for real patterns only.
•Capacity: on-demand (spiky/unknown, default) vs provisioned+autoscale (steady, cheaper).
•Reads eventual by default; strong on request (in-region, not on GSIs).
•Item ≤ 400 KB → large blobs in S3 with a pointer. Streams for change-driven side effects.

Drills

Why does DynamoDB give constant latency whether the table is 1 GB or 1 PB?Reveal

Because every access goes through the partition key hash to exactly one internal partition, and DynamoDB transparently splits partitions as data and throughput grow. A lookup is always "hash the key, go to one partition, read the item" — that work does not increase with table size. The trade is that you only get this for key-based access; anything requiring a scan or a non-key filter loses the guarantee.

Interviewer: "design the keys for a chat app's messages."Reveal

Partition key = conversation_id so all messages in a conversation co-locate; sort key = timestamp (or a monotonic message id) so they are ordered. Then "latest 50 messages in a conversation" is one Query(PK=conversation_id, ScanIndexForward=false, Limit=50) with the last key as the pagination cursor. If you also need "all conversations for a user," that is a separate access pattern — a GSI keyed by user_id, or a separate item type in a single-table design.

When is DynamoDB the wrong choice?Reveal

When the access patterns are unknown or change frequently (you cannot pre-design keys), when you need ad-hoc queries, joins, or aggregations, or when you need rich multi-row transactions across a relational model. Those are Postgres strengths. DynamoDB is also overkill at tiny scale where a single Postgres would do everything with more flexibility. The decision is about the shape of the queries, not "SQL bad, NoSQL good."

What it is

DynamoDB

Also worth naming: Amazon DynamoDB · DynamoDB Global Tables (multi-region) · DAX (in-memory accelerator)

~25 min read·15 sections

Dimension

Number

Why it matters

Latency

Single-digit ms (sub-ms with DAX)

Consistent regardless of table size — the headline feature

Scale

Effectively unlimited, automatic

No manual sharding; partitions split as data/throughput grows

Item size

Max 400 KB per item

Large blobs go to S3 with a pointer in the item

Per-partition limit

~3,000 RCU / 1,000 WCU

The hot-partition ceiling — why key distribution matters

Consistency

Eventual (default) or strong (per read, in-region)

Strong costs more capacity and slightly more latency

Capacity model

On-demand (auto) or provisioned (RCU/WCU)

On-demand for spiky/unknown; provisioned for steady cheap

Dimension

DynamoDB

PostgreSQL

Cassandra

Model

Managed key-value / document

Relational, rich SQL

Self-managed wide-column

Scaling

Automatic, unlimited, zero-ops

Vertical + replicas; sharding is manual

Linear by adding nodes (you operate it)

Queries

Key-based; pre-designed access patterns

Ad-hoc joins, aggregations, filters

Key-based, partition-aware

Ops burden

None (fully managed)

Moderate (you run it / RDS)

High (you run the cluster)

Best for

Known key-access at scale, ops-light

Default for relational + transactions

Write-heavy at scale, multi-region, self-hosted