Pattern·Specialised shapes

Geospatial / proximity lookup

Geohash / S2 / H3 spatial index for "nearby X" queries. Straight lat/lng on a B-tree dies at a few thousand rows.

~55 min read·16 sections

Ride-hailing, food delivery, "restaurants near me" — always a geospatial index question. Candidates who answer `WHERE lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?` have already failed it.

Geohash / S2 / H3 spatial index for "nearby X" queries. Straight lat/lng on a B-tree dies at a few thousand rows.

Architecture diagram· Durable truth + Redis GEO hot tier

PostGIS holds durable state; a per-region Redis GEO tier serves sub-millisecond proximity reads; dispatch falls back to PostGIS if Redis is gone.

You’re looking at this pattern when

"Find N nearest" queries
Bounded-radius search ("within 2 km")
Real-time location tracking + dispatch

Shows up in

Uber / Lyft driver matching
DoorDash / Deliveroo restaurant discovery

Try it on

Uber Dispatch

What most people get wrong

Locations are fixed and few (< ~1000) — load into memory and linear-scan

When to reach for this

Reach for this when…

"Find N nearest" queries
Bounded-radius search ("within 2 km")
Real-time location tracking + dispatch
Map-viewport (bounding-box) queries
Geofencing / "is this point inside region X?"

Not really this pattern when…

Locations are fixed and few (< ~1000) — load into memory and linear-scan
Lookup is by region id, not coordinates (that's a plain index)
You only ever need exact-point lookup by key (primary index)

Good vs bad answer

Interviewer probe

“Find the nearest available drivers within 2 km of a pickup point.”

Weak answer

"SELECT * FROM drivers WHERE lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?."

Strong answer

"That's two B-trees intersected — it fetches whole latitude and longitude bands and dies past a few thousand rows, because the cost is proportional to the bands, not the answer. I'd use a real spatial index. Durable truth is PostGIS with a GiST index on a geography column — ST_DWithin returns matches in milliseconds and KNN via <-> gives nearest-K. For the dispatch hot path I put a Redis GEO tier in front, sharded by city, for sub-millisecond GEOSEARCH; driver pings update Postgres (truth) and Redis (fast path), and if Redis dies dispatch falls back to PostGIS at higher latency while the hot tier rebuilds. The query is three stages: spatial recall from Redis, then filter to available + matching vehicle + non-stale (I attach last_seen/TTL), then rank by ETA — road-aware, not straight-line — and offer the best driver. Near a shard boundary I query the adjacent region shard too. At planet scale I'd move the hot path to H3 cells over a distributed store, query kRing neighbours, and split hot cells like stadium exits to finer resolution."

Why it wins: Rejects BETWEEN with the *reason*, names the right index and the geography column, layers durable truth + rebuildable hot tier with a fallback, separates recall/filter/rank, handles staleness and shard boundaries, uses ETA over straight-line distance, and calibrates to planet scale. The weak answer is a full scan with no spatial index, no freshness, and no ranking.

Cheat sheet

•Never BETWEEN on lat/lng — two B-trees intersected, cost scales with the dataset.
•Default durable store: PostGIS GiST on a geography column (not geometry).
•Hot path: Redis GEO for sub-ms GEOSEARCH; shard by region.
•Planet scale: H3 / S2 cells over a distributed store (Uber/Google territory).
•Boundary fix: query the target cell + neighbour ring, then filter true distance.
•Always Haversine-filter after the cell/box query — cells are approximations.
•Hot tier is a rebuildable cache; durable store owns truth and is the fallback.
•Dispatch = spatial recall → business filter → rank by ETA (not straight-line).
•Attach last_seen/TTL; exclude stale objects; tune heartbeat cadence.
•Dense cells are hot partitions hidden by average QPS — split to finer resolution.
•Resolution: coarse = fewer cells, more candidates; fine = more cells, wider ring.
•Scale tiers: <10M single PostGIS; 10–100M sharded PostGIS + Redis; 100M+ H3/S2.

Core concept

A spatial index answers "find all points within radius R of (lat, lng)" — or "the K nearest" — without scanning the dataset. The reason it needs special machinery is that two ordinary B-trees don't compose into a 2-D index: indexing latitude and longitude separately lets you fetch everything in a latitude band and everything in a longitude band, but their intersection is still an enormous candidate set spread far from your point.

Architecture diagram· Why two B-trees are not a spatial index

A lat/lng BETWEEN query fetches everything in a latitude band ∪ a longitude band and intersects — a huge candidate set far from the point.

Real spatial indexes solve this with one of two families:

Tree-based (R-tree / GiST in PostGIS): recursively partitions space into bounding boxes, so a query walks only the boxes that overlap the search region.
Cell / grid-based (geohash, Google S2, Uber H3): map each point to a hierarchical cell id (a string or integer). "Nearby" becomes "same or neighbouring cell," reducing proximity to a prefix/id lookup that any key-value store can serve.

The main systems, by job:

PostGIS (Postgres + GiST/SP-GiST): R-tree-based, handles points and polygons. The default durable truth store when you're already on Postgres.
MongoDB 2dsphere: JSON-native, simple API — fine for most apps.
Redis GEO: a geohash-backed sorted set with sub-millisecond GEOSEARCH/GEORADIUS. The ideal hot dispatch tier.
Elasticsearch geo_point: combine geo with full-text and aggregations.
S2 / H3: cell libraries for planet-scale, uniform-cell custom infrastructure (Uber territory).

Architecture diagram· Query the cell plus its 8 neighbours

Two points near a cell edge have different cell ids. Always search the target cell and its ring, then filter by true distance.

The boundary trap is the single most-tested subtlety. Two points 50 m apart but on opposite sides of a cell edge get different cell ids. Querying only the target cell silently misses them. The fix is universal across geohash/S2/H3: query the target cell plus its ring of neighbours, then filter by the true (Haversine) distance. State this unprompted and you've signalled you actually understand grid indexing rather than having memorised "use geohash."

At scale, separate truth from the hot path. A ride-hailing system persists every driver ping to a durable store (PostGIS) for recovery and analytics, while a Redis GEO tier — sharded by city/region so each shard stays small — serves the sub-millisecond dispatch loop. Redis is a rebuildable cache, not the truth: if it dies, dispatch falls back to PostGIS at higher latency and the hot tier is repopulated from the durable store.

Architecture diagram· Durable truth + Redis GEO hot tier

PostGIS holds durable state; a per-region Redis GEO tier serves sub-millisecond proximity reads; dispatch falls back to PostGIS if Redis is gone.

Interview walkthrough

Worked example: nearest available drivers within 2 km of a pickup

Architecture diagram· Durable truth + Redis GEO hot tier

PostGIS holds durable state; a per-region Redis GEO tier serves sub-millisecond proximity reads; dispatch falls back to PostGIS if Redis is gone.

Step 0 — reject the naive query. lat/lng BETWEEN is two independent B-tree lookups intersected; it fetches whole bands and its cost scales with the dataset, not the answer. Use a spatial index.

Step 1 — durable truth. PostGIS with a GiST index on a geography column. ST_DWithin(location, pickup, 2000) returns everyone within 2 km in milliseconds; ORDER BY location <-> pickup LIMIT k gives nearest-K, both index-backed. geography (not geometry) so distances are geodesically correct.

Step 2 — hot dispatch tier. Real-time matching needs sub-millisecond reads, so put a Redis GEO tier in front, sharded by city. Drivers GEOADD their position every few seconds; the update writes to Postgres (truth) and Redis (fast path). Dispatch runs GEOSEARCH ... BYRADIUS 2 km against the city shard.

Architecture diagram· Dispatch: spatial recall → business filter → rank

Spatial index returns nearby candidates; business rules (availability, vehicle, ETA) filter; a scorer ranks the survivors.

Step 3 — the matching pipeline (recall → filter → rank).

Recall: GEOSEARCH returns all candidates within (a slightly generous) 2 km.
Filter: keep only drivers who are available, match the requested vehicle type, and are not stale — each position carries last_seen, and anything older than ~10 s is excluded.
Rank: score survivors by ETA (road- and traffic-aware, not straight-line — a driver across an un-bridged river is farther than the metres suggest) plus rating and acceptance likelihood, then offer the trip to the best one.

Step 4 — boundaries and freshness.

A pickup near a city-shard edge also queries the neighbouring shard, then merges and distance-filters — the neighbour-ring idea at shard granularity.
Heartbeat cadence is tuned: faster pings on-trip or near likely matches, slower when idle, to balance battery and write load against accuracy. At fleet scale the ping firehose is ingested through a stream and only the latest position is materialised into Redis.

Architecture diagram· Location freshness via TTL + last_seen

Each location carries a last_seen timestamp; dispatch excludes stale drivers; heartbeat cadence trades battery for accuracy.

Step 5 — failure and scale.

Redis down → dispatch falls back to PostGIS (slower but correct); Redis repopulates from truth.
Dense hot cell (stadium exit) → split it to a finer resolution to spread load.
Past ~100M points / global QPS → move the hot path to H3 cells over a distributed store, querying kRing neighbours.

Result. Sub-millisecond candidate recall, road-aware ranking, fresh-only drivers, graceful degradation when the cache dies, and a clean path from one PostGIS node to planet-scale cell sharding — all anchored on a durable truth store the hot tier can always be rebuilt from.

Interview playbook

Interview playbook5-7 minutes in a 45-minute round: 1 min rejecting BETWEEN with the reason, 2 min on index choice + truth/hot-tier split, 2 min on the recall/filter/rank pipeline and boundaries, 1-2 min on freshness and planet-scale cells.

When it comes up

The prompt asks for nearest, nearby, radius, map viewport, dispatch, or geofencing
Objects (drivers, couriers, users) move and are searched by coordinate
Plain indexes on latitude and longitude would scan too much data
A ride-hailing / delivery / "near me" discovery system appears

Order of reveal

1
Reject naive range scans. Two B-trees on lat/lng are not a spatial index — they fetch whole bands and the cost scales with the dataset, not the answer.
2
Pick the index by scale. PostGIS for durable truth, Redis GEO for the sub-ms hot path, H3/S2 cells for planet scale.
3
Handle cell boundaries. Query the target cell plus its neighbour ring, then filter by true Haversine distance.
4
Separate truth from hot tier. The hot tier is a rebuildable cache; the durable store owns state and is the fallback.
5
Make dispatch recall → filter → rank. Spatial index gives candidates; filter by availability and staleness; rank by ETA, not straight-line distance.
6
Address freshness. last_seen/TTL excludes stale objects; heartbeat cadence trades battery and write load for accuracy.

Signature phrases

“Query the cells, then filter the distance”

“Redis GEO is the hot tier, not the truth”

“Cost scales with the bands, not the answer”

“ETA, not straight-line distance”

“Resolution is a fan-out vs precision dial”

“last_seen makes location perishable”

“Query the cells, then filter the distance” — Names the correct two-step spatial workflow and the boundary fix.
“Redis GEO is the hot tier, not the truth” — Shows the durable-truth-plus-rebuildable-cache instinct.
“Cost scales with the bands, not the answer” — Explains precisely why BETWEEN fails.
“ETA, not straight-line distance” — Shows distance is a proxy and ranking is road-aware.
“Resolution is a fan-out vs precision dial” — Connects H3/S2 cell size to candidate count and ring width.
“last_seen makes location perishable” — Names freshness as a first-class concern.

Likely follow-ups

?“What if the pickup is near a shard / cell boundary?”Reveal

Query the neighbouring cells (kRing) or adjacent region shards as well, merge the candidates, and filter by true distance. Boundaries are an internal concern — the system owns the overlap so the user never sees a missed neighbour. For nearest-K I expand rings outward until no unsearched ring could contain a closer point than the current K-th result.

?“What if driver pings are stale?”Reveal

Every position carries last_seen (and often a TTL so it auto-expires). Dispatch excludes anyone older than a threshold — a 40-second-old position sends a rider to where the car was. I tune heartbeat cadence adaptively: faster on-trip or near a likely match, slower when idle in a quiet area, to balance battery and write load against accuracy. At fleet scale I ingest the ping firehose through a stream and only materialise the latest position.

?“A downtown cell is melting while global QPS looks fine. What do you do?”Reveal

That's a hot partition hidden by averages, so I watch per-cell query and update rates, not just global QPS. Mitigations stack: shard by region to isolate the metro, split the hot cell to a finer resolution so its load spreads across more partitions, and cache the popular viewport query with a short TTL since many users request the same area at once.

?“Why not just sort everyone by distance?”Reveal

Computing distance to every object is O(N) per query and ignores the index entirely — it's the BETWEEN problem in a different shape. The spatial index limits the candidate set to objects actually near the point (cells/box), and only then do I compute exact distance on that small set. Recall is cheap and index-backed; the expensive distance/ETA math runs on a bounded candidate set.

Canonical examples

→Uber / Lyft driver matching
→DoorDash / Deliveroo restaurant discovery
→Yelp / Google Maps "nearby" search
→Pokémon Go / AR proximity triggers
→ATM / store locator, geofenced notifications

Variants

PostGIS (GiST / SP-GiST)

R-tree spatial index in Postgres — durable truth, points and polygons, ST_DWithin in milliseconds.

Architecture diagram· Durable truth + Redis GEO hot tier

PostGIS holds durable state; a per-region Redis GEO tier serves sub-millisecond proximity reads; dispatch falls back to PostGIS if Redis is gone.

The default when you're already on Postgres. A geography column with a GiST index supports ST_DWithin (radius), <-> KNN ordering (nearest-K), and full polygon operations (point-in-region, intersections). One system gives you durability, transactional consistency with the rest of your data, and rich geometry.

The critical detail is `geography` vs `geometry`: geometry does planar (flat-plane) math, which is wrong for lat/lng over any real distance; geography does geodesic math on the sphere, which is correct. Mixing them silently produces wrong distances. Use geography for lat/lng proximity.

PostGIS comfortably handles up to tens of millions of points on one well-indexed node. Past that, or when you need sub-millisecond dispatch latency, it becomes the durable truth behind a faster hot tier rather than the hot path itself.

Pros

+Durable, transactional, and handles polygons + complex shapes
+One system — no separate store to keep in sync
+ST_DWithin and KNN (<->) both index-backed

Cons

−Higher latency than an in-memory hot tier
−Single index strains past tens of millions of points
−geography vs geometry mistakes break distances silently

Choose this variant when

You're already on Postgres and want one durable store
You need polygons / geofencing, not just points
Up to tens of millions of points, moderate QPS

Redis GEO hot tier

Geohash sorted set with sub-millisecond GEOSEARCH — the dispatch fast path, sharded by region.

Architecture diagram· Shard the hot tier by region

A single global index outgrows a node; partition by city/region so each Redis shard stays small and dispatch routes by pickup location.

Redis GEO stores points in a sorted set keyed by their geohash, so GEOSEARCH ... BYRADIUS returns nearby members in sub-millisecond time. This is exactly what a real-time dispatch loop needs: drivers GEOADD their position every few seconds, and each rider request runs a radius search against the hot set.

Roughly 1M points fit in ~200 MB, so shard by region (city, metro, country) to keep each shard small, hot, and independently scalable — and so a dense downtown doesn't make one shard the bottleneck. Redis is a derived cache, not the truth: it's rebuilt from the durable store on failure, and dispatch falls back to PostGIS (slower) while it repopulates. Attach a TTL / last_seen so stale driver positions age out automatically.

Pros

+Sub-millisecond radius queries for the dispatch loop
+Trivial GEOADD / GEOSEARCH API, JSON-free
+Region sharding keeps each shard small and hot

Cons

−Not durable — a derived cache that must be rebuildable
−Points only; no polygons or complex geometry
−Dense regions create hot shards needing finer splits

Choose this variant when

Real-time dispatch needs sub-ms proximity reads
High write rate of moving objects (drivers, couriers)
You already have a durable truth store behind it

H3 / S2 cell index

Hierarchical cell ids over a key-value store for planet-scale, uniform-cell custom indexing.

Architecture diagram· H3 resolution is a fan-out vs precision dial

Coarse cells mean fewer cells to query but more candidates to filter; fine cells mean precise candidates but a wider neighbour ring.

Google's S2 (spherical quadtree, square-ish cells) and Uber's H3 (hexagonal grid) reduce proximity to cell membership: each point maps to a cell id at a chosen resolution, "nearby" is "same or neighbouring cell" (kRing in H3), and the cell id is just a key in any distributed KV store. This decouples spatial indexing from any single database and scales horizontally to the whole planet.

Hexagons (H3) have a nice property: all six neighbours are equidistant from the centre, which makes ring-based "expand the search" logic uniform — unlike a square grid where edge and corner neighbours differ. Resolution is the key dial: coarse cells mean fewer cells to query but more candidates to distance-filter; fine cells mean precise candidates but a wider neighbour ring. You choose resolution by the query radius and the local density. This is the most application-logic-heavy option — you own the cell math, the neighbour expansion, and the distance filter — so reach for it only at the scale where PostGIS + Redis stops being enough.

Pros

+Planet-scale, store-agnostic — cell id is just a KV key
+Uniform hex neighbours simplify ring expansion (H3)
+Decouples spatial indexing from any single DB

Cons

−Most application logic — you own cells, rings, and filtering
−Resolution choice is a real precision-vs-fan-out trade-off
−Overkill below hundreds of millions of points

Choose this variant when

Hundreds of millions of points / planet-scale dispatch
You need a store-agnostic, horizontally-sharded index
Uniform cell analytics (surge zones, supply heatmaps)

Elasticsearch geo_point

Geo filtering fused with full-text and aggregations when search and location combine.

Architecture diagram· Dispatch: spatial recall → business filter → rank

Spatial index returns nearby candidates; business rules (availability, vehicle, ETA) filter; a scorer ranks the survivors.

When a query is "vegan restaurants within 2 km, open now, rated 4+," you're combining geo, full-text, structured filters, and facets — and Elasticsearch's geo_point / geo_distance query does it in one place alongside BM25 and aggregations. It indexes points with a BKD-tree and supports radius, bounding-box, and polygon filters as ordinary clauses.

As with all search engines, treat the index as a derived view of a durable truth store, fed asynchronously (CDC/outbox), never as the source of truth. This is the right call when location is one filter among several in a discovery experience, rather than the high-write dispatch loop Redis GEO is built for.

Pros

+Geo + full-text + facets + aggregations in one query
+Bounding-box, radius, and polygon filters built in
+Great for discovery UX (filter + sort + facet)

Cons

−A derived index to keep fresh, not durable truth
−Heavier write path than Redis for moving objects
−Operational weight of a search cluster

Choose this variant when

Geo is one filter among text + facets in a discovery UI
Low-to-moderate location update rate (venues, not drivers)
You already run Elasticsearch for search

Scaling path

v1 — PostGIS single spatial index

Correct proximity queries from day one with one durable store.

Add a geography column, a GiST index, and query with ST_DWithin for radius or ORDER BY location <-> point LIMIT k for nearest-K. Both are index-backed and return in milliseconds. Durable, transactional, and handles polygons if you need geofencing.

Architecture diagram· Durable truth + Redis GEO hot tier

PostGIS holds durable state; a per-region Redis GEO tier serves sub-millisecond proximity reads; dispatch falls back to PostGIS if Redis is gone.

This is genuinely the right answer up to tens of millions of points and moderate QPS. The trigger to move on is the real-time dispatch loop: when sub-millisecond proximity reads at high write rates start to strain a single Postgres node.

What triggers the next iteration

Per-query latency too high for a tight dispatch loop
High-frequency location writes contend with reads
A single index strains past tens of millions of points

v2 — Redis GEO hot tier in front of PostGIS

Sub-millisecond dispatch reads while keeping durable truth.

Keep PostGIS as truth and add a Redis GEO hot tier. Driver pings update both — Postgres for durability, Redis for the fast path. Dispatch runs GEOSEARCH against Redis for sub-millisecond candidates and only touches Postgres on a Redis miss/outage.

Architecture diagram· Dispatch: spatial recall → business filter → rank

Spatial index returns nearby candidates; business rules (availability, vehicle, ETA) filter; a scorer ranks the survivors.

The hot tier is a rebuildable cache: if Redis dies, dispatch degrades to PostGIS latency and Redis is repopulated from truth. Add TTL/last_seen so stale positions expire automatically.

What triggers the next iteration

A single Redis instance fills as the fleet grows (~200 MB/1M points)
Dense regions concentrate load on one instance
Dual-writing truth + cache must stay consistent on failure

v3 — shard the hot tier by region

Keep each shard small, hot, and independently scalable.

Partition the Redis GEO tier by city/region so each shard holds one metro's points and dispatch routes a request to the shard for its pickup location. This keeps shards small, isolates a busy city from a quiet one, and lets you scale hot regions independently.

Architecture diagram· Shard the hot tier by region

A single global index outgrows a node; partition by city/region so each Redis shard stays small and dispatch routes by pickup location.

Now you must handle shard-boundary queries: a pickup near a region edge searches the home shard and the adjacent one, then merges and distance-filters — the same neighbour-ring idea, one level up.

What triggers the next iteration

Pickups near region edges must query adjacent shards
A dense downtown cell is still a hot spot within its shard
Routing logic must map coordinates → correct shard

v4 — H3/S2 cell index for planet scale

Store-agnostic, horizontally-sharded indexing with hot-cell control.

Move the hot path to H3/S2 cells over a distributed KV store. Each point's cell id is its shard key; proximity is kRing neighbour expansion plus a true-distance filter. Resolution is tuned per use case, and dense "hot cells" (stadium exits, airports) can be split to a finer resolution to spread load.

Architecture diagram· H3 resolution is a fan-out vs precision dial

Coarse cells mean fewer cells to query but more candidates to filter; fine cells mean precise candidates but a wider neighbour ring.

This is Uber-scale territory. The cell grid also doubles as the substrate for supply/demand analytics and surge zones, since uniform cells aggregate cleanly. The cost is owning all the cell math yourself.

What triggers the next iteration

Hot cells (events, airports) still need finer-resolution splits
Resolution choice trades neighbour-ring width vs candidate count
All cell/ring/filter logic lives in application code

Deep dives

Why two B-trees are not a spatial index

Architecture diagram· Why two B-trees are not a spatial index

A lat/lng BETWEEN query fetches everything in a latitude band ∪ a longitude band and intersects — a huge candidate set far from the point.

The most common wrong answer is WHERE lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?, and understanding exactly why it fails is what separates a real answer.

Architecture diagram· Why two B-trees are not a spatial index

A lat/lng BETWEEN query fetches everything in a latitude band ∪ a longitude band and intersects — a huge candidate set far from the point.

A B-tree on lat can efficiently return every row inside a latitude band; a B-tree on lng can return every row inside a longitude band. But these are two independent one-dimensional lookups. The query planner fetches one band and the other and intersects them — and the intersection of "everyone at this latitude" with "everyone at this longitude" is a candidate set proportional to the bands, not to the answer. In a dense city, a latitude band alone can contain hundreds of thousands of points stretching east-west across the whole map; you pay to fetch and intersect all of them to find the few actually near your point.

A true spatial index is inherently two-dimensional: an R-tree partitions the plane into nested bounding boxes so a query descends only into boxes that overlap the search region; a grid index (geohash/S2/H3) maps the 2-D point to a single locality-preserving cell id so nearby points share a key prefix. Either way the work is proportional to the number of points near the query, not to a band across the entire dataset. That's the whole game: collapse two dimensions into one index that preserves locality.

Cell boundaries: the classic trap (and the fix)

Architecture diagram· Query the cell plus its 8 neighbours

Two points near a cell edge have different cell ids. Always search the target cell and its ring, then filter by true distance.

Every grid-based index — geohash, S2, H3 — has the same Achilles' heel: two points can be metres apart yet fall in different cells, so a single-cell lookup silently misses near neighbours sitting just across an edge.

Architecture diagram· Query the cell plus its 8 neighbours

Two points near a cell edge have different cell ids. Always search the target cell and its ring, then filter by true distance.

The fix is universal and you should state it unprompted: query the target cell plus its ring of neighbours, then filter by the true distance. For geohash that's the 8 adjacent cells; for H3 it's kRing(cell, k); for S2 it's the neighbouring cells at your level. After gathering candidates from all those cells, compute the actual Haversine (great-circle) distance and drop anything beyond R.

For nearest-K the logic generalises to a ring-expansion: start at the target cell, expand outward ring by ring, maintaining a running top-K heap, and stop only when the nearest possible point in any unsearched ring is farther than your current K-th result. For a large radius you may need to query many rings — at which point a coarser resolution (bigger cells, fewer of them) is the better trade. This is precisely why resolution selection matters: it sets how many cells a given radius touches.

Separate durable truth from the rebuildable hot tier

Architecture diagram· Durable truth + Redis GEO hot tier

PostGIS holds durable state; a per-region Redis GEO tier serves sub-millisecond proximity reads; dispatch falls back to PostGIS if Redis is gone.

A production location system has two very different jobs: remember where everything is (durable, analytical, recoverable) and answer "who's near me right now" in under a millisecond (hot, volatile, high-write). Trying to do both in one store is the architectural mistake.

Architecture diagram· Durable truth + Redis GEO hot tier

PostGIS holds durable state; a per-region Redis GEO tier serves sub-millisecond proximity reads; dispatch falls back to PostGIS if Redis is gone.

So you layer them. PostGIS (or any durable spatial store) is the truth: every driver ping is persisted for recovery, billing, trip history, and analytics. Redis GEO is the hot tier: a derived, in-memory index that serves the dispatch loop and can be thrown away and rebuilt from truth at any time. Dispatch reads Redis; if Redis is unavailable, it falls back to PostGIS at higher latency rather than failing — graceful degradation, not an outage.

The discipline that makes this safe: the hot tier must be reconstructible from truth, writes go to truth first (or at least durably), and stale entries age out via TTL/last_seen so a crashed client's last position doesn't haunt dispatch forever. The same two-tier shape recurs everywhere in this curriculum — a durable system of record behind a fast derived view — and naming it explicitly signals senior thinking.

Dense areas create hot partitions — and average QPS hides them

Architecture diagram· Shard the hot tier by region

A single global index outgrows a node; partition by city/region so each Redis shard stays small and dispatch routes by pickup location.

Geospatial load is brutally non-uniform. A stadium at closing time, an airport arrivals curb, or a downtown core at rush hour concentrates thousands of drivers and riders into a handful of cells. Your average global QPS looks fine while one cell is melting.

Architecture diagram· Shard the hot tier by region

A single global index outgrows a node; partition by city/region so each Redis shard stays small and dispatch routes by pickup location.

The mitigations stack:

Shard by region so a busy metro is isolated from quiet ones and can scale independently.
Split hot cells to a finer resolution dynamically, so the load of one dense area spreads across more partitions instead of one.
Cache popular viewport queries (the map everyone is staring at) with a short TTL, since many users request the same area simultaneously.
Track per-cell query and update rates, not just global averages — the metric that reveals the hot spot is the per-partition one.

There's also a demand-side angle: those same dense cells are where surge pricing and supply-positioning algorithms run, because the uniform cell grid is a natural substrate for aggregating supply and demand. So the cell index isn't only a lookup structure — at scale it's also the coordinate system for the marketplace's balancing logic.

Freshness, staleness, and the heartbeat trade-off

Architecture diagram· Location freshness via TTL + last_seen

Each location carries a last_seen timestamp; dispatch excludes stale drivers; heartbeat cadence trades battery for accuracy.

Moving objects make location data perishable. A driver's position from 40 seconds ago is worse than useless for dispatch — it can send a rider to where the car was. So freshness is a first-class design concern, not an afterthought.

Architecture diagram· Location freshness via TTL + last_seen

Each location carries a last_seen timestamp; dispatch excludes stale drivers; heartbeat cadence trades battery for accuracy.

Every stored position carries a `last_seen` timestamp (and often a TTL so it auto-expires). Dispatch filters out anyone whose last_seen is older than a threshold (say, 10 seconds) so it only ever offers genuinely-present drivers. The tension is the heartbeat cadence: pinging every second gives near-perfect accuracy but drains battery and floods the write path with millions of GEOADDs; pinging every 30 seconds is cheap but lets dispatch act on stale data. Production systems tune this adaptively — faster heartbeats when a driver is on a trip or near a likely match, slower when idle in a quiet area.

This also shapes the write architecture: at millions of moving objects, the location-update stream is a high-throughput firehose in its own right. It's common to ingest pings through a queue or stream (so a burst doesn't overwhelm the store), debounce/aggregate them, and only materialise the latest position into the hot tier — turning a write-heavy firehose into a manageable "latest known position per object" view.

Proximity is recall — dispatch is recall + filter + rank

Architecture diagram· Dispatch: spatial recall → business filter → rank

Spatial index returns nearby candidates; business rules (availability, vehicle, ETA) filter; a scorer ranks the survivors.

"Find drivers within 2 km" is only the first stage of a real matching system, and strong candidates make the distinction explicit: the spatial index provides recall, but the actual decision adds filtering and ranking.

Architecture diagram· Dispatch: spatial recall → business filter → rank

Spatial index returns nearby candidates; business rules (availability, vehicle, ETA) filter; a scorer ranks the survivors.

1Spatial recall. GEOSEARCH (or ST_DWithin/kRing) returns every driver inside the radius — a cheap, index-backed candidate set, typically over-fetched a little so the next stages have room to filter.
2Business filter. Drop anyone who isn't available, doesn't match the requested vehicle type, is stale (old last_seen), or is already on a trip. This is a yes/no pass that doesn't care about ranking.
3Rank. Score the survivors by what actually matters — usually ETA (which accounts for road network and traffic, not just straight-line distance), plus driver rating, acceptance likelihood, and marketplace fairness — and offer the trip to the best one.

The crucial insight is that geometric distance is a proxy, not the objective: a driver 1.5 km away across a river with no nearby bridge has a worse ETA than one 2 km away on a direct road. So the spatial index deliberately returns a generous candidate set, and the ranking stage applies the expensive, road-aware ETA model to just those candidates — the same cheap-recall-then-expensive-rank shape you see in search. Keeping these stages separate keeps the hot spatial query fast while letting the matching logic be as smart as it needs to be.

Decision levers

Primary / truth store

PostGIS if you're on Postgres (durable, polygons, transactional). MongoDB 2dsphere if you're on Mongo. For pure high-scale location workloads, a durable truth store behind a Redis-first or H3-cell hot path.

Hot-path tier

Redis GEO for sub-millisecond reads (~200 MB per 1M points). Shard by city/region to keep each shard small and isolate dense areas. The hot tier is a rebuildable cache, never the truth.

Index family

Tree-based (PostGIS GiST/R-tree) for rich geometry on one node. Cell-based (geohash/S2/H3) for store-agnostic, horizontally-sharded, planet-scale indexing. Both require a true-distance filter after the coarse query.

Query shape

Radius (ST_DWithin / GEOSEARCH BYRADIUS) and nearest-K (<-> / ring expansion) for dispatch; bounding box for map viewports; polygon/containment for geofencing. Pick the index that serves your dominant shape.

Resolution (cell systems)

Coarse cells = fewer cells queried but more candidates to filter; fine cells = precise candidates but a wider neighbour ring. Tune by query radius and local density; split hot cells to finer resolution to spread load.

Freshness

Attach last_seen/TTL to every position; exclude stale objects from dispatch. Tune heartbeat cadence (battery + write load vs accuracy), adaptively faster on-trip and slower when idle. Ingest the ping firehose through a stream at scale.

Scale tier

< ~10M points: single PostGIS index. 10–100M: sharded PostGIS + Redis GEO hot tier. 100M+ with dispatch QPS: H3/S2 cells over a distributed store (Uber territory).

Failure modes

BETWEEN on lat/lng

Two independent B-trees intersected — fetches whole latitude/longitude bands and dies past ~10k rows, with cost proportional to the bands, not the answer. Always use a real spatial index.

Querying one cell without neighbours

Two points metres apart across a cell edge have different cell ids, so a single-cell lookup silently misses them. Query the target cell plus its neighbour ring, then filter by true distance.

Skipping the true-distance filter

Cells and bounding boxes are approximations — a corner of the queried region can be well beyond R. Always compute the actual Haversine distance and drop candidates outside the radius.

geometry vs geography confusionAdvanced

PostGIS geometry uses planar math (wrong for lat/lng over distance); geography uses geodesic math (correct). Mixing them silently returns wrong distances.

Treating the hot tier as truthAdvanced

Redis GEO is a derived, volatile cache. If it's the only copy, a flush loses every location. Persist to a durable store and make the hot tier rebuildable, with PostGIS fallback on outage.

One giant global indexAdvanced

100M+ points in a single index strains the node and concentrates dense-area load. Shard by region and split hot cells to finer resolution.

Ignoring stalenessAdvanced

Dispatching on a 40-second-old position sends riders to where a driver was. Attach last_seen/TTL, exclude stale objects, and tune heartbeat cadence.

Case studies

Uber

Uber — H3 hexagonal grid as the coordinate system for the marketplace

Uber built and open-sourced H3, a hierarchical hexagonal geospatial index, because the obvious alternatives didn't fit a planet-scale, real-time marketplace. Latitude/longitude with range queries doesn't scale; square grids (and geohash) have the awkward property that a cell's edge neighbours and corner neighbours are at different distances, which complicates "expanding ring" searches and smoothing.

Hexagons solve this: every one of a hexagon's six neighbours is equidistant from the centre, so ring-based neighbour expansion (kRing) and spatial smoothing are uniform. H3 is hierarchical — each cell subdivides into finer children — so Uber can pick a resolution per task: coarse cells for city-level supply/demand analytics, fine cells for precise dispatch.

Critically, H3 is more than a lookup structure for Uber — it's the coordinate system for the entire marketplace. Supply and demand are bucketed into cells to compute surge pricing, position idle drivers toward predicted demand, and visualise the network. Because cells are uniform and store-agnostic (a cell id is just an integer key), the same index sharded across a distributed store serves both the sub-second dispatch query and the batch analytics that balance the market.

The lesson for an interview: at planet scale the spatial index isn't a detail bolted onto a database — it's the foundational coordinate system, and the choice of cell shape and resolution has consequences far beyond "find nearby."

Takeaway: At planet scale the cell grid (H3) becomes the marketplace's coordinate system — hexagons give uniform neighbours for clean ring expansion, hierarchy lets you pick resolution per task, and integer cell ids shard across any store.

Google

Google S2 — projecting the sphere onto a quadtree of cells

Google's S2 library underpins geospatial features across Google Maps, and its core idea is elegant: project the sphere onto the six faces of a cube, then recursively subdivide each face into a quadtree of cells, numbering them along a Hilbert space-filling curve. The Hilbert curve is the trick — it maps 2-D cells to 1-D 64-bit integer ids such that cells close in space stay close in id, so a spatial range becomes a small number of contiguous integer ranges.

That property is what makes S2 so practical: because each cell is a single sortable integer that preserves locality, you can index points in an ordinary B-tree or any key-value store and answer "points near here" as a handful of integer range scans — no specialised spatial datastore required. S2 cells are hierarchical (30 levels, from the whole face down to roughly a square centimetre) so you choose resolution by use case, and the library provides the neighbour, covering, and containment operations a region query needs.

S2 covers an arbitrary region (a circle, a polygon) by computing the minimal set of cells that tile it, then querying those cell-id ranges — the covering plus a true-distance filter. The takeaway parallels H3: reduce 2-D proximity to 1-D locality-preserving keys so commodity stores can serve spatial queries, and let resolution be a tunable dial.

Takeaway: S2 projects the sphere to a Hilbert-curve-numbered quadtree so each cell is a locality-preserving 64-bit integer — spatial queries become integer range scans any B-tree or KV store can serve, with resolution as a tunable dial.

Tinder

Tinder — geosharding recommendations by location

Tinder's core query is intensely geospatial: show a user a stack of nearby, eligible profiles, fast, at enormous scale. Their engineering team described moving to geosharding — partitioning the recommendation index by geographic region rather than by a random user-id hash.

The motivation is locality of access: a user in a given city overwhelmingly needs candidates from that same region, so co-locating each region's profiles on the same shard means a recommendation query hits one (or a few adjacent) shards instead of fanning out across the whole fleet. They map regions using a geohash/S2-style cell scheme and assign cells to shards, balancing shard sizes against population density — a dense metropolitan cell may be its own shard while a sparse rural area groups many cells together. This is the exact density-aware sharding this pattern describes: uniform cells, non-uniform population, so shard boundaries follow demand, not geometry.

Boundary handling matters here too — a user near a region edge must draw candidates from neighbouring shards — which is the cell-neighbour problem one level up at the shard granularity. The payoff was dramatically reduced query fan-out and latency: most recommendation requests became local to a shard.

Takeaway: Geosharding co-locates each region's data on the same shard so location-scoped queries stay local instead of fanning out — and shard boundaries must follow population density, with edge users querying neighbouring shards.

Decision table

Geospatial choices depend on query shape, update rate, durability, and point count.

Choice	Best for	Trade-off	Robust answer includes
PostGIS (GiST)	Durable truth + polygons/geofencing	Higher latency than an in-memory tier	geography vs geometry, ST_DWithin, KNN <->
Redis GEO	Sub-ms dispatch reads, high write rate	Derived cache, not durable	Region sharding, TTL/last_seen, rebuild path
H3 / S2 cells	Planet-scale, store-agnostic indexing	You own cell/ring/filter logic	Resolution choice, kRing neighbours, distance filter
Elasticsearch geo	Geo + text + facet discovery	Derived index, heavier writes	DB truth, async indexing, geo_distance filter

Whatever the index, always filter candidates by true (Haversine) distance after the cell/box query.
At scale, separate durable truth (PostGIS) from the rebuildable hot tier (Redis GEO / H3 cells).

Drills

Why does `lat/lng BETWEEN` fail?Reveal

It's two independent B-tree lookups intersected. The latitude B-tree returns everyone in a latitude band, the longitude B-tree everyone in a longitude band, and you fetch and intersect both — so the cost is proportional to the bands (potentially the whole map east-west), not to the handful of points actually near you. A spatial index walks only the cells/boxes overlapping the query region, so work scales with the answer.

Explain the cell-boundary problem and its fix.Reveal

Grid indexes (geohash/S2/H3) assign each point a cell id, but two points metres apart can sit on opposite sides of a cell edge and get different ids — so querying only the target cell silently misses near neighbours. The fix is universal: query the target cell plus its ring of neighbours (the 8 adjacent cells / kRing / S2 neighbours), gather all candidates, then filter by the true Haversine distance to drop anything beyond R.

How do you do nearest-K on a cell index?Reveal

Ring expansion with a running top-K heap. Start at the target cell, search outward ring by ring, keep the K closest seen so far, and stop only when the nearest possible point in any unsearched ring is farther than your current K-th result — at which point no further ring can improve the answer. In PostGIS the ORDER BY location <-> point LIMIT k operator does this natively via the GiST index.

Why separate PostGIS truth from a Redis GEO hot tier?Reveal

Two different jobs: durable, recoverable, analytical storage (PostGIS) versus sub-millisecond high-write dispatch reads (Redis GEO). Redis is a derived, volatile cache — fast but losable — so it can be thrown away and rebuilt from truth, and dispatch falls back to PostGIS (slower but correct) if Redis is down. Making Redis the only copy would mean a flush loses every location.

Why rank dispatch by ETA instead of straight-line distance?Reveal

Geometric distance is a proxy for what actually matters — how fast a driver can reach the rider. A driver 1.5 km away across a river with no nearby bridge has a worse real ETA than one 2 km away on a direct road. So the spatial index returns a generous candidate set by distance (cheap recall), and the ranking stage applies a road- and traffic-aware ETA model to just those candidates (expensive, but bounded).

A downtown cell is overloaded but global QPS looks fine. Diagnose and fix.Reveal

Geospatial load is non-uniform — a dense area concentrates thousands of objects and queries into a few cells while averages look healthy. Diagnose by tracking per-cell query and update rates, not global QPS. Fix by sharding by region to isolate the metro, splitting the hot cell to a finer resolution so its load spreads across more partitions, and caching the popular viewport query with a short TTL since many users request the same area simultaneously.

When to reach for this

Choice

Best for

Trade-off

Robust answer includes

PostGIS (GiST)

Durable truth + polygons/geofencing

Higher latency than an in-memory tier

geography vs geometry, ST_DWithin, KNN <->

Redis GEO

Sub-ms dispatch reads, high write rate

Derived cache, not durable

Region sharding, TTL/last_seen, rebuild path

H3 / S2 cells

Planet-scale, store-agnostic indexing

You own cell/ring/filter logic

Resolution choice, kRing neighbours, distance filter

Elasticsearch geo

Geo + text + facet discovery

Derived index, heavier writes

DB truth, async indexing, geo_distance filter