Pick a problem. Practice or take a mock.

Name: SystemRound

Every problem ships with a checklist of what strong answers cover, a stage-by-stage workspace, and a debrief that points you to the exact concept to study next.

23 problems

easy · System design
45m
Design a URL Shortener (Bit.ly)
Design a URL shortening service like Bit.ly that takes long URLs and creates short, unique aliases. Users should be able to create short URLs, be redirected when visiting them, and optionally track click analytics. The system should handle millions of URLs and redirect requests with minimal latency.
Full checklist + reference answer
hashing · database · caching · scalability
View problem · 5 stages
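
One common way to produce the short, unique aliases described in the URL shortener problem is to encode a numeric ID in base 62. The sketch below assumes IDs come from some unique counter or sequence (not specified by the problem); the alphabet order and the function names `encode_base62` and `decode_base62` are illustrative choices, not part of the reference answer.

```python
import string

# 62 symbols: 0-9, a-z, A-Z (the ordering is an arbitrary assumption)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a short alias."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(alias: str) -> int:
    """Recover the integer ID from an alias produced by encode_base62."""
    n = 0
    for ch in alias:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Seven base-62 characters cover 62^7 (about 3.5 trillion) IDs, which is why short codes of that length are a frequent starting point in this kind of design.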
medium · System design
45m
Design a Rate Limiter
Design a rate limiting system that controls the rate of requests a client can send to an API. The rate limiter should support different rate limiting algorithms, work in a distributed environment, and provide clear feedback to clients when they are throttled. Consider how this would work as both a standalone service and as middleware.
Full checklist + reference answer
distributed-systems · algorithms · caching · middleware
View problem · 5 stages
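
The rate limiter problem asks for support of different algorithms without naming one. As one candidate, here is a minimal single-process token bucket sketch. The class name `TokenBucket` and its parameters are hypothetical, and a distributed version would keep this state in a shared store rather than in memory.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_sec` tokens/second."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float = None):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start full
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: float = None, cost: float = 1.0) -> bool:
        """Return True if the request may proceed, False if it is throttled."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill lazily on each call instead of running a background timer.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The optional `now` argument exists only so the behavior can be exercised deterministically; in normal use the monotonic clock is read on each call.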
medium · System design
45m
Design a Chat System (WhatsApp)
Design a real-time chat application like WhatsApp or Messenger that supports 1:1 and group messaging (up to 100 participants). Users should be able to send text and media messages, receive messages in real time when online, and retrieve undelivered messages when they come back online. The system must handle billions of users with low-latency delivery and guaranteed message durability.
Full checklist + reference answer
real-time · websockets · pub-sub · messaging
View problem · 5 stages
medium · System design
45m
Design a News Feed (Facebook/Twitter)
Design a social media news feed system like Facebook's News Feed or Twitter's Home Timeline. When a user opens the app, they see a personalized feed of posts from people and pages they follow, ranked by relevance. The system must handle billions of users, hundreds of millions of posts per day, and deliver a fresh feed within seconds. The key design decision is how to generate the feed: pre-compute it on write (fan-out-on-write) or assemble it on read (fan-out-on-read).
Full checklist + reference answer
fan-out · caching · social-graph · ranking
View problem · 5 stages
medium · System design · Pro
45m
Design a Notification Service
Design a notification system that can send notifications to users across multiple channels — push notifications (iOS/Android), SMS, and email. The system must handle billions of notifications per day, support user preferences and opt-outs, de-duplicate notifications, and ensure reliable delivery with prioritization. Upstream services (e.g., orders, social, marketing) publish notification events; the notification service handles routing, templating, channel selection, and delivery.
Full checklist + reference answer
messaging · queues · multi-channel · delivery
View problem · 5 stages
medium · System design
45m
Design Search Autocomplete (Typeahead)
Design a search autocomplete (typeahead) system like Google Search suggestions. As a user types each character, the system returns the top 5-10 most relevant search suggestions within 100ms. The system must handle billions of queries per day, rank suggestions by popularity and personalization, and update its suggestion corpus as new queries emerge.
Full checklist + reference answer
trie · caching · ranking · prefix-search
View problem · 5 stages
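
The autocomplete problem's core lookup (prefix in, top suggestions by popularity out) can be sketched with a plain in-memory trie. This is a toy under stated assumptions: the names `Autocomplete`, `record`, and `suggest` are hypothetical, ranking uses raw query counts only, and a production system would typically precompute top-k lists per node rather than walk the subtree on every keystroke.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0  # how many times this exact query was recorded


class Autocomplete:
    def __init__(self):
        self.root = TrieNode()

    def record(self, query: str, n: int = 1) -> None:
        """Register `n` occurrences of a completed query."""
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.count += n

    def suggest(self, prefix: str, k: int = 5) -> list:
        """Return up to k recorded queries starting with prefix, most popular first."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        found = []
        stack = [(node, prefix)]
        while stack:  # depth-first walk of the subtree under the prefix
            current, text = stack.pop()
            if current.count:
                found.append((current.count, text))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Highest count first; ties broken alphabetically for stable output.
        found.sort(key=lambda item: (-item[0], item[1]))
        return [text for _, text in found[:k]]
```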
hard · System design · Pro
45m
Design a Ticket Booking System (Ticketmaster)
Design a ticket booking system like Ticketmaster or BookMyShow for live events (concerts, sports). Users browse events, select seats from a venue map, and complete payment — all while thousands of other users compete for the same seats. The system must prevent double-booking (two users paying for the same seat), handle massive traffic surges when popular events go on sale, and provide a fair booking experience under extreme contention.
Full checklist + reference answer
consistency · locking · contention · payment
View problem · 5 stages
hard · System design · Pro
50m
Design an AI Inference Gateway
Design the inference layer for a B2B platform whose customers embed LLM features in their apps. Each tenant sends millions of prompt requests per day across multiple foundation-model providers. The gateway must route by tenant policy, cache responses safely, fall back across providers on outage, enforce per-tenant cost and rate budgets, stream tokens end-to-end, and keep an auditable log of every call. Cost runaway, provider outages, and noisy-neighbor tenants are the day-1 production failure modes.
Full checklist + reference answer
ai · streaming · caching · multi-tenant
View problem · 5 stages
hard · System design · Pro
50m
Design a Webhook Delivery Service
Design the webhook layer for a B2B platform (think Stripe events, GitHub webhooks). When something happens internally (charge succeeded, PR opened), an event must be delivered to every customer endpoint subscribed to that event type — durably, signed, and with retries on failure. Customers get a dashboard to inspect delivery attempts and replay failed events. Endpoints fail constantly: timeouts, 5xx, slow consumers, customers down for hours. The hard problems are slow-customer isolation, retry-storm protection, and dashboard replay without overwhelming a recovering customer.
Full checklist + reference answer
webhooks · at-least-once · queues · retries
View problem · 5 stages
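
Two requirements in the webhook problem, signed deliveries and retries on failure, can be sketched with HMAC-SHA256 signatures and capped exponential backoff. The signing layout (timestamp, a dot, then the body), the 300-second tolerance, and the function names are all assumptions made for illustration; the problem statement does not prescribe a scheme.

```python
import hashlib
import hmac


def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    """Sign the timestamp together with the body so old payloads cannot be replayed."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: int, body: bytes, signature: str,
           now: int, tolerance: int = 300) -> bool:
    """Receiver-side check: reject stale timestamps, then compare in constant time."""
    if abs(now - timestamp) > tolerance:
        return False
    return hmac.compare_digest(sign(secret, timestamp, body), signature)


def retry_delay(attempt: int, base: float = 1.0, cap: float = 3600.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based), doubling up to a cap."""
    return min(cap, base * 2 ** attempt)
```

In practice random jitter is usually added to the delay so that many failed deliveries do not retry in lockstep, which is one contributor to the retry storms the problem mentions.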
hard · System design · Pro
50m
Design a Coding Practice Platform (LeetCode/HackerRank)
Design a platform where users solve programming problems, submit code, run public samples, get judged against hidden tests, join timed contests, and see rankings. The hard parts are sandboxed execution, queue isolation across languages and tenants, hidden test-case secrecy, contest traffic spikes, leaderboard freshness, plagiarism signals, and trustworthy judging results.
Standard grader · richer checklist coming soon
code-execution · sandboxing · queues · contests
View problem · 5 stages
medium · System design · Pro
45m
Design a Product Launch Platform (Product Hunt)
Design a community platform where makers launch products, users vote and comment, and each day has a ranked launch feed. The hard parts are ranking freshness, vote manipulation, launch-day traffic spikes, moderation, notifications, search/discovery, and making ranks rebuildable when scoring rules change.
Standard grader · richer checklist coming soon
ranking · feeds · voting · moderation
View problem · 5 stages
easy · System design
45m
Design Pastebin (Pastebin.com / GitHub Gist)
Design a pastebin service where users submit a blob of text or code and get back a short, shareable URL. Pastes can be public or private and may carry an optional expiration. Reads dominate writes ~100:1, payloads range from a few kilobytes to several megabytes, and the system must bound its own storage growth — which makes tiered storage, retention, and unguessable IDs the decisions that separate a passing answer from a strong one.
Full checklist + reference answer
blob-storage · caching · retention-ttl · object-storage
View problem · 5 stages
medium · System design · Pro
45m
Design a Feature Flag Platform (LaunchDarkly)
Design a feature-flag and rollout-control platform like LaunchDarkly. Applications ask "is this feature on for this user?" millions of times a second, so the answer has to be instant — which is the whole twist: evaluation happens locally inside the SDK, not as a network call. The hard system-design surface is the control plane: distributing rule changes to ~100M long-lived SDK connections within seconds, bucketing percentage rollouts deterministically so a user never flickers, and staying available for customer apps even when your own service is down.
Full checklist + reference answer
control-plane · streaming · edge-distribution · consistent-hashing
View problem · 5 stages
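
The feature flag problem's requirement that percentage rollouts be bucketed deterministically "so a user never flickers" can be illustrated by hashing the flag key and user ID into a fixed bucket. This is a sketch, not LaunchDarkly's actual algorithm: the choice of SHA-256, the 10,000-bucket granularity, and the names `bucket` and `is_enabled` are assumptions.

```python
import hashlib

BUCKETS = 10_000  # 0.01% rollout granularity (an arbitrary assumption)


def bucket(flag_key: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in [0, BUCKETS)."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def is_enabled(flag_key: str, user_id: str, rollout_percent: float) -> bool:
    """Evaluate locally, with no network call: same inputs always give the same answer."""
    return bucket(flag_key, user_id) < rollout_percent * (BUCKETS / 100)
```

Because a user's bucket never changes, raising the rollout from 10% to 20% only adds users; nobody who already had the feature loses it. Including the flag key in the hash keeps different flags from always selecting the same users.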
hard · System design · Pro
45m
Design a Payments Ledger (double-entry)
Design the core money ledger behind a payments platform. Every customer, merchant, fee pool, and FX reserve is an account; money moves between them as transfers. The catch is that this is money — you can never lose a cent, never create one, never double-post a retried request, and you must be able to prove years later exactly how any balance came to be. The naive "UPDATE accounts SET balance = balance - amount" is the trap; the real design is an immutable double-entry journal where balances are derived, postings are atomic and idempotent, and the books always reconcile to zero.
Full checklist + reference answer
ledger · double-entry · consistency · idempotency
Preview problem · 5 stages
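
The ledger problem contrasts a mutable balance column with an immutable double-entry journal where balances are derived and postings are idempotent. The in-memory sketch below shows those three properties only; the names `Ledger`, `Posting`, and `transfer_id` are hypothetical, amounts are integers in minor units, and a real design would need durable storage and atomic commits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    account: str
    amount: int  # minor units (e.g. cents); negative = debit, positive = credit


class Ledger:
    def __init__(self):
        self._journal = []   # append-only: entries are never updated or deleted
        self._seen = set()   # transfer IDs already posted (idempotency keys)

    def post(self, transfer_id: str, postings: list) -> bool:
        """Append a transfer. Returns False if this ID was already posted."""
        if transfer_id in self._seen:
            return False  # a retried request is a no-op, not a second posting
        if sum(p.amount for p in postings) != 0:
            raise ValueError("postings of a transfer must sum to zero")
        self._journal.append((transfer_id, tuple(postings)))
        self._seen.add(transfer_id)
        return True

    def balance(self, account: str) -> int:
        """Balances are derived by replaying the journal, never stored and mutated."""
        return sum(p.amount for _, ps in self._journal for p in ps if p.account == account)

    def total(self) -> int:
        """Sum over all accounts; stays zero as long as every transfer balanced."""
        return sum(p.amount for _, ps in self._journal for p in ps)
```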
hard · System design · Pro
45m
Design a Collaborative Document Editor (Google Docs)
Design the editing backbone of a Google Docs–style product: many people open the same document and type at the same time. Each editor sees their own keystrokes instantly and everyone else’s within a fraction of a second, with live cursors and presence. The defining challenge is convergence — concurrent edits must never be silently lost, and every client must end up with an identical document. That forces a real conflict-resolution model (OT or CRDT), a server-assigned total order per document, and an append-only operation log with periodic snapshots, rather than last-write-wins on a document blob.
Full checklist + reference answer
real-time · operational-transformation · crdt · websockets
View problem · 5 stages
hard · System design · Pro
45m
Design a Content Moderation Pipeline (Instagram / TikTok)
Design the moderation backbone of a large UGC platform: ~500M pieces of content a day — images, video, and text — must be screened for policy violations (CSAM, nudity, violence, hate, spam) before causing harm. The defining challenge is that you can neither run every model synchronously on upload (it would add seconds of latency and collapse under peak) nor have humans review everything (15K reviewers can touch ~1–2% of the firehose). The right shape is a multi-stage pipeline: a fast synchronous known-bad hash gate that hard-blocks the worst content before publish, an asynchronous ML classifier fan-out over optimistically-published content that takes things down when flagged, and a prioritized human-review queue ranked by severity × reach.
Full checklist + reference answer
async-pipeline · ml-inference · perceptual-hashing · priority-queue
View problem · 5 stages
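
The last stage of the moderation pipeline, a human-review queue ranked by severity × reach, maps naturally onto a heap. The sketch below assumes severity is a classifier score in [0, 1] and reach is a view or follower count; both interpretations, and the name `ReviewQueue`, are illustrative assumptions.

```python
import heapq
import itertools


class ReviewQueue:
    """Hands reviewers the item with the highest severity * reach first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: earlier items first

    def push(self, content_id: str, severity: float, reach: int) -> None:
        score = severity * reach
        # heapq is a min-heap, so negate the score to pop the largest first.
        heapq.heappush(self._heap, (-score, next(self._order), content_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

With this scoring, a moderately suspicious post seen by many people can outrank a highly suspicious post seen by almost no one, which is the trade-off the severity × reach product encodes.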
hard · System design · Pro
45m
Design a RAG Pipeline (AI Answer Engine)
Design a Retrieval-Augmented Generation system that answers natural-language questions over a large private corpus. The system ingests and indexes documents, retrieves the most relevant passages for each question, and has an LLM generate a grounded, cited answer. It must stay fresh as documents change, keep answers faithful to the corpus, control cost, and isolate thousands of tenants — at 50M documents, 500M vectors, and 5K queries per second.
Full checklist + reference answer
rag · vector-search · embeddings · llm
View problem · 5 stages
hard · System design · Pro
45m
Design a Twitter/X Timeline (Home Feed)
Design the home timeline for a social network like Twitter/X. Users follow hundreds of accounts and, on opening the app, should instantly see a fresh feed of those accounts' recent posts. The hard part is fan-out: a post from an account with tens of millions of followers must reach all of their timelines without melting the system or slowing everyone else down. Scale: 500M DAU, 2B users, 400M tweets/day, sub-200ms timeline loads.
Full checklist + reference answer
fan-out · timeline · social-graph · caching
View problem · 5 stages
hard · System design · Pro
45m
Design Ride Dispatch (Uber/Lyft)
Design the dispatch core of a ride-hailing service. Drivers stream GPS continuously; a rider requests a ride and the system finds nearby available drivers, matches one, and tracks the trip. The hard parts are the firehose of location updates (~1M+/sec) and the geospatial query — given a point, find the closest available drivers in milliseconds — at city scale, in dense areas, without ever handing the same driver to two riders.
Full checklist + reference answer
geospatial · matching · real-time · quadtree-geohash
View problem · 5 stages
hard · System design · Pro
45m
Design Video Streaming (YouTube)
Design the video platform behind YouTube. A creator uploads a large, possibly flaky video file; the system ingests it reliably, transcodes it into multiple adaptive-bitrate renditions, and streams it smoothly to billions of viewers worldwide. The two hard parts are the offline transcoding pipeline and serving an enormous read volume of video bytes cheaply — overwhelmingly a CDN and storage problem. Scale: 2B users, ~500 hours uploaded/min, ~1B watch-hours/day.
Full checklist + reference answer
video · transcoding · cdn · blob-storage
View problem · 5 stages
hard · System design · Pro
45m
Design a Distributed Job Scheduler
Design a distributed job scheduler (cron-as-a-service). Users submit one-time and recurring jobs that must fire close to their scheduled time and execute via a worker fleet — without being lost, without silently double-running, and surviving worker and scheduler crashes. The hard parts are finding which jobs are due efficiently at scale and guaranteeing once-ish execution despite failures and round-time spikes when millions of jobs are all scheduled for the top of the hour. Scale: 100M+ jobs, ~10K submissions/sec.
Full checklist + reference answer
scheduling · at-least-once · queue · leases
View problem · 5 stages
hard · System design · Pro
45m
Design an Ad Click Aggregator
Design the analytics backbone for an ad network. Every ad click generates an event; the system must ingest these at enormous volume, aggregate them by ad over time windows, and serve the results both in near-real-time for dashboards and accurately for billing. The hard parts are the write firehose (millions of events/sec), getting counts right despite duplicate and out-of-order events, and answering aggregation queries fast without scanning raw events. Scale: billions of clicks/day, ~1M events/sec.
Full checklist + reference answer
stream-processing · aggregation · event-log · olap
View problem · 5 stages
hard · System design · Pro
45m
Design a Distributed Cache (Redis Cluster / DynamoDB)
Design a distributed in-memory key-value store / cache (like Redis Cluster, Memcached, or DynamoDB's storage layer) that partitions data across many nodes so the dataset exceeds one machine's memory, replicates each partition for availability, and stays fast (single-digit-millisecond) while surviving node failures and rebalancing as the cluster grows. The hard parts are choosing a partitioning scheme that doesn't reshuffle the world when a node joins, a replication + consistency model that trades off latency vs durability, and handling hot keys that overwhelm a single node.
Full checklist + reference answer
consistent-hashing · replication · partitioning · consistency
View problem · 5 stages
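
The distributed cache problem asks for a partitioning scheme that "doesn't reshuffle the world when a node joins". Consistent hashing with virtual nodes is the usual candidate (and the problem's own tag); the sketch below is a minimal in-memory ring. The class name `HashRing`, the 100 virtual nodes per server, and the use of SHA-256 are assumptions for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        """Return the node owning the first ring point at or after the key's hash."""
        if not self._ring:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[i % len(self._ring)][1]  # wrap around past the end
```

When a node is added, the only keys that change owner are those that now fall just before one of the new node's points, so they all move to the new node and everything else stays put. Replication is often layered on top by also storing each key on the next few distinct nodes clockwise.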