Intermediate · Deep dive

Protocol choice

REST, gRPC, GraphQL, WebSocket, SSE, and async protocols by caller and workload.

~15 min read

REST for humans, gRPC for services, GraphQL for views, async for anything that takes more than a second. Most candidates default to "REST everywhere" — that's fine until it isn't, and the interviewer will find the seam.

Read this if your last attempt…

You picked REST without considering the alternatives
You can't explain when gRPC wins and when REST wins
You'd use GraphQL "because it's modern"
You don't have an async escape hatch for work > 1s

The concept

Protocol choice is a fit question, not a fashion one. Match the protocol to:

Who the consumer is: browsers default to HTTP/JSON (REST or GraphQL); services can speak any protocol; mobile apps often want GraphQL to reduce round-trips on low-bandwidth connections.
How the payload evolves: schema-typed protocols (gRPC, GraphQL) enforce compatibility checks; schemaless JSON is flexible but relies on discipline.
How latency-sensitive the path is: protobuf is commonly quoted as 5–10× cheaper to encode/decode than JSON (the exact factor depends on payload and library), and gRPC over HTTP/2 adds multiplexing and streaming; REST is slower but debuggable with curl.
Whether the work fits in the response time: any operation over ~1s should be async (enqueue → return job id → poll or notify).
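The four criteria can be collapsed into a rule-of-thumb function. The sketch below is a minimal illustration, not a decision procedure: the `Hop` type, its field names, and the order of the checks are assumptions made for this example; only the ~1 s async threshold comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    """One caller -> callee edge in the design (hypothetical type for this sketch)."""
    consumer: str                   # "browser", "mobile", "service", or "partner"
    expected_seconds: float         # rough worst-case time to finish the work
    divergent_views: bool = False   # do several clients need different shapes?


def pick_protocol(hop: Hop) -> str:
    """Rule-of-thumb mapping; a starting point to defend, not a verdict."""
    if hop.expected_seconds > 1.0:
        return "async"      # enqueue, return a job id, notify later
    if hop.consumer == "service":
        return "grpc"       # cluster-internal, latency-sensitive
    if hop.divergent_views:
        return "graphql"    # many clients, different view shapes
    return "rest"           # the default when nothing argues otherwise
```

The order of the checks matters: slow work goes async regardless of who is calling, which mirrors the "escape hatch" framing used throughout this page.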

Architecture diagram · Pick by consumer type and payload shape

REST for browsers and humans; gRPC for service-to-service; GraphQL when clients have vastly different view needs; async for anything slow.

Protocol fit at a glance.

Dimension	REST/JSON	gRPC	GraphQL	Async
Primary consumer	Browsers, curl-able clients	Services in a cluster	Mobile + web with divergent views	Any consumer (eventually)
Latency overhead	Medium (text parsing)	Low (binary protobuf + HTTP/2)	Medium + resolver cost	N/A (async)
Schema enforcement	OpenAPI (external discipline)	Strong (protobuf, breaking-change tooling)	Strong (SDL + introspection)	Whatever you pick on the wire
Caching	HTTP caching at CDN/LB	Per-call, no HTTP cache	Hard (persisted queries help)	Client/session-specific
Streaming	SSE, chunked responses	Native bidirectional	Subscriptions (over WS)	Native
Best for	Public APIs, browser apps	Low-latency internal calls	Unifying many client views	Long work, push, events

How interviewers grade this

You name the protocol per hop (client → edge: REST; service → service: gRPC) and justify each.
You call out schema evolution explicitly (protobuf/GraphQL SDL for typed, OpenAPI + semver for REST).
You have an async escape hatch for long work and name the completion channel (polling, webhook, SSE, WebSocket).
You account for real-world pain — gRPC from browsers, GraphQL's caching issues, REST's under-/over-fetching.
You state the payload shape and size budget (compressed JSON vs protobuf vs unstructured blob).

Variants

REST / HTTP-JSON default

HTTP verbs + JSON bodies. Debuggable, cacheable, boring — in the good sense.

The honest default for 80% of designs. Everyone has tooling; every language has a client; every proxy understands it. Pair with OpenAPI for schema discipline and cache-control headers for edge caching.

Pros

+Universal tooling, trivial to debug
+HTTP caching at LB/CDN works out of the box
+Clean mapping to CRUD

Cons

−Verbose payload (text JSON)
−No native bidirectional streaming (server→client only, via SSE or chunked responses)
−Schema discipline requires external effort

Choose this variant when

Public-facing APIs
Anything a browser talks to directly
Default pick when no constraint argues otherwise

gRPC for service-to-service

Protobuf + HTTP/2 + code-gen. Binary, fast, strongly typed.

Where REST is verbose and untyped, gRPC is compact and contractual. Code generation means calling a service is like calling a local method; breaking-change tooling catches incompatible schema edits before they ship.

Pros

+5–10× encode/decode speedup over JSON
+Bidirectional streaming is native
+Strong schema enforcement via protobuf + buf/protolock

Cons

−Awful from browsers (needs gRPC-Web gateway)
−Not curl-able — needs grpcurl or BloomRPC
−Steeper operational learning curve

Choose this variant when

High-QPS internal RPC
Mesh or sidecar setups
Latency-sensitive inter-service paths

GraphQL gateway

One endpoint; client asks for the exact shape it wants.

Use when multiple client types (web, iOS, Android, watch) need overlapping but distinct views of the same graph. Hides a fan-out across many backends behind one request. Pair with persisted queries to recover cacheability.

Pros

+Eliminates under-/over-fetching
+Strong schema with introspection
+Unifies many backends into one surface

Cons

−Hard to cache (each query is unique unless persisted)
−N+1 resolver trap without DataLoader
−Field-level authorization is subtle

Choose this variant when

Multiple distinct clients over shared data
Mobile-heavy products where round-trips hurt
BFF (backend-for-frontend) patterns

Worked example

Scenario: Designing a ride-sharing app (rider app, driver app, dispatch service, billing service, analytics).

Client ↔ our system (rider & driver mobile apps):

GraphQL gateway. Rider app and driver app share 70% of the schema (rides, users, locations) but need different shapes. GraphQL lets one endpoint serve both without duplicate REST endpoints.
Persisted queries: clients compile queries to hashes at build time; server only executes hashes on the allowlist. Gets us back the CDN-cacheability we lost and locks out arbitrary-query abuse.
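A minimal sketch of the persisted-query allowlist described above, assuming SHA-256 as the hash and an in-memory dict as the store. The names `register` and `resolve` are hypothetical and do not belong to any GraphQL library.

```python
import hashlib

# Hypothetical allowlist, filled at build/deploy time: hash -> query text.
ALLOWED_QUERIES: dict = {}


def register(query: str) -> str:
    """Build-time step: store the query under its SHA-256 hash and return the hash."""
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    ALLOWED_QUERIES[digest] = query
    return digest


def resolve(query_hash: str) -> str:
    """Request-time step: only hashes on the allowlist map back to a query."""
    try:
        return ALLOWED_QUERIES[query_hash]
    except KeyError:
        raise PermissionError("unknown persisted query") from None


ride_status_hash = register("query { ride(id: 1) { status } }")
```

Because the client sends only a short, stable hash, the request can be an ordinary GET that a CDN can key on, and arbitrary query text never reaches the executor.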

Service ↔ service (dispatch ↔ pricing ↔ billing):

gRPC. Dispatch makes 5–10 internal calls per ride assignment; every millisecond matters. Protobuf saves ~5× encode/decode vs JSON; HTTP/2 multiplexing reuses connections.

Real-time driver location → rider app:

WebSocket stream from the rider's connection through the gateway to a location-stream service. SSE would work too; WebSocket is bidirectional which helps when the rider starts a chat.

Slow / variable-latency ops (end-of-ride billing, receipt email, loyalty-points calculation):

Queue. Ride completes → publish ride.completed. Billing, email, loyalty consumers each subscribe independently. Rider app doesn't wait for any of these.
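The fan-out can be sketched with an in-memory stand-in for the broker. This is an illustration under stated assumptions: a real queue delivers asynchronously and durably, whereas this `EventBus` class (a hypothetical name) calls handlers inline. What it does show is that each consumer is independent, so one failing handler does not block the others.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


class EventBus:
    """In-memory stand-in for a message broker (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)
        self.failures: List[Tuple[str, Exception]] = []

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Each consumer is isolated: a failure is recorded, not propagated.
        for handler in self._subscribers[topic]:
            try:
                handler(event)
            except Exception as exc:
                self.failures.append((topic, exc))


billed: List[str] = []
emailed: List[str] = []


def broken_loyalty(event: dict) -> None:
    raise RuntimeError("loyalty service down")


bus = EventBus()
bus.subscribe("ride.completed", lambda e: billed.append(e["ride_id"]))
bus.subscribe("ride.completed", broken_loyalty)
bus.subscribe("ride.completed", lambda e: emailed.append(e["ride_id"]))
bus.publish("ride.completed", {"ride_id": "r-42"})
```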

Payment (external partner):

REST + idempotency key. The partner's API is REST; we honour it. Schema evolution relies on their OpenAPI.
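A minimal sketch of what an idempotency key buys, modelled from the receiving side. The `PaymentGateway` class and its fields are hypothetical and do not represent any real partner's API; the point is only that a retried request with the same key returns the first result instead of charging twice.

```python
from typing import Dict


class PaymentGateway:
    """Toy model of a partner that honours idempotency keys (hypothetical)."""

    def __init__(self) -> None:
        self._results: Dict[str, dict] = {}
        self.charges_made = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A replayed key returns the stored result; no second charge happens.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"charge_id": f"ch_{self.charges_made}", "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


gateway = PaymentGateway()
first = gateway.charge("ride-42-payment", 1850)
retry = gateway.charge("ride-42-payment", 1850)  # e.g. after a client timeout
```

A natural key here is something derived from the ride itself, so that every retry of the same payment carries the same value.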

The one-line justification per hop:

"GraphQL at the app edge because rider and driver share 70% of the graph with different view shapes."
"gRPC between dispatch and pricing because p99 matters and both are in our cluster."
"WebSocket for driver locations because push + bidirectional."
"Queue for billing because it can take 2–5 s and the rider shouldn't wait."
"REST for the external payment partner — their contract dictates ours."

Good vs bad answer

Interviewer probe

“Your services talk to each other 50 times per user request. REST or gRPC?”

Weak answer

"REST, because it's the standard."

Strong answer

"gRPC. At 50 calls per request with REST/JSON, you'd pay ~1–3 ms per call on encode + parse alone: that's 50–150 ms of pure serialization overhead, before any network or business logic. Protobuf is 5–10× cheaper to encode and decode, with smaller payloads on the wire, and HTTP/2 multiplexes all 50 calls over one connection. I'd keep REST at the public edge for browser clients and schema visibility, but service-to-service is firmly gRPC. The trade-off is worse debuggability (grpcurl is fine, but it isn't curl), so we invest in service-mesh tracing and a gRPC reflection endpoint in non-prod."

Why it wins: Quantifies the cost (ms per call × call count), names the specific gains (protobuf + HTTP/2 multiplexing), picks the right protocol per layer (edge vs internal), and acknowledges the cost (debuggability).
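The arithmetic behind the strong answer can be checked in a few lines. The inputs are the answer's own figures (50 calls, ~1–3 ms per call, a 5–10× speedup); they are illustrative numbers, not measurements, and the function name is made up for this sketch.

```python
from typing import Tuple


def serialization_overhead_ms(
    calls: int, per_call_ms: Tuple[float, float], speedup: float = 1.0
) -> Tuple[float, float]:
    """Return (low, high) total serialization cost for one user request."""
    low, high = per_call_ms
    return (calls * low / speedup, calls * high / speedup)


json_cost = serialization_overhead_ms(50, (1.0, 3.0))               # (50.0, 150.0)
best_case = serialization_overhead_ms(50, (1.0, 3.0), speedup=10.0)  # (5.0, 15.0)
worst_case = serialization_overhead_ms(50, (1.0, 3.0), speedup=5.0)  # (10.0, 30.0)
```

Under these assumptions the serialization share drops from 50–150 ms to roughly 5–30 ms per request, which is the number worth saying out loud.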

Interview playbook

2–4 min during API design; revisited when a hot path appears

When it comes up

Designing the client-facing API surface — "how do clients talk to this?"
Drawing service-to-service calls inside the cluster
Mobile clients, or multiple client types over the same data
A call path the interviewer flags as latency-sensitive or "slow"

Order of reveal

1. REST at the edge. Public, browser-facing surface is REST/JSON: universal tooling, cacheable at the CDN/LB, trivially debuggable.
2. gRPC between services. Inside the cluster where QPS and p99 matter, gRPC over HTTP/2 with protobuf: 5–10× cheaper encode/decode and connection multiplexing.
3. GraphQL only for divergent views. If web, iOS, and Android need different shapes of the same graph, a GraphQL gateway removes per-client endpoints. I would not adopt it for a single client.
4. Async escape hatch. Anything over ~1s or any server→client push goes async: enqueue, return a job id, then poll / webhook / SSE / WebSocket.
5. Name the schema-evolution story. protobuf + buf for gRPC, SDL diff for GraphQL, OpenAPI + CI checks for REST, so a breaking change is caught before it ships.

Signature phrases

“REST at the edge, gRPC between services, async for anything over a second.” — A one-line policy that shows you choose per layer, not one-size-fits-all.
“Browsers never speak raw gRPC — that is REST or GraphQL.” — Catches the classic gRPC-from-the-browser trap.
“At 50 internal calls per request, protobuf + HTTP/2 multiplexing pays for itself.” — Quantifies why gRPC wins on the hot internal path.
“Long work does not belong in a synchronous response.” — Signals you design the async shape deliberately instead of risking 504s.

Likely follow-ups

“Why not GraphQL everywhere?”

GraphQL earns its weight only when multiple clients need genuinely different views of a shared graph. Its costs are real: HTTP caching is hard (each query is unique unless you use persisted queries), the N+1 resolver trap needs DataLoader batching, and field-level authorization gets subtle. For one web client on one backend it is complexity tax — REST is the better default.
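The N+1 trap and its batching fix can be sketched without any GraphQL library. This is a simplified, synchronous take on the DataLoader idea, assuming resolvers first declare the keys they need and a single dispatch then fetches them together; real DataLoader implementations defer with promises or an event-loop tick. `BatchLoader` and `fetch_drivers` are hypothetical names.

```python
from typing import Callable, Dict, Hashable, List


class BatchLoader:
    """Collects keys during a resolver pass, then fetches them in one call."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], Dict[Hashable, object]]) -> None:
        self._batch_fn = batch_fn
        self._pending: List[Hashable] = []
        self._cache: Dict[Hashable, object] = {}

    def load(self, key: Hashable) -> None:
        # Deduplicate: a key already fetched or already queued is not re-queued.
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)

    def dispatch(self) -> None:
        if self._pending:
            self._cache.update(self._batch_fn(self._pending))
            self._pending = []

    def get(self, key: Hashable) -> object:
        return self._cache[key]


backend_calls: List[List[Hashable]] = []  # records each round-trip to the backend


def fetch_drivers(ids: List[Hashable]) -> Dict[Hashable, object]:
    backend_calls.append(list(ids))
    return {i: {"id": i, "name": f"driver-{i}"} for i in ids}


loader = BatchLoader(fetch_drivers)
for ride in [{"driver_id": 1}, {"driver_id": 2}, {"driver_id": 1}]:
    loader.load(ride["driver_id"])
loader.dispatch()
```

Three rides resolve their drivers with one backend call for two distinct ids, instead of one call per ride.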

“You need server→client push. WebSocket or SSE?”

SSE if it is one-way server→client in a browser — it survives proxies, auto-reconnects, and rides plain HTTP. WebSocket when you need bidirectional (chat, collaborative editing) or a native mobile client where SSE support is weaker. I would not reach for WebSocket if SSE covers the requirement — it is a cheaper connection to operate.
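Part of why SSE "rides plain HTTP" is that the wire format is just text lines in a `text/event-stream` response. The helper below is a minimal sketch of that framing; `format_sse` is a hypothetical name, while the `id:`, `event:`, and `data:` field names come from the event-stream format itself.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Frame one server-sent event; a blank line terminates the event."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")    # lets the browser resume after a reconnect
    if event is not None:
        lines.append(f"event: {event}")    # named event type for the client listener
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")     # multi-line payloads repeat the data field
    return "\n".join(lines) + "\n\n"


frame = format_sse('{"lat": 37.77, "lon": -122.42}', event="location", event_id="7")
```

The `id` field is what makes automatic reconnect useful: on reconnect the browser sends the last id it saw in a `Last-Event-ID` header, so the server can resume from there.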

“How do you evolve a gRPC contract without breaking callers?”

protobuf is forward/backward compatible if you respect field numbers: only ever add fields with new tags, never reuse or renumber a tag, and never change a field type. Make removed fields reserved so the tag cannot be recycled. Enforce it in CI with buf or protolock breaking-change detection, so an incompatible edit fails the build rather than a downstream service at runtime.
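The rules above are mechanical enough to check in code. The sketch below is a toy checker over a hand-written `{tag: (name, type)}` mapping, not how buf or protolock work internally; the function name and schema representation are assumptions for illustration. It flags type changes, removals that were not reserved, and reuse of a reserved tag.

```python
from typing import Dict, FrozenSet, List, Tuple

Schema = Dict[int, Tuple[str, str]]  # field tag -> (field name, field type)


def breaking_changes(old: Schema, new: Schema, reserved: FrozenSet[int] = frozenset()) -> List[str]:
    """Return human-readable problems; an empty list means the edit looks safe."""
    problems = []
    for tag, (name, ftype) in old.items():
        if tag not in new:
            if tag not in reserved:
                problems.append(f"tag {tag} ({name}) removed without being reserved")
        elif new[tag][1] != ftype:
            problems.append(f"tag {tag} ({name}) changed type {ftype} -> {new[tag][1]}")
    for tag in new:
        if tag in reserved:
            problems.append(f"tag {tag} reuses a reserved number")
    return problems


v1: Schema = {1: ("ride_id", "int64"), 2: ("fare", "double")}
v2_additive: Schema = {1: ("ride_id", "int64"), 2: ("fare", "double"), 3: ("tip", "double")}
v2_breaking: Schema = {1: ("ride_id", "string")}
```

Renaming a field is deliberately not flagged: the binary encoding identifies fields by tag, which is why the rules focus on numbers and types rather than names.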

Common mistakes

"REST everywhere" as default without justification

REST is a good default, but using it for high-QPS internal RPC leaves measurable latency on the table. Be able to defend the choice against gRPC at the service-to-service layer.

GraphQL because "it's modern"

GraphQL shines when you have multiple divergent client views. For a single web client hitting a single backend, it's complexity tax. Justify the fit.

No async path for work > 1s

Pretending a 3-second operation can live in a synchronous request means clients see timeouts and 504s. Design the async shape (job id + poll/webhook/SSE) before the long operation ships.
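The async shape (job id + poll) is small enough to sketch in memory. This is an illustration under assumptions: `JobStore` and its method names are hypothetical, the "queue" is a dict, and the worker runs inline, whereas a real design would use a durable queue and separate worker processes.

```python
import uuid
from typing import Callable, Dict


class JobStore:
    """In-memory stand-in for a job queue plus status table (illustration only)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, dict] = {}

    def submit(self, payload: dict) -> str:
        """Request handler: enqueue and return immediately with a job id."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = {"status": "queued", "payload": payload, "result": None}
        return job_id

    def run_next(self, work: Callable[[dict], object]) -> None:
        """Worker: pick the oldest queued job and complete it."""
        for job in self._jobs.values():
            if job["status"] == "queued":
                job["result"] = work(job["payload"])
                job["status"] = "done"
                return

    def status(self, job_id: str) -> dict:
        """Polling endpoint: cheap to call and never blocks on the work."""
        job = self._jobs[job_id]
        return {"status": job["status"], "result": job["result"]}


store = JobStore()
job_id = store.submit({"ride_id": "r-42"})
before = store.status(job_id)
store.run_next(lambda payload: f"receipt for {payload['ride_id']}")
after = store.status(job_id)
```

The same job id works for every completion channel: the client can poll `status`, or the worker can push the result over a webhook, SSE, or WebSocket once the job is done.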

gRPC direct from browsers (Advanced)

Browsers can't speak raw gRPC, only gRPC-Web, which supports unary and server-streaming calls but not client or bidirectional streaming, and needs a proxy (Envoy, Nginx). For browser clients, use REST or GraphQL; reserve gRPC for cluster-internal calls.

Ignoring schema evolution (Advanced)

Any protocol without a schema-breaking-change story accumulates silent compatibility bugs. gRPC has buf/protolock; GraphQL has schema diff + deprecation; REST needs OpenAPI + CI checks. Pick one per protocol and enforce it.

Practice drills

You need server→client push for a dashboard. WebSocket, SSE, or long-polling?

SSE if the dashboard is read-only from the server and runs in a browser — one-way, survives proxies, automatic reconnect. WebSocket if you need bidirectional (chat, collaborative editing) or you're pushing to a native mobile app where SSE support is weaker. Long-polling only as a fallback for environments that block SSE/WebSocket (rare now). For a server-push dashboard specifically: SSE.

Interviewer: "We need to update 20 fields on a user record from a mobile form. REST or GraphQL?"

Either works, but the question hints at under-fetching/over-fetching. If this is the only such form, REST with a PATCH /users/:id is fine: simple, and understood by every client and proxy. If you have many forms across many clients each needing different field sets, GraphQL pays for itself by removing client-specific endpoints. Don't adopt GraphQL for one form; do adopt it when clients routinely need different view shapes of the same graph.

You're profiling an internal service call path and see 200 ms spent in JSON encode + decode alone. What do you propose?

If this is a hot internal path, migrate to gRPC with protobuf — expect 5–10× improvement on encoding cost alone. If migration is politically expensive short-term, intermediate wins are: (1) switch to a faster JSON library (simdjson, sonic); (2) MessagePack or CBOR over HTTP for binary encoding without a full protocol change; (3) batch multiple calls into one to amortise encode cost. Long-term answer is still gRPC.
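To make the text-versus-binary gap concrete, the sketch below encodes the same record as JSON and as a fixed binary layout using only the standard library. It is not protobuf, MessagePack, or CBOR: the `struct` layout is a hypothetical stand-in chosen so the example runs without dependencies, and it shows payload size only, not encode time.

```python
import json
import struct
from typing import Tuple

# Hypothetical fixed layout: little-endian uint32 ride id, float64 lat, float64 lon.
RIDE_LAYOUT = "<Idd"


def encode_json(ride_id: int, lat: float, lon: float) -> bytes:
    return json.dumps({"ride_id": ride_id, "lat": lat, "lon": lon}).encode("utf-8")


def encode_binary(ride_id: int, lat: float, lon: float) -> bytes:
    return struct.pack(RIDE_LAYOUT, ride_id, lat, lon)


def decode_binary(payload: bytes) -> Tuple[int, float, float]:
    return struct.unpack(RIDE_LAYOUT, payload)


sample = (42, 37.7749, -122.4194)
json_size = len(encode_json(*sample))
binary_size = len(encode_binary(*sample))  # 4 + 8 + 8 = 20 bytes
```

The binary form carries no field names and no text digits, which is where most of the saving comes from. The price is that both sides must already agree on the layout, which is exactly the job a schema does in gRPC.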

Cheat sheet

•Default: REST at the edge. gRPC between services. Async for anything > 1s.
•Browser touches it? REST or GraphQL. Never raw gRPC.
•Service-to-service at high QPS? gRPC wins on encode + HTTP/2 multiplexing.
•Multiple divergent clients over shared data? GraphQL earns its weight.
•Schema evolution: protobuf + buf, GraphQL SDL + diff, OpenAPI for REST. Non-negotiable.
•Streaming: SSE for server→client, WebSocket for bidirectional, gRPC streams between services.
•Payload: protobuf < MessagePack ≈ CBOR < JSON on size (rough ordering; measure your own payloads); choose with intent, not habit.

Practice this skill

These problems exercise Protocol choice. Try one now to apply what you just learned.

chat system
