System design course
Ch.3 · Trade-offs that define a design · concept · 8 min read

Polling vs long-polling vs WebSockets vs webhooks

Four ways to move an event from where it happens to whoever cares — across client-server and server-server — with the cost and scale of each.


The problem: getting fresh events to whoever needs them

HTTP is pull-based: the client asks and the server answers. Most live features need the opposite, a push. These four techniques cover two settings: client↔server (the first three) and server↔server (webhooks). Chapter 2 introduced the three client-server techniques; here we compare all four as a single decision and add the server-to-server case.

Short polling

The client re-asks “anything new?” on a fixed timer.

  • Mechanics: GET /updates every N seconds; usually returns nothing.
  • Latency: up to one interval stale.
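  • Cost: wasteful. Most requests come back empty, and halving the polling interval (to halve worst-case staleness) doubles the request load. Each request also re-pays header and auth overhead, plus connection setup when the connection isn’t kept alive.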
  • Use when: updates are infrequent and a few seconds of lag is fine; you want dead-simple and stateless.
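The loop and its cost can be sketched in a few lines. This is a minimal illustration, not a production client: `fetch` is a hypothetical stand-in for `GET /updates` that returns `None` when nothing is new, and the client count and intervals are made-up numbers.

```python
import time
from typing import Callable, List, Optional


def short_poll(fetch: Callable[[], Optional[str]],
               interval_s: float,
               max_polls: int,
               sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Call `fetch` every `interval_s` seconds, `max_polls` times in total.

    `fetch` stands in for GET /updates and returns None when nothing is new.
    """
    received: List[str] = []
    for attempt in range(max_polls):
        update = fetch()
        if update is not None:
            received.append(update)
        if attempt < max_polls - 1:
            sleep(interval_s)  # the client cannot see new events during this wait
    return received


def request_rate(clients: int, interval_s: float) -> float:
    """Steady-state requests per second generated by polling clients."""
    return clients / interval_s


# Hypothetical server that only has news on the 4th poll.
responses = iter([None, None, None, "order shipped", None])
updates = short_poll(lambda: next(responses), interval_s=5, max_polls=5,
                     sleep=lambda _seconds: None)  # skip real sleeping in the demo
print(updates)                   # ['order shipped']
print(request_rate(10_000, 10))  # 1000.0
print(request_rate(10_000, 5))   # 2000.0
```

Four of the five requests in the demo return nothing, and cutting the interval from 10 s to 5 s doubles the load from the same 10,000 clients.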

Long polling

The client requests; the server holds the connection open until it has data (or a timeout), responds, and the client immediately re-requests.

  • Mechanics: request parks on the server until an event or ~30s timeout.
  • Latency: near-instant delivery, far fewer empty responses than short polling.
  • Cost: holds a connection per waiting client (server must support many concurrent open requests); still one response per round trip, so every delivery is followed by a fresh request; needs care so a proxy doesn’t kill the idle connection.
  • Use when: you want near-real-time with maximum compatibility and no special protocol.
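The "park until an event or a timeout" behaviour can be sketched with a condition variable. This is an in-process illustration only: `LongPollChannel`, its `cursor` argument, and the short timeouts are assumptions made for the example, and a real server would run `poll` inside an HTTP handler with a timeout closer to the ~30s mentioned above.

```python
import threading
from typing import List, Tuple


class LongPollChannel:
    """Holds events and lets callers wait for ones they have not seen yet."""

    def __init__(self) -> None:
        self._events: List[str] = []
        self._cond = threading.Condition()

    def publish(self, event: str) -> None:
        with self._cond:
            self._events.append(event)
            self._cond.notify_all()  # wake every parked request

    def poll(self, cursor: int, timeout_s: float) -> Tuple[List[str], int]:
        """Return events after `cursor`, parking up to `timeout_s` if none exist yet."""
        with self._cond:
            self._cond.wait_for(lambda: len(self._events) > cursor,
                                timeout=timeout_s)
            new_events = self._events[cursor:]
            return new_events, cursor + len(new_events)


channel = LongPollChannel()
# Simulate an event arriving 50 ms after the client's request has parked.
threading.Timer(0.05, channel.publish, args=["price changed"]).start()

events, cursor = channel.poll(cursor=0, timeout_s=2.0)        # returns once the event lands
empty, cursor = channel.poll(cursor=cursor, timeout_s=0.05)   # nothing new: times out
print(events, empty, cursor)  # ['price changed'] [] 1
```

The cursor is one way to avoid losing events that arrive between a response and the client's next request; the client sends back the last position it saw and receives everything after it.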

WebSockets

One persistent TCP connection, upgraded from HTTP, carrying full-duplex messages either direction at any time.

  • Mechanics: ws:// or wss:// URLs; after the handshake both sides send frames freely.
  • Latency: lowest; true push, minimal per-message overhead.
  • Cost: stateful long-lived connections — hard to load-balance (need sticky routing or a pub/sub backplane to fan messages to the right connection node), consume memory per connection, and need reconnection/heartbeat logic and fallbacks where blocked.
  • Use when: bidirectional and high-frequency — chat, multiplayer, collaborative editing, live trading. (SSE is the lighter, one-way cousin when the client only needs to receive.)
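The backplane idea in the cost bullet can be sketched without any network code. Everything here is a stand-in: `Backplane` plays the role of a pub/sub system such as Redis pub/sub, `ConnectionNode` is a hypothetical WebSocket server, and each "socket" is just a list that collects delivered messages.

```python
from typing import Callable, Dict, List


class Backplane:
    """In-memory stand-in for a pub/sub system shared by all connection nodes."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str, str], None]] = []

    def subscribe(self, handler: Callable[[str, str], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, user_id: str, message: str) -> None:
        for handler in self._subscribers:
            handler(user_id, message)


class ConnectionNode:
    """One server holding a subset of the open client connections."""

    def __init__(self, name: str, backplane: Backplane) -> None:
        self.name = name
        self.backplane = backplane
        self.sockets: Dict[str, List[str]] = {}  # user_id -> outbox (fake socket)
        backplane.subscribe(self._on_backplane_message)

    def connect(self, user_id: str) -> None:
        self.sockets[user_id] = []

    def send_from(self, sender: str, recipient: str, text: str) -> None:
        # The node does not know where the recipient is connected, so it publishes.
        self.backplane.publish(recipient, f"{sender}: {text}")

    def _on_backplane_message(self, user_id: str, message: str) -> None:
        outbox = self.sockets.get(user_id)
        if outbox is not None:  # deliver only if this node holds the connection
            outbox.append(message)


backplane = Backplane()
node_a, node_b = ConnectionNode("a", backplane), ConnectionNode("b", backplane)
node_a.connect("alice")
node_b.connect("bob")
node_a.send_from("alice", "bob", "hi")
print(node_b.sockets["bob"])    # ['alice: hi']
print(node_a.sockets["alice"])  # []
```

Alice's message enters through node a, yet it reaches Bob on node b because every node listens to the backplane and delivers only to connections it holds locally.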

Webhooks — the server-to-server one

Instead of you polling another service, you register a URL and they POST to you when an event occurs. It’s “push” between servers, whereas the other three deliver events to clients.

  • Mechanics: you expose POST /webhooks/payments; the provider calls it on each event (e.g. Stripe → “payment succeeded”).
  • Latency: event-driven, near-instant; zero wasted requests vs polling their API.
  • Cost: you must run a publicly reachable, always-up endpoint; handle retries and idempotency (deliveries can duplicate or arrive out of order), verify signatures (anyone could POST), and absorb bursts (queue incoming webhooks). Delivery isn’t guaranteed if you’re down: providers typically retry with backoff, but only for a limited period.
  • Use when: integrating with third-party/independent services for asynchronous events — payments, CI, repo events.
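The three receiver duties named above (verify, deduplicate, enqueue) can be sketched as a single handler. This assumes a generic HMAC-SHA256 signature over the raw body and an `id` field on each event; real providers each define their own header names and signing format, so treat `sign`, `WEBHOOK_SECRET`, and the payload shape as hypothetical.

```python
import hashlib
import hmac
import json
import queue

WEBHOOK_SECRET = b"demo-secret"  # hypothetical shared secret agreed with the provider
work_queue: queue.Queue = queue.Queue()  # absorbs bursts; workers drain it later
seen_event_ids = set()           # a real system would use a durable store


def sign(body: bytes, secret: bytes = WEBHOOK_SECRET) -> str:
    """Hex HMAC-SHA256 of the raw request body (assumed signing scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def handle_webhook(body: bytes, signature: str) -> int:
    """Return the HTTP status the endpoint would answer with."""
    # 1. Verify: anyone can POST to a public URL, so check the signature first.
    if not hmac.compare_digest(sign(body), signature):
        return 401
    event = json.loads(body)
    # 2. Idempotency: a retried delivery is acknowledged but not processed again.
    if event["id"] in seen_event_ids:
        return 200
    seen_event_ids.add(event["id"])
    # 3. Enqueue and return quickly; the slow work happens off the request path.
    work_queue.put(event)
    return 200


body = json.dumps({"id": "evt_1", "type": "payment.succeeded"}).encode()
statuses = [
    handle_webhook(body, sign(body)),  # first delivery
    handle_webhook(body, sign(body)),  # provider retry of the same event
    handle_webhook(body, "forged"),    # bad signature
]
print(statuses, work_queue.qsize())  # [200, 200, 401] 1
```

Answering 200 to the duplicate matters: an error status would usually prompt the provider to retry the same event yet again.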

The decision table

| Need | Reach for |
| --- | --- |
| Rare updates, simplest possible | Short polling |
| Near-real-time, broad compatibility | Long polling |
| Two-way, high-frequency client traffic | WebSockets |
| One-way server→client stream | SSE (Chapter 2) |
| Async events between services | Webhooks |
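Read top to bottom, the table is a short chain of questions. The function below is one possible encoding of it; the parameter names and the order of the checks are assumptions made for illustration, not a rule from this chapter.

```python
def choose_transport(*, between_services: bool = False,
                     two_way: bool = False,
                     one_way_stream: bool = False,
                     near_real_time: bool = False) -> str:
    """Map the needs in the decision table to a delivery mechanism."""
    if between_services:
        return "webhooks"       # async events between independent services
    if two_way:
        return "websockets"     # bidirectional, high-frequency client traffic
    if one_way_stream:
        return "sse"            # server -> client stream only
    if near_real_time:
        return "long polling"   # low latency with plain HTTP compatibility
    return "short polling"      # rare updates, simplest possible


print(choose_transport(two_way=True))           # websockets
print(choose_transport(between_services=True))  # webhooks
print(choose_transport())                       # short polling
```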

The interview cue

Match the mechanism to the direction and frequency: “In-app chat is WebSockets (bidirectional, with a Redis pub/sub backplane so any connection server can deliver a message). Our payment provider notifies us via webhooks, which I’ll verify, make idempotent, and drop onto a queue. A rarely-changing status badge can just short-poll.” Picking per-channel — and naming the scaling cost (backplane) and delivery hazards (retries, idempotency, signatures) — is the signal.