Designing Uber · Lyte Code

Match riders to nearby drivers in real time — tracking millions of moving locations, the dispatch algorithm, trip lifecycle, ETAs, and surge pricing.

The problem

Design Uber/Lyft: a rider requests a ride and is matched to a nearby available driver in seconds, with live tracking, ETAs, and dynamic pricing. It extends Yelp’s geospatial search with constantly moving drivers and a real-time matching problem — the hard, distinctive part.

Step 1 — Requirements

Functional: drivers stream their location; a rider requests a ride; match to a nearby available driver; track the trip live; compute ETA/route; fare + surge pricing; ratings.

Non-functional: low-latency matching (seconds), high write volume (millions of drivers pinging location every few seconds — write-heavy), real-time updates, availability, consistency for trip/payment state.

Step 2 — Tracking moving drivers (geospatial, but live)

Drivers send location every ~4 seconds → an enormous write stream. You need a geo index that supports frequent updates and nearby queries:

Index active drivers by geohash/quadtree/S2 cell (Uber uses H3 hexagons).
On each location ping, update the driver’s cell (remove from old cell, add to new) — kept in an in-memory store (Redis geo / a dedicated location service) for speed; the full history streams to storage for analytics.
The index holds only currently-active drivers, so it’s bounded by drivers online, not total drivers.

Step 3 — Matching / dispatch (the core)

When a rider requests:

Find candidates — query the rider’s cell + neighbors for available drivers within range.
Rank — by ETA (not just straight-line distance — real road/traffic time), driver rating, direction, etc.
Dispatch — offer to the best driver; if they decline/time out, offer the next. Use a lock/lease so a driver isn’t double-dispatched (reuse the distributed-lock idea), and so two riders don’t match the same driver.

This is a real-time, write-heavy matching loop — the heart of the design.

Step 4 — Trip lifecycle (stateful, consistent)

A trip is a state machine: requested → matched → driver_arriving → in_progress → completed → paid. Trip state needs stronger consistency (you can’t lose a trip or double-charge) — store it in a transactional store, update via events, and make transitions idempotent.

Step 5 — ETA and routing

ETAs use a routing engine over a road graph with live traffic (shortest path weighted by current speeds — see the Maps lesson). Precompute/region-cache common routes; recompute ETA as the trip progresses.

Step 6 — Surge pricing

When demand exceeds supply in an area, raise prices to balance: compute a surge multiplier per region/cell from the live supply/demand ratio (open requests vs available drivers), updated continuously from the same location/request streams.

Step 7 — Architecture

driver app → location stream → in-memory geo index (active drivers, by cell)
rider request → matching service → nearby+available drivers → rank by ETA → dispatch (lease)
            → trip service (state machine, transactional) → payments
ETA ← routing/maps service;  surge ← demand/supply per cell;  live tracking ← pub/sub

Trade-offs to raise

In-memory geo index (fast updates/queries, must be HA) vs durable DB (too slow for per-ping updates). Keep live locations in memory; stream history to storage.
Match by distance (cheap) vs ETA (accurate, needs routing). ETA is better; cache routing.
Consistency split — locations/matching are AP-ish (speed); trip/payment state is CP (correctness).
Dispatch fairness vs speed — greedy nearest vs batched optimal matching.

The interview cue

“Active drivers indexed in an in-memory geo index (H3/geohash cells) updated on each location ping; a rider request queries nearby cells for available drivers, ranks by ETA, and dispatches with a lease so none is double-matched; the trip is a transactional state machine; ETA from a routing service; surge from live supply/demand per cell.” Real-time moving-driver index + lease-based matching is the distinguishing answer; implementation next.