Skip to content
System design course
Ch.4 · Designing real systems·how to build it ·7 min read

Building Uber

Implement the in-memory location index updated per ping, lease-based dispatch that prevents double-matching, the trip state machine, and live tracking.


Updating driver locations (the write firehose)

Each ping moves a driver between cells in an in-memory index; history streams off to storage:

def on_location(driver_id, lat, lng):
    cell = h3.geo_to_cell(lat, lng, RES)
    old = driver_cell.get(driver_id)
    if old != cell:
        geo_index.remove(old, driver_id)            # leave old cell
        geo_index.add(cell, driver_id)              # join new cell
        driver_cell[driver_id] = cell
    driver_loc[driver_id] = (lat, lng, now())       # latest position (in memory)
    location_stream.publish(driver_id, lat, lng)    # history → storage / analytics

Only active drivers are in the index; updates are O(1) cell moves in memory (Redis or a sharded location service) — the DB never sees per-ping writes.

Matching candidates by cell

def find_candidates(lat, lng):
    cell = h3.geo_to_cell(lat, lng, RES)
    cells = [cell] + h3.neighbors(cell, k=2)        # nearby rings
    drivers = []
    for c in cells:
        drivers += [d for d in geo_index.get(c) if status[d] == "available"]
    return drivers

Lease-based dispatch (no double-matching)

Offer the best driver and lease them so concurrent requests can’t grab the same driver, and the driver isn’t offered two rides:

def dispatch(ride_request):
    candidates = find_candidates(ride_request.lat, ride_request.lng)
    ranked = sorted(candidates, key=lambda d: eta(d, ride_request))   # ETA, not raw distance
    for driver in ranked:
        if lease.acquire(f"driver:{driver}", ttl="15s"):   # atomic claim
            if offer(driver, ride_request, timeout="10s"):  # driver accepts?
                start_trip(ride_request, driver)
                return driver
            lease.release(f"driver:{driver}")               # declined → next candidate
    return None                                             # no driver found → retry/surge

The lease (atomic, TTL’d — the distributed-lock pattern) guarantees one rider ↔ one driver even under concurrent dispatch.

The trip state machine (consistent)

Trip state is transactional and idempotent — never lose a trip or double-charge:

TRANSITIONS = {"requested": "matched", "matched": "driver_arriving",
               "driver_arriving": "in_progress", "in_progress": "completed",
               "completed": "paid"}

def advance(trip_id, event):
    with txn():                                     # transactional update
        trip = trips.get(trip_id)
        expected = TRANSITIONS.get(trip.state)
        if event.target == expected and not trip.applied(event.id):  # idempotent
            trip.state = event.target
            trips.put(trip_id, trip)
            if trip.state == "completed": payments.charge(trip)      # exactly-once charge

Live tracking

Rider and driver apps subscribe (WebSocket/pub-sub) to trip updates; the driver’s location stream is forwarded to the matched rider so they see the car move in real time — reusing the location stream + a pub/sub channel keyed by trip.

ETA and surge

  • ETA — call the routing service (road graph + live traffic, next lesson), cached per region; recompute as the trip moves.
  • Surge — a job continuously computes demand/supply per cell from the request and location streams and publishes a multiplier the pricing service reads.

Scale and failure handling

  • Location writes → in-memory, sharded by cell/region; history async to storage.
  • Geo index HA → replicated; on node loss, drivers re-ping and repopulate quickly.
  • Driver declines / times out → lease expires, next candidate offered.
  • No drivers nearby → widen radius, queue the request, raise surge to attract supply.
  • Trip/payment → transactional + idempotent so crashes don’t double-charge or lose trips.
  • Region hotspot (event letting out) → finer cells + more matching workers; surge balances demand.

The takeaway

Concrete signals: an in-memory cell index updated per ping (DB spared the firehose), lease-based dispatch so no driver is double-matched, ETA-ranked matching, and a transactional idempotent trip state machine for correctness. Real-time moving-entity indexing + lease matching is the reusable core (DoorDash is the same shape with restaurants and couriers).