Building a content delivery network
Implement the edge cache and its eviction, the miss path through the hierarchy with request coalescing, and a purge control plane that fans out to every PoP.
The edge cache
Each edge server runs an LRU cache keyed by URL (plus any request headers named in Vary), with a per-object TTL and metadata (ETag, size). The serve path:
    def handle(request):
        key = cache_key(request.url, request.vary_headers)
        entry = cache.get(key)
        if entry is None:
            return fetch_through_hierarchy(key, request)  # MISS
        if not entry.expired():
            return entry.response                         # HIT
        return revalidate_or_serve_stale(entry)           # STALE: stale-while-revalidate
LRU keeps the hot working set in finite edge storage and evicts the cold long tail first. Large media is stored as segments, so a single multi-gigabyte object does not evict everything else.
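The eviction behavior described above can be sketched with an ordered dictionary. This is a minimal illustration under stated assumptions, not the cache a production edge would run: the class name `LRUCache`, the byte-based capacity, the `(body, fresh)` return shape, and the injectable `clock` parameter are all invented for the example and differ from the `cache` object used in the serve path.

```python
import time
from collections import OrderedDict


class LRUCache:
    """LRU cache bounded by total bytes, with a per-entry TTL (illustrative only)."""

    def __init__(self, capacity_bytes, clock=time.monotonic):
        self.capacity_bytes = capacity_bytes
        self.clock = clock            # injectable so tests can control time
        self.used_bytes = 0
        self._items = OrderedDict()   # key -> (body, expires_at); least recent first

    def get(self, key):
        """Return (body, fresh) or None. Expired entries are still returned
        so the caller can choose to revalidate or serve them stale."""
        item = self._items.get(key)
        if item is None:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        body, expires_at = item
        return body, self.clock() < expires_at

    def put(self, key, body, ttl):
        if len(body) > self.capacity_bytes:
            return                    # too big to hold whole; a real edge would segment it
        self.delete(key)
        self._items[key] = (body, self.clock() + ttl)
        self.used_bytes += len(body)
        while self.used_bytes > self.capacity_bytes:
            _, (old_body, _expires) = self._items.popitem(last=False)  # evict LRU entry
            self.used_bytes -= len(old_body)

    def delete(self, key):
        item = self._items.pop(key, None)
        if item is not None:
            self.used_bytes -= len(item[0])
```

Bounding the cache by bytes rather than by entry count is one possible design choice; it makes the effect of a single oversized object visible, which is the problem segmentation is meant to avoid.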
The miss path + request coalescing
On a miss the edge asks its parent → origin shield → origin. The critical detail: if 10,000 users request the same uncached hot object at once, you must not send 10,000 origin fetches (a stampede). Coalesce concurrent misses for the same key into one upstream request:
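    import threading
    from concurrent.futures import ThreadPoolExecutor

    executor = ThreadPoolExecutor()
    inflight = {}                     # key -> Future
    inflight_lock = threading.Lock()  # makes the check-then-insert atomic

    def fetch_through_hierarchy(key, request):
        with inflight_lock:
            fut = inflight.get(key)
            leader = fut is None
            if leader:                # first miss for this key starts the fetch
                fut = inflight[key] = executor.submit(fetch_from_parent, key, request)
        if not leader:
            return fut.result()       # join the in-flight fetch
        try:
            resp = fut.result()
            cache.put(key, resp, ttl=resp.cache_ttl())
            return resp
        finally:
            with inflight_lock:
                del inflight[key]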
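Together, the hierarchy and coalescing are why the origin sees only a tiny fraction of edge traffic: each tier absorbs hits, and concurrent misses for one key collapse into a single upstream fetch.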
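To see the coalescing effect in isolation, here is a small self-contained demo. It is a sketch, not the edge implementation: the `Coalescer` class, its `joined` counter, and `slow_origin_fetch` are hypothetical names, and the fake origin deliberately holds its response until every other caller has joined so that the outcome does not depend on timing.

```python
import threading
import time
from concurrent.futures import Future


class Coalescer:
    """Collapses concurrent get(key) calls into one call to fetch(key)."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._lock = threading.Lock()
        self._inflight = {}   # key -> Future for the fetch in progress
        self.joined = 0       # callers that piggybacked on another caller's fetch

    def get(self, key):
        with self._lock:
            fut = self._inflight.get(key)
            leader = fut is None
            if leader:
                fut = self._inflight[key] = Future()
            else:
                self.joined += 1
        if leader:
            try:
                fut.set_result(self._fetch(key))
            except Exception as exc:
                fut.set_exception(exc)       # followers see the same error
            finally:
                with self._lock:
                    del self._inflight[key]
        return fut.result()


N = 20
origin_calls = []


def slow_origin_fetch(key):
    origin_calls.append(key)
    while coalescer.joined < N - 1:   # hold the fetch open until everyone has joined
        time.sleep(0.001)
    return f"body of {key}"


coalescer = Coalescer(slow_origin_fetch)
results = []
threads = [
    threading.Thread(target=lambda: results.append(coalescer.get("/hot.js")))
    for _ in range(N)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results), "responses,", len(origin_calls), "origin fetch")
# prints: 20 responses, 1 origin fetch
```

A real edge would not wait for followers, so coalescing only covers requests that arrive while the fetch is still in flight; later requests are served from the cache that the first fetch populated.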
Routing (the control plane’s job)
Geo-DNS or anycast (covered in the design lesson) maps users to PoPs. For geo-DNS, the authoritative resolver picks a PoP by client geo, health, and load, and returns a short-TTL record so it can reroute quickly when a PoP degrades. Health status comes from heartbeats sent by each PoP.
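The geo-DNS decision can be sketched as a filter followed by a nearest-neighbor pick. Everything specific here is an assumption made for illustration: the `PoP` fields, the 10-second heartbeat timeout, the 0.9 load ceiling, and the 30-second record TTL are placeholder values, and real resolvers typically locate clients from the resolver IP or EDNS Client Subnet rather than from exact coordinates.

```python
import math
from dataclasses import dataclass


@dataclass
class PoP:
    name: str
    lat: float
    lon: float
    load: float            # 0.0 (idle) .. 1.0 (saturated)
    last_heartbeat: float  # timestamp in seconds, same clock as `now`


HEARTBEAT_TIMEOUT = 10.0   # assumed: a PoP silent for longer is treated as down
MAX_LOAD = 0.9             # assumed: do not send new users to a nearly full PoP
DNS_TTL_SECONDS = 30       # assumed: short TTL so rerouting takes effect quickly


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance using the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def pick_pop(pops, client_lat, client_lon, now):
    """Return (pop_name, record_ttl) for the nearest healthy, unsaturated PoP."""
    eligible = [
        p for p in pops
        if now - p.last_heartbeat <= HEARTBEAT_TIMEOUT and p.load < MAX_LOAD
    ]
    if not eligible:
        return None
    best = min(eligible, key=lambda p: distance_km(client_lat, client_lon, p.lat, p.lon))
    return best.name, DNS_TTL_SECONDS
```

With PoPs near Frankfurt, London, and New York, a client near Paris would be sent to London; if London's heartbeat goes stale, the same query falls through to Frankfurt.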
Purge: fanning invalidation to every edge
When content changes under the same URL, a purge must reach all PoPs. A control plane publishes invalidations over a fast fan-out (pub/sub):
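    # control plane
    def purge(url):
        # Assumes cache_key(url) identifies every cached variant of the URL;
        # if Vary headers are part of the key, each variant must be purged.
        msg = {"op": "purge", "key": cache_key(url), "ts": now()}
        pubsub.publish("cdn.invalidations", msg)  # every edge subscribes

    # each edge
    def on_invalidation(msg):
        cache.delete(msg["key"])  # or mark stale (soft purge)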
Purge is eventually consistent — it takes seconds to reach thousands of edges. For correctness-critical changes, prefer versioned URLs (a new URL is never stale) over purging the old one.
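One common way to get versioned URLs is to embed a hash of the content in the file name, so any change to the bytes produces a new URL. The sketch below assumes that scheme; the function name `versioned_url` and the 12-character digest length are illustrative choices, not something the lesson prescribes.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_url(path, content, digest_len=12):
    """Return `path` with a short content hash inserted before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = versioned_url("/static/app.js", b"console.log('v1')")
new = versioned_url("/static/app.js", b"console.log('v2')")
print(old != new)  # prints: True
```

Because the old and new URLs differ, edges never need to be told anything: the new object is simply a miss on first request, and the old one ages out of the LRU. The HTML that references the asset still has to be updated, typically with a short TTL or a purge of its own.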
Consistency and freshness knobs
- TTL per object (from origin headers) bounds staleness automatically.
- ETag / If-None-Match revalidation: the edge asks the origin “still valid?”; a cheap 304 refreshes the TTL without re-transferring the body.
- Stale-while-revalidate / stale-if-error: serve the stale copy instantly while refreshing in the background, or when the origin is down (an availability win).
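The three knobs above combine into one decision per request. The following is a rough sketch of that decision, assuming the windows come from the `max-age`, `stale-while-revalidate`, and `stale-if-error` Cache-Control directives; the `Entry` fields, the action strings, and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    age: float                  # seconds since stored or last revalidated
    max_age: float              # freshness lifetime (the TTL)
    swr: float = 0.0            # stale-while-revalidate window, in seconds
    sie: float = 0.0            # stale-if-error window, in seconds
    etag: Optional[str] = None


def decide(entry, origin_healthy=True):
    """Pick an action for a cached entry based on its age and origin health."""
    if entry.age < entry.max_age:
        return "serve_fresh"
    staleness = entry.age - entry.max_age
    if not origin_healthy:
        return "serve_stale_on_error" if staleness <= entry.sie else "error"
    if staleness <= entry.swr:
        return "serve_stale_and_revalidate_in_background"
    return "revalidate_before_serving"


def revalidation_headers(entry):
    """Conditional request headers: the origin answers 304 if the ETag still matches."""
    return {"If-None-Match": entry.etag} if entry.etag else {}


def apply_revalidation(entry, status):
    """A 304 resets the entry's age without transferring the body again."""
    if status == 304:
        entry.age = 0.0
        return "reuse_cached_body"
    return "replace_with_new_response"
```

For example, an entry with `max_age=60` and `swr=30` is served fresh at age 50, served stale with a background refresh at age 80, and revalidated before serving at age 100.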
Scaling and failure handling
- PoP failure: anycast withdraws the route, or the DNS health check reroutes to the next-nearest PoP; users barely notice.
- Origin failure: serve stale (stale-if-error) so the site stays up on cached content.
- Hot object: handled by coalescing plus the cache itself; extremely hot content can be pinned.
- Cache hit ratio is the KPI: monitor it per PoP; a drop signals a TTL or routing problem.
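Per-PoP hit-ratio monitoring can be as simple as comparing the current ratio to a baseline. The sketch below assumes counters are already collected per PoP; the function names and the 5-point drop threshold are illustrative, not recommended values.

```python
def hit_ratio(hits, misses):
    """Fraction of requests served from cache; 0.0 when there was no traffic."""
    total = hits + misses
    return hits / total if total else 0.0


def pops_with_drop(current, baseline, max_drop=0.05):
    """Return (pop, ratio) for PoPs whose hit ratio fell more than `max_drop`
    below their baseline. `current` maps pop -> (hits, misses);
    `baseline` maps pop -> expected hit ratio."""
    flagged = []
    for pop, (hits, misses) in current.items():
        ratio = hit_ratio(hits, misses)
        if pop in baseline and baseline[pop] - ratio > max_drop:
            flagged.append((pop, round(ratio, 3)))
    return flagged


current = {"fra": (950, 50), "lhr": (800, 200)}
baseline = {"fra": 0.95, "lhr": 0.94}
print(pops_with_drop(current, baseline))  # prints: [('lhr', 0.8)]
```

A flagged PoP is a prompt to investigate, for example whether TTLs were shortened at the origin or whether routing started spreading one region's users across more PoPs than before.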
The takeaway
The implementation crux: an LRU edge cache with stale-while-revalidate, a miss path through a hierarchy with request coalescing to shield origin, a pub/sub purge fan-out (with versioned URLs as the cleaner alternative), and anycast/geo-DNS failover. These reuse Chapter 2’s caching, consistent hashing, and heartbeats directly — and the segmentation here powers the video systems later.