Building a distributed cache
Implement a single node's LRU, route keys with a hash ring, replicate to followers, and defeat the cache stampede and hot-key problems.
A single node: O(1) LRU
Each cache node is an LRU map — a hash map plus a doubly-linked list so get/put and eviction are all O(1) (this is the classic LRU Cache design problem):
class LRUCache:
def __init__(self, capacity):
self.cap = capacity
self.map = {} # key -> node
self.head, self.tail = Node(), Node() # MRU near head, LRU near tail
self.head.next, self.tail.prev = self.tail, self.head
def get(self, key):
if key not in self.map: return None
node = self.map[key]; self._move_to_front(node)
return node.value
def put(self, key, value, ttl=None):
if key in self.map: self._remove(self.map[key])
node = Node(key, value, expire_at=now()+ttl if ttl else None)
self.map[key] = node; self._add_front(node)
if len(self.map) > self.cap: self._evict_lru() # drop tail.prev
TTL is checked lazily on get (treat expired as miss) plus a background sampler that
actively expires a random subset (Redis-style) so dead keys don’t hog RAM.
Routing keys with the hash ring
The client maps each key to a shard via consistent hashing (Chapter 2). Adding a node moves only its ring neighborhood:
node = ring.pick(key) # bisect on the sorted vnode hashes
value = node.get(key)
Replication and failover
Each shard is a primary with async replicas. Writes go to the primary and stream to replicas; reads can hit either (accepting slight staleness). Membership and health are tracked by heartbeats/gossip; on primary death a replica is promoted and the ring is updated to point at it. Use a coordination service (or Redis Sentinel-style quorum) to agree on the promotion and avoid split-brain.
The cache stampede (must-handle)
When a hot key expires, thousands of concurrent misses can hit the DB at once. Three defenses (often combined):
def get_or_load(key, loader, ttl):
v = cache.get(key)
if v is not None: return v
if not lock.acquire(key, nx=True, ttl=5): # 1) only ONE loader
return wait_for(key) or cache.get(key) # others wait/serve stale
try:
v = loader() # hit DB once
cache.put(key, v, ttl=ttl + jitter()) # 2) jittered TTL (staggered)
return v
finally:
lock.release(key)
Plus 3) early/background refresh: refresh a hot key slightly before it expires so it never goes cold.
The hot-key problem
Consistent hashing pins a key to one shard; a celebrity key can overload that one node. Mitigations:
- Replicate the hot key to multiple nodes and read a random replica.
- A tiny client-side local cache with a short TTL absorbs repeated reads before they leave the app server.
- Detect hot keys via per-key request counters and promote them automatically.
Consistency and invalidation
- Cache-aside writes: update the DB, then delete the cache key (let the next read repopulate) — simpler and safer than updating in place.
- Propagate invalidations via pub/sub if multiple caches/regions hold the key.
- Accept a brief stale window; for must-be-fresh fields, write-through instead.
Failure handling
- Node down → ring routes around it to the replica; affected keys briefly cold.
- Whole cache down → app falls back to the DB (degraded latency); protect the DB with the stampede lock so it isn’t crushed.
- Network partition → favor availability (serve possibly-stale) for a cache.
The takeaway
Concrete signals: an O(1) LRU node, consistent-hashing routing, primary/replica failover, and explicit defeat of the stampede (single-loader lock + TTL jitter + early refresh) and hot key (replicate/local-cache). This is the reusable caching layer behind nearly every read-heavy system in this chapter.