Caching strategies
Keep hot data close to readers to cut latency and offload the database — plus the eviction and invalidation problems caching hands you back.
Why caching is the highest-leverage move
Most consumer systems are read-heavy, and most reads ask for the same small “hot” set of data. A cache keeps that hot data in fast storage (memory) close to where it’s read, so the common request never touches the slow, contended database. It’s usually the single biggest latency-and-cost win in a design — and the first thing to reach for when an estimate shows heavy reads.
Where caches live
Caching happens at every layer, and you’ll often use several:
- Client / browser — avoid the request entirely.
- CDN — cache static assets near users geographically (its own lesson in Chapter 3).
- Application / distributed cache — a shared in-memory store like Redis or Memcached, sitting between app servers and the database. This is the one interviews usually mean.
- Database cache — the DB’s own buffer pool.
Read patterns
- Cache-aside (lazy loading) — the app checks the cache; on a miss it reads the DB, stores the result, and returns it. Simple and common; only requested data is cached. Downside: the first request for each key is always a miss, and stale data can linger until eviction.
- Read-through — the app always asks the cache, and the cache fetches from the DB on a miss. Cleaner app code; the cache library owns the loading logic.
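The difference between the two read patterns is mostly about who owns the miss handling. A minimal sketch, assuming an in-process dict as the cache and a hypothetical `FakeDB` class standing in for the real database (neither name comes from a real library):

```python
from typing import Callable, Dict, Optional


class FakeDB:
    """Hypothetical stand-in for the slow database; counts reads."""

    def __init__(self, rows: Dict[str, str]) -> None:
        self.rows = dict(rows)
        self.reads = 0

    def get(self, key: str) -> Optional[str]:
        self.reads += 1
        return self.rows.get(key)


def get_cache_aside(cache: Dict[str, str], db: FakeDB, key: str) -> Optional[str]:
    """Cache-aside: the application checks the cache and handles the miss."""
    value = cache.get(key)
    if value is not None:
        return value              # hit: the database is never touched
    value = db.get(key)           # miss: fall back to the database
    if value is not None:
        cache[key] = value        # populate for the next reader
    return value


class ReadThroughCache:
    """Read-through: callers only talk to the cache, which owns the loader."""

    def __init__(self, loader: Callable[[str], Optional[str]]) -> None:
        self._loader = loader
        self._store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        if key not in self._store:
            value = self._loader(key)   # the cache, not the app, loads on a miss
            if value is None:
                return None
            self._store[key] = value
        return self._store[key]
```

In both versions the second read of a key is served from memory. With read-through, the calling code shrinks to `cache.get(key)`.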
Write patterns
How you keep the cache and database from disagreeing:
- Write-through — write to cache and DB together (synchronously). Reads after a write are fresh; writes pay extra latency.
- Write-back (write-behind) — write to cache, flush to DB asynchronously. Fast writes and great for write-heavy bursts, but a cache crash before flush loses data.
- Write-around — write straight to the DB and skip the cache; data is cached only when later read. Avoids filling the cache with write-once data.
(Read-through vs write-through is examined as a trade-off in Chapter 3.)
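The three write patterns above can be compared side by side. This is a rough sketch, assuming plain dicts for both the cache and the database; the class name `WriteDemo` and its methods are made up for illustration. Dropping the cached copy in `write_around` is an added assumption so that a previously cached value cannot go stale.

```python
from typing import Any, Dict, Set


class WriteDemo:
    """Toy cache + database pair illustrating the three write patterns."""

    def __init__(self) -> None:
        self.cache: Dict[str, Any] = {}
        self.db: Dict[str, Any] = {}
        self.dirty: Set[str] = set()    # keys written to cache but not yet flushed

    def write_through(self, key: str, value: Any) -> None:
        # Both stores are updated before the write returns.
        self.db[key] = value
        self.cache[key] = value

    def write_back(self, key: str, value: Any) -> None:
        # Only the cache is updated now; the DB catches up on flush().
        # If the cache is lost before flush(), this write is lost too.
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self) -> int:
        """Persist all dirty keys and return how many were written."""
        for key in self.dirty:
            self.db[key] = self.cache[key]
        flushed = len(self.dirty)
        self.dirty.clear()
        return flushed

    def write_around(self, key: str, value: Any) -> None:
        # Write goes to the DB only; the key is cached when someone reads it.
        self.db[key] = value
        self.cache.pop(key, None)       # assumption: drop any older cached copy
```

In a real system `flush` would run on a timer or a queue rather than being called by hand, and the window between `write_back` and `flush` is exactly the data-loss window described above.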
Eviction: the cache is finite
When the cache fills, something must go. Common policies:
- LRU (least recently used) — evict what hasn’t been touched longest. The sensible default.
- LFU (least frequently used) — evict the least-accessed; keeps genuinely popular items.
- FIFO / TTL — evict oldest, or expire entries after a fixed time.
TTLs are your friend: even without explicit invalidation, a short expiry caps how stale data can get.
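LRU and TTL combine naturally in one structure. Here is a small sketch built on `collections.OrderedDict`; the class name `LRUTTLCache` and the injectable `clock` parameter are assumptions made for the example (the clock is injectable only so that expiry can be tested without sleeping).

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class LRUTTLCache:
    """Bounded cache: evicts the least recently used key, expires by TTL."""

    def __init__(self, capacity: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock
        # key -> (value, expires_at); order runs from oldest to most recent use
        self._items: "OrderedDict[str, tuple]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]          # expired: treat as a miss
            return None
        self._items.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key: str, value: Any) -> None:
        self._items[key] = (value, self.clock() + self.ttl)
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict the least recently used
```

Expiry here is lazy: an entry is only removed when someone asks for it after its deadline. Production caches typically also sweep expired entries in the background.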
Invalidation: the genuinely hard part
“There are only two hard things in computer science: cache invalidation and naming things.” (A line commonly attributed to Phil Karlton.)
When the underlying data changes, stale cache entries must be updated or removed. Strategies: write-through (update on write), TTL (let it expire), explicit invalidation (delete the key on update, often via an event). The trade-off is staleness vs cost: aggressive invalidation keeps data fresh but adds work and complexity; lazy TTLs are cheap but can serve stale reads for up to the full TTL.
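Explicit, event-driven invalidation might look like the following sketch. It assumes an in-process listener list as the "event" mechanism; the `Database` and `InvalidatingCache` classes are hypothetical, and a real deployment would more likely publish the change over a message bus or a change-data-capture stream.

```python
from typing import Callable, Dict, List, Optional


class Database:
    """Toy source of truth that announces every update to its listeners."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        self._listeners.append(listener)

    def get(self, key: str) -> Optional[str]:
        return self.rows.get(key)

    def update(self, key: str, value: str) -> None:
        self.rows[key] = value           # 1. commit to the source of truth
        for listener in self._listeners:
            listener(key)                # 2. emit a "key changed" event


class InvalidatingCache:
    """Cache that deletes a key whenever the database reports a change."""

    def __init__(self, db: Database) -> None:
        self._db = db
        self._store: Dict[str, str] = {}
        db.subscribe(self.invalidate)

    def invalidate(self, key: str) -> None:
        self._store.pop(key, None)       # delete; the next read repopulates

    def get(self, key: str) -> Optional[str]:
        if key not in self._store:
            value = self._db.get(key)
            if value is None:
                return None
            self._store[key] = value
        return self._store[key]
```

Deleting the key rather than overwriting it keeps the write path simple: the next read repopulates the entry from the database. Any change that bypasses `update` never emits the event, which is one reason a TTL is usually kept as a backstop.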
The failure modes to name
- Thundering herd / cache stampede — a hot key expires and thousands of requests hit the DB at once. Mitigate with request coalescing, staggered TTLs, or refreshing slightly before expiry.
- Cold cache — after a restart the cache is empty and the DB takes the full load; warm it or ramp traffic.
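Two of the stampede mitigations above, request coalescing and staggered TTLs, fit in a few lines. The sketch below is one possible in-process version using threads; `SingleFlight` and `jittered_ttl` are illustrative names, not a library API, and a multi-server deployment would need a distributed lock or similar instead.

```python
import random
import threading
from typing import Any, Callable, Dict


class _Call:
    """Shared result slot for one in-flight load."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.value: Any = None
        self.error: Any = None


class SingleFlight:
    """Coalesce concurrent loads of the same key into a single call."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight: Dict[str, _Call] = {}

    def do(self, key: str, load: Callable[[], Any]) -> Any:
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._inflight[key] = call
        if leader:
            try:
                call.value = load()          # only the leader hits the database
            except Exception as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                call.done.set()              # wake every waiting follower
        else:
            call.done.wait()                 # followers reuse the leader's result
        if call.error is not None:
            raise call.error
        return call.value


def jittered_ttl(base_seconds: float, spread: float = 0.1) -> float:
    """Stagger expiry by up to +/- spread so hot keys do not expire together."""
    return base_seconds * (1 + random.uniform(-spread, spread))
```

With this in place, a thousand simultaneous misses on one key turn into one database read plus 999 waits. The jitter keeps keys that were written at the same moment from all expiring in the same instant.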
Bringing up invalidation and stampedes unprompted shows you’ve actually run a cache, not just drawn one.