The master design template

A reusable skeleton of components and questions you can stamp onto almost any "design X" prompt — and the recurring patterns that solve most of them.

One template, most systems

Most large-scale systems are assembled from the same kit of parts in slightly different arrangements. Internalize this skeleton and you rarely face a blank board: you start from the template and adapt. Every “Designing X” lesson in this chapter is this template, specialized.

The reference architecture

            ┌── CDN (static/media) ──┐
 clients ──▶ DNS ──▶ load balancer ──▶ API gateway ──▶ services
                                                         │  │  │
                          ┌──────────────────────────────┘  │  └─────────────┐
                          ▼                                  ▼                ▼
                       cache (Redis)                  primary datastore   blob store
                          │                            (sharded + replicas)  (S3)
                          ▼                                  │
                    message queue ──▶ async workers ──▶ search index / warehouse

Not every system needs every box, but scanning the diagram and the checklist below makes it much harder to forget one.

The component checklist

Client — web/mobile; what it caches and computes locally.
DNS + CDN — routing and edge caching of static assets/media.
Load balancer — distribute across stateless app servers.
API gateway — auth, rate limiting, routing (for many services / public APIs).
Application services — stateless business logic; one service or several.
Cache — Redis/Memcached for hot reads and sessions.
Primary datastore — SQL or NoSQL; sharded and replicated as scale demands.
Blob/object store — S3-style storage for large files, images, video.
Message queue — Kafka/SQS to decouple and absorb spikes; async processing.
Search index — Elasticsearch for full-text/faceted search.
Data warehouse / stream — analytics, metrics, ML pipelines.
Coordination — ZooKeeper/etcd for leader election, config, locks (when needed).
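
To make the cache and primary datastore entries concrete, here is a minimal sketch of one common way a stateless service might combine them, usually called cache-aside: read the cache first, fall back to the datastore on a miss, then populate the cache. Everything here is an assumption for illustration: the CacheAside class, the load_from_db callback, and the in-memory dict standing in for Redis or Memcached are hypothetical, not part of any real client library.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Hypothetical cache-aside wrapper; a dict stands in for Redis/Memcached."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]],
                 ttl_seconds: float = 60.0) -> None:
        self._load_from_db = load_from_db  # assumed datastore lookup
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            self.hits += 1  # hot read served from the cache
            return entry[1]
        self.misses += 1
        value = self._load_from_db(key)  # fall back to the datastore
        if value is not None:
            # Populate the cache so the next read of this key is cheap.
            self._entries[key] = (time.monotonic() + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after a write so readers do not keep a stale value."""
        self._entries.pop(key, None)


if __name__ == "__main__":
    db = {"user:1": "Ada"}
    cache = CacheAside(db.get)
    cache.get("user:1")  # miss: loads from db
    cache.get("user:1")  # hit: served from cache
    print(cache.hits, cache.misses)
```

The trade-off this sketch exposes is staleness: until the entry expires or is invalidated, readers can see an old value, which ties directly into the strong-versus-eventual consistency question.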

The questions to ask every time

For each system, this small set of questions usually exposes the whole design:

Read-heavy or write-heavy? → caching/replicas vs sharding/queues.
What’s the hot path? → the one operation that must be fast and scalable.
What needs strong consistency vs eventual? → per-data-type choice.
What’s the largest object, and where does most of the data live? → blob store, CDN, partitioning.
What’s bursty or async-able? → queues and background workers.
Where’s the hardest scaling problem? → that’s your main deep dive.
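
The arrows in the questions above can be read as a rough lookup from answers to components. The sketch below encodes only the mappings stated in the list; the Answers fields, the suggest_components function, and the baseline component set are hypothetical names chosen for illustration, and a real design would weigh far more context than four booleans.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Answers:
    """Hypothetical yes/no answers to four of the questions above."""
    read_heavy: bool
    write_heavy: bool
    large_objects: bool
    bursty_or_async: bool


def suggest_components(a: Answers) -> List[str]:
    """Return a starting subset of the reference architecture."""
    # Assumed baseline that almost every design in this template includes.
    components = ["load balancer", "application services", "primary datastore"]

    def add(*names: str) -> None:
        for name in names:
            if name not in components:  # avoid listing a component twice
                components.append(name)

    if a.read_heavy:
        add("cache", "read replicas")
    if a.write_heavy:
        add("sharding", "message queue")
    if a.large_objects:
        add("blob store", "CDN")
    if a.bursty_or_async:
        add("message queue", "async workers")
    return components


if __name__ == "__main__":
    photo_sharing = Answers(read_heavy=True, write_heavy=False,
                            large_objects=True, bursty_or_async=True)
    print(suggest_components(photo_sharing))
```

The consistency and hardest-scaling questions are left out on purpose: they shape how components are configured and where the deep dive goes, not which boxes appear.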

The recurring patterns

A handful of patterns solve most Chapter 4 problems — recognize which applies:

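Fan-out on write vs read — precompute results at write time for cheap reads (feeds, timelines), or compute at read time to keep writes cheap. Often a hybrid: fan out on write for ordinary accounts, but fetch at read time for accounts with very many followers (e.g. celebrities), where a single post would otherwise trigger millions of timeline writes.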
CDN + blob store for media — never serve large files from app servers; upload to a blob store, serve via CDN, store only metadata in the DB.
Write path vs read path separation — ingest fast (queue + write-optimized store), serve fast (cache + read-optimized/denormalized view). Often CQRS-flavored.
Async via queues — anything slow or spiky (notifications, encoding, thumbnails, indexing) goes to a queue and background workers.
Sharding by the right key — pick a high-cardinality, evenly accessed key (user id, object id) so load spreads across shards, and keep related data co-located so common queries stay on one shard.
Unique ID generation — counters/ranges, Snowflake-style IDs, or hashing — recurs in many problems.
Search offloaded — keep the source of truth in the DB; replicate into a search index asynchronously.
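
As one illustration of the first pattern, the hybrid fan-out can be sketched in a few lines. This is a toy in-memory model under stated assumptions: FeedService, CELEBRITY_THRESHOLD, and the tuple-based Post are hypothetical, the threshold value is arbitrary, and a real system would keep timelines in a cache or datastore and handle backfill when someone follows a new account.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; real systems would tune this

Post = Tuple[int, str, str]  # (logical timestamp, author, text)


class FeedService:
    """Toy hybrid feed: fan-out on write, except for high-follower authors."""

    def __init__(self) -> None:
        self.followers: Dict[str, Set[str]] = defaultdict(set)  # author -> users
        self.following: Dict[str, Set[str]] = defaultdict(set)  # user -> authors
        self.posts_by_author: Dict[str, List[Post]] = defaultdict(list)
        self.timelines: Dict[str, List[Post]] = defaultdict(list)  # precomputed
        self._clock = 0

    def follow(self, user: str, author: str) -> None:
        self.followers[author].add(user)
        self.following[user].add(author)

    def _is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post(self, author: str, text: str) -> None:
        self._clock += 1
        post = (self._clock, author, text)
        self.posts_by_author[author].append(post)
        if not self._is_celebrity(author):
            # Fan-out on write: push into each follower's precomputed timeline.
            for follower in self.followers[author]:
                self.timelines[follower].append(post)

    def feed(self, user: str, limit: int = 10) -> List[Post]:
        # Start from the precomputed timeline (cheap read path).
        merged = set(self.timelines[user])
        # Fan-out on read: pull celebrity posts at query time. The set removes
        # duplicates from posts fanned out before an author crossed the cutoff.
        for author in self.following[user]:
            if self._is_celebrity(author):
                merged.update(self.posts_by_author[author])
        return sorted(merged, reverse=True)[:limit]  # newest first


if __name__ == "__main__":
    svc = FeedService()
    svc.follow("alice", "bob")
    for fan in ("alice", "carol", "dave"):
        svc.follow(fan, "star")
    svc.post("bob", "hello")
    svc.post("star", "big news")
    print(svc.feed("alice"))
```

The design choice is where the cost lands: ordinary posts pay a small write-time cost per follower, while celebrity posts cost nothing extra at write time and a merge at read time.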

How to use it

When you hear “design X,” silently run the six questions, sketch the relevant subset of the reference architecture, and identify which recurring pattern is the crux. Then go deep there. The remaining lessons each do exactly this for a specific system — read them as worked applications of this template, and you’ll start seeing the same moves everywhere.