Designing Netflix
Streaming a curated catalog — pre-encoded content, an ISP-embedded CDN for efficiency, personalized recommendations, and graceful global delivery.
The problem
Design Netflix: stream a curated catalog of movies/shows to millions concurrently, with great quality, instant start, and strong recommendations. Unlike YouTube, there are no user uploads — a known, finite catalog you fully control. That changes the design: you pre-encode everything optimally and obsess over delivery efficiency and personalization.
Step 1 — Requirements
Functional: browse/search a catalog; stream with adaptive quality; personalized recommendations and rows; resume across devices; profiles; downloads.
Non-functional: massive read bandwidth (Netflix alone has been reported at roughly 15% of global downstream internet traffic), fast startup and minimal rebuffering, high availability, global reach, personalization.
Step 2 — The catalog is known: pre-encode everything
Because the catalog is finite and controlled, encode each title ahead of time into many renditions, and tune each encode to the content: per-title (and per-scene) encoding picks bitrates based on how hard the material is to compress, saving bandwidth at the same perceived quality. Store all renditions as ABR segments (HLS/DASH) ready to serve. Unlike YouTube, there is no pressure to transcode a stream of fresh uploads as they arrive.
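As a rough illustration of per-title tuning, the sketch below scales a baseline bitrate ladder by a content complexity score. The baseline ladder, the `complexity` score, and the clamping range are all assumptions made for this example. A production pipeline would typically run trial encodes and choose points from a measured quality-versus-bitrate curve; the simple multiplier here only stands in for that step.

```python
from dataclasses import dataclass

# Hypothetical baseline ladder for "average" content: (height in px, bitrate in kbps).
BASE_LADDER = [(240, 300), (480, 1000), (720, 2500), (1080, 5000)]


@dataclass(frozen=True)
class Rendition:
    height: int
    bitrate_kbps: int


def per_title_ladder(complexity: float) -> list[Rendition]:
    """Scale the baseline ladder by an assumed complexity score.

    1.0 means average content, below 1.0 is easy to compress (flat animation),
    above 1.0 is hard to compress (film grain, fast action).
    """
    if complexity <= 0:
        raise ValueError("complexity must be positive")
    # Clamp so a single outlier score cannot produce absurd bitrates.
    factor = min(max(complexity, 0.5), 2.0)
    return [Rendition(height, round(kbps * factor)) for height, kbps in BASE_LADDER]


if __name__ == "__main__":
    for label, score in [("cartoon", 0.6), ("action film", 1.5)]:
        ladder = per_title_ladder(score)
        print(label, [(r.height, r.bitrate_kbps) for r in ladder])
```

With these made-up numbers, an easy title gets a 1080p rendition at 3000 kbps instead of the baseline 5000 kbps, which is the bandwidth saving the step describes.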
Step 3 — Delivery: a custom CDN (Open Connect)
Delivery efficiency is the whole game at this bandwidth. Netflix runs Open Connect — its own CDN with appliances embedded inside ISPs (and IXPs):
- Popular content is pre-positioned onto these appliances during off-peak hours.
- A viewer streams from an appliance inside their ISP — minimal backbone traffic, lowest latency, huge cost savings.
- The control plane steers each client to the best appliance (health, fill, proximity).
This “push content to the edge ahead of demand” model is the key difference from a pull CDN, which fetches a file from origin only after the first viewer misses the cache.
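The off-peak pre-positioning step can be sketched as a planning function that decides which titles to push to one appliance overnight. Everything here is hypothetical: the `Title` fields, the popularity forecast, and the greedy "plays per GB" heuristic are assumptions for illustration, not a description of how Open Connect actually plans its fills.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Title:
    title_id: str
    size_gb: float
    predicted_plays: float  # assumed forecast of plays behind this appliance


def plan_fill(titles: list[Title], capacity_gb: float, cached: set[str]) -> list[str]:
    """Return the title ids to transfer during the off-peak window.

    Greedy by predicted plays per GB, so disk space goes to the titles
    expected to serve the most viewing per unit of storage.
    """
    ranked = sorted(titles, key=lambda t: t.predicted_plays / t.size_gb, reverse=True)
    wanted: list[str] = []
    used_gb = 0.0
    for title in ranked:
        if used_gb + title.size_gb <= capacity_gb:
            wanted.append(title.title_id)
            used_gb += title.size_gb
    # Titles already on disk do not need to be transferred again.
    return [title_id for title_id in wanted if title_id not in cached]


if __name__ == "__main__":
    catalog = [
        Title("a", 10, 1000),
        Title("b", 20, 1000),
        Title("c", 5, 100),
        Title("d", 10, 10),
    ]
    print(plan_fill(catalog, capacity_gb=30, cached={"a"}))
```

The point of the sketch is the timing: the plan is computed from a forecast and executed before viewers ask for the content, so peak-hour traffic stays inside the ISP.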
Step 4 — Adaptive streaming
Same ABR as YouTube: short segments at multiple bitrates; the player starts low for instant playback and adapts per segment to measured bandwidth, which keeps rebuffering rare. Downloads store encrypted segments on the device for offline playback.
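A minimal sketch of the per-segment decision, assuming a simple throughput-based rule with a buffer guard. The `safety` factor and the `low_buffer_s` threshold are illustrative values, and real players use more elaborate estimators than this.

```python
def pick_bitrate(
    ladder_kbps: list[int],
    throughput_kbps: float,
    buffer_s: float,
    safety: float = 0.8,
    low_buffer_s: float = 5.0,
) -> int:
    """Choose the bitrate (kbps) of the next segment to request."""
    if not ladder_kbps:
        raise ValueError("ladder must not be empty")
    ladder = sorted(ladder_kbps)
    # With little video buffered, take the lowest rendition to refill quickly.
    if buffer_s < low_buffer_s:
        return ladder[0]
    # Spend only a fraction of measured throughput to absorb fluctuations.
    budget = throughput_kbps * safety
    affordable = [kbps for kbps in ladder if kbps <= budget]
    return affordable[-1] if affordable else ladder[0]


if __name__ == "__main__":
    ladder = [300, 1000, 2500, 5000]
    print(pick_bitrate(ladder, throughput_kbps=4000, buffer_s=20))
    print(pick_bitrate(ladder, throughput_kbps=4000, buffer_s=2))
```

Starting playback with an empty buffer naturally selects the lowest rendition, which matches the "start low, then adapt" behaviour described above.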
Step 5 — Recommendations (a core product)
Netflix is famously recommendation-driven — most viewing comes from recommendations, not search. A pipeline (candidate generation + ML ranking over viewing history, similarity, context) personalizes rows and ranking per profile; even artwork/thumbnails are personalized and A/B tested.
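The two-stage shape of such a pipeline can be sketched in a few lines. This is a toy under stated assumptions: the `similar` lookup stands in for whatever similarity model produces candidates, and `score` is a plain callable standing in for the ML ranker. Neither reflects Netflix's actual models.

```python
from collections import Counter
from typing import Callable


def generate_candidates(
    history: list[str], similar: dict[str, list[str]], limit: int = 50
) -> list[str]:
    """Stage 1 (cheap recall): gather unwatched titles similar to the profile's history."""
    seen = set(history)
    votes: Counter = Counter()
    for watched in history:
        for neighbour in similar.get(watched, []):
            if neighbour not in seen:
                votes[neighbour] += 1
    return [title for title, _ in votes.most_common(limit)]


def rank(candidates: list[str], score: Callable[[str], float]) -> list[str]:
    """Stage 2 (precise ordering): sort the small candidate set by model score."""
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    similar = {"a": ["c", "d"], "b": ["c", "a"]}
    candidates = generate_candidates(["a", "b"], similar)
    model_scores = {"c": 0.2, "d": 0.9}  # hypothetical per-profile scores
    print(rank(candidates, lambda title: model_scores.get(title, 0.0)))
```

Splitting the work this way keeps the expensive ranker off the full catalog: it only scores the handful of titles that candidate generation lets through.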
Step 6 — Architecture (control vs data plane)
client → API (AWS): browse, search, recommendations, playback authz, bookmarks
       → Open Connect (CDN in ISPs): the actual video segments
Netflix famously splits a control plane in the cloud (microservices: catalog, recommendations, billing, playback authorization) from the data plane on Open Connect (the bytes). The cloud decides what and where; the appliances deliver.
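One way to picture the split at playback start: the control plane returns a short ranked list of appliances, and the client then pulls segments from them directly, falling back down the list on failure. The `Appliance` fields, the ranking rule, and the injected `fetch` callable are all assumptions for this sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Appliance:
    host: str
    healthy: bool
    has_title: bool
    same_isp: bool
    load: float  # 0.0 idle .. 1.0 saturated


def steer(appliances: list[Appliance], max_hosts: int = 3) -> list[str]:
    """Control plane: pick ranked hosts for one playback session."""
    usable = [a for a in appliances if a.healthy and a.has_title and a.load < 1.0]
    # Prefer appliances inside the viewer's ISP, then the least loaded.
    usable.sort(key=lambda a: (not a.same_isp, a.load))
    return [a.host for a in usable[:max_hosts]]


def fetch_segment(hosts: list[str], path: str, fetch: Callable[[str, str], bytes]) -> bytes:
    """Data plane, client side: try each host in order until one serves the segment."""
    last_error = None
    for host in hosts:
        try:
            return fetch(host, path)
        except OSError as err:  # covers connection errors and timeouts
            last_error = err
    raise RuntimeError("no appliance could serve " + path) from last_error
```

The cloud call happens once per session and carries only metadata. Every segment request after that goes to an appliance, so the heavy traffic never touches the control plane.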
Step 7 — Resilience
Netflix pioneered chaos engineering (Chaos Monkey) — deliberately killing instances to prove the system degrades gracefully. Microservices with circuit breakers, fallbacks, and multi-region failover keep streaming up even when components fail.
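The circuit-breaker-with-fallback pattern mentioned above can be sketched as follows. This is a minimal single-threaded version with assumed thresholds; the class and parameter names are made up for the example and do not mirror any particular library.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for a while and serve a fallback instead."""

    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            # Open: skip the dependency entirely until the cool-down has passed.
            if self.clock() - self.opened_at < self.reset_after_s:
                return fallback()
            # Half-open: allow one trial call; a single failure re-opens the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

A typical use would wrap a call to a personalized-rows service with a fallback that returns a generic popular list, so the home page still renders when recommendations are down.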
Trade-offs to raise
- Pre-encode optimally (more storage, no real-time compute, best quality per bit) vs on-the-fly encoding (YouTube’s path for unknown uploads). Curated catalog → pre-encode.
- Custom embedded CDN (huge efficiency, operational complexity) vs third-party CDN.
- Recommendation-driven (engagement) vs search-driven discovery.
The interview cue
“Finite catalog → pre-encode per-title-optimally into ABR segments; deliver via a custom CDN embedded in ISPs (Open Connect) with content pushed to edges off-peak; a cloud control plane (microservices) handles browse/recommendations/playback authz while appliances serve bytes; ML recommendations drive discovery; chaos-tested resilience.” Pre-encoding + edge-embedded CDN + control/data split is the distinguishing answer; implementation next.