Designing Shopify (multi-tenant storefronts)
A platform hosting millions of independent stores — multi-tenancy models, tenant isolation, customization, and noisy-neighbor protection.
The problem
Design Shopify: a platform where millions of merchants each run their own store (catalog, checkout, admin) on shared infrastructure. The e-commerce mechanics are the previous lesson; the distinctive challenge here is multi-tenancy — serving many isolated tenants efficiently, safely, and customizably.
Step 1 — Requirements
Functional: merchants create stores (products, themes, domains); buyers shop each store; per-store checkout/orders/inventory; merchant admin/analytics; apps/extensions.
Non-functional: tenant isolation (one store can’t see/break another), scalability (millions of stores, very uneven sizes), customizability per store, availability (a flash sale on one store mustn’t take down others — noisy-neighbor), cost efficiency.
Step 2 — The multi-tenancy model (the core decision)
How do tenants share data infrastructure? A spectrum:
- Shared database, shared schema: all tenants in the same tables with a tenant_id column; every query filters by tenant. Cheapest/densest, but weakest isolation (a missing filter leaks data, a real risk) and noisy-neighbor contention at the DB.
- Shared database, separate schema: a schema per tenant in one DB. Better isolation, more overhead.
- Separate database per tenant: strongest isolation and per-tenant scaling/backup, but expensive and operationally heavy at millions of tenants.
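The missing-filter risk in the shared-schema option can be reduced by keeping the tenant filter in one place. The following is a minimal sketch, assuming an in-memory SQLite table and a hypothetical TenantScopedDB wrapper; neither the table layout nor the class name comes from the lesson.

```python
import sqlite3


class TenantScopedDB:
    """Hypothetical wrapper: every statement is bound to one tenant_id."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: int):
        self._conn = conn
        self._tenant_id = tenant_id

    def add_product(self, name: str) -> None:
        # The tenant_id is supplied by the wrapper, never by the caller.
        self._conn.execute(
            "INSERT INTO products (tenant_id, name) VALUES (?, ?)",
            (self._tenant_id, name),
        )

    def products(self) -> list[tuple[int, str]]:
        # The tenant filter lives here once, not in every call site.
        cur = self._conn.execute(
            "SELECT id, name FROM products WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return cur.fetchall()


# Illustrative schema: one shared table with a tenant_id column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, name TEXT NOT NULL)"
)

store_a = TenantScopedDB(conn, tenant_id=1)
store_b = TenantScopedDB(conn, tenant_id=2)
store_a.add_product("mug")
store_b.add_product("poster")

print(store_a.products())  # only tenant 1's rows
print(store_b.products())  # only tenant 2's rows
```

Both stores share one table, yet each handle can only read and write its own rows because callers never write the WHERE clause themselves.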
Real platforms use a hybrid / pod model: group tenants into shards (pods), each a self-contained stack (DB + services) hosting a subset of stores. A huge merchant can get a dedicated pod; small ones share. This balances density, isolation, and noisy-neighbor protection.
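The placement rule described here (small stores share, a huge merchant gets its own pod) could look roughly like the sketch below. The Pod and PodPlacer names, the capacity unit, and the load threshold are all assumptions made for illustration; a real placer would weigh far more signals.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    capacity: int            # max tenants on this pod (hypothetical unit)
    dedicated: bool = False
    tenants: list[str] = field(default_factory=list)


class PodPlacer:
    """Hypothetical placer: big tenants get a dedicated pod, others share."""

    def __init__(self, shared_pods: list[Pod], big_threshold: int):
        self.pods = list(shared_pods)
        self.big_threshold = big_threshold

    def place(self, tenant: str, expected_load: int) -> Pod:
        if expected_load >= self.big_threshold:
            # Heavy tenant: provision a pod that hosts only this tenant.
            pod = Pod(name=f"dedicated-{tenant}", capacity=1, dedicated=True)
            self.pods.append(pod)
        else:
            # Light tenant: pick the least-populated shared pod with room.
            candidates = [
                p for p in self.pods
                if not p.dedicated and len(p.tenants) < p.capacity
            ]
            if not candidates:
                raise RuntimeError("no shared pod has spare capacity")
            pod = min(candidates, key=lambda p: len(p.tenants))
        pod.tenants.append(tenant)
        return pod


placer = PodPlacer([Pod("pod-1", capacity=2), Pod("pod-2", capacity=2)],
                   big_threshold=1000)
small_1 = placer.place("tiny-shop", expected_load=5)
small_2 = placer.place("craft-store", expected_load=8)
big = placer.place("mega-brand", expected_load=50_000)
print(small_1.name, small_2.name, big.name)
```

Spreading small tenants across the least-populated pod is only one possible policy; packing pods densely first would trade blast radius for cost.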
Step 3 — Routing requests to the right tenant
A request (by custom domain or subdomain) must resolve to its tenant and pod:
buyer hits store.com / shop.myshopify.com
→ routing layer maps domain → tenant_id → pod
→ request handled by that pod's services/DB
A tenant directory (domain → tenant → pod) drives routing; cached for speed.
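As a rough illustration of the cached directory lookup, the sketch below resolves a hostname to a tenant and pod and keeps the answer in a small TTL cache. The TenantDirectory and Route names, the TTL value, and the tenant and pod identifiers are hypothetical; the in-memory dict stands in for whatever authoritative store a real platform would use.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    tenant_id: int
    pod: str


class TenantDirectory:
    """Hypothetical domain -> tenant -> pod directory with a TTL cache."""

    def __init__(self, records: dict[str, Route], ttl_seconds: float = 60.0):
        self._records = records          # stand-in for the authoritative store
        self._ttl = ttl_seconds
        self._cache: dict[str, tuple[float, Route]] = {}
        self.lookups = 0                 # counts trips to the backing store

    def resolve(self, host: str) -> Optional[Route]:
        host = host.lower().rstrip(".")  # hostnames are case-insensitive
        now = time.monotonic()
        cached = self._cache.get(host)
        if cached is not None and cached[0] > now:
            return cached[1]             # cache hit, still fresh
        self.lookups += 1
        route = self._records.get(host)
        if route is not None:
            self._cache[host] = (now + self._ttl, route)
        return route


# Custom domain and platform subdomain both point at the same store.
directory = TenantDirectory({
    "store.com": Route(tenant_id=42, pod="pod-7"),
    "shop.myshopify.com": Route(tenant_id=42, pod="pod-7"),
})
route = directory.resolve("Store.com")
print(route)
```

A cached entry can go stale when a tenant moves between pods, so the TTL (or an explicit invalidation on move) bounds how long requests may be sent to the old pod.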
Step 4 — Isolation and noisy neighbors
- Data isolation: enforce tenant_id scoping rigorously (a framework-level guard, not per-query discipline) so that a forgotten filter cannot leak data across tenants; or use separate schemas/DBs.
- Noisy-neighbor: one store’s flash sale shouldn’t starve others. Mitigate with per-tenant rate limits/quotas, pod-level isolation (blast radius = one pod), and moving heavy tenants to dedicated pods.
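Per-tenant rate limits are often built as a token bucket per tenant. The sketch below is one such approach, not necessarily what any real platform runs; the TenantRateLimiter name, the rate and burst values, and the injectable clock are assumptions chosen to keep the example small and testable.

```python
import time
from typing import Callable


class TenantRateLimiter:
    """Hypothetical per-tenant token bucket: each tenant has its own budget."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate_per_sec
        self._burst = burst
        self._clock = clock
        # tenant_id -> (tokens remaining, time of last update)
        self._buckets: dict[int, tuple[float, float]] = {}

    def allow(self, tenant_id: int) -> bool:
        now = self._clock()
        tokens, last = self._buckets.get(tenant_id, (float(self._burst), now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(float(self._burst), tokens + (now - last) * self._rate)
        if tokens >= 1.0:
            self._buckets[tenant_id] = (tokens - 1.0, now)
            return True
        self._buckets[tenant_id] = (tokens, now)
        return False


# A fake clock makes the behaviour deterministic for the demo.
fake_now = [0.0]
limiter = TenantRateLimiter(rate_per_sec=1.0, burst=2,
                            clock=lambda: fake_now[0])

flash_sale = [limiter.allow(1) for _ in range(3)]  # tenant 1 bursts
neighbour = limiter.allow(2)                       # tenant 2 is unaffected
fake_now[0] = 1.0                                  # one second later
recovered = limiter.allow(1)
print(flash_sale, neighbour, recovered)
```

Because each tenant draws from its own bucket, a store that exhausts its budget is throttled without reducing what other stores on the same pod may send.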
Step 5 — Customization
Each store is customizable (themes, domains, apps) without forking code: themes as data/templates rendered per request; an app/extension platform (webhooks + APIs) so third parties extend stores; per-tenant config. The code is shared; the data and templates differ.
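To make "the code is shared; the data and templates differ" concrete, here is a minimal sketch using Python's string.Template. The theme names, tenant config, and render_storefront function are invented for illustration; a production theme engine would use a sandboxed template language rather than simple substitution.

```python
from html import escape
from string import Template

# Themes are data: templates stored per theme, not code forked per store.
THEMES = {
    "minimal": Template("<h1>$store_name</h1><p>$tagline</p>"),
    "bold": Template(
        "<header class='bold'><h1>$store_name</h1></header><p>$tagline</p>"
    ),
}

# Hypothetical per-tenant config, keyed by tenant_id.
TENANT_CONFIG = {
    1: {"theme": "minimal", "store_name": "Mug & Co",
        "tagline": "Handmade mugs"},
    2: {"theme": "bold", "store_name": "Poster Hub",
        "tagline": "Prints for every wall"},
}


def render_storefront(tenant_id: int) -> str:
    """Same code path for every store; only config and template vary."""
    config = TENANT_CONFIG[tenant_id]
    template = THEMES[config["theme"]]
    # Escape merchant-supplied values before placing them into HTML.
    return template.substitute(
        store_name=escape(config["store_name"]),
        tagline=escape(config["tagline"]),
    )


print(render_storefront(1))
print(render_storefront(2))
```

Escaping merchant-supplied values is one small instance of the customizability versus security tension: the more a theme can express, the more the platform has to constrain what it can do.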
Step 6 — Architecture
buyer → edge/routing (domain → tenant → pod) → pod (storefront, checkout, admin, DB)
merchant admin → same routing → pod services
platform services (cross-tenant): tenant directory, billing, app store, analytics
Trade-offs to raise
- Shared schema (dense, cheap, weak isolation) vs DB-per-tenant (isolated, expensive) vs pods (balanced). Pods are the pragmatic answer at scale.
- Density/cost vs isolation/blast-radius.
- Customizability (themes/apps) vs platform stability/security.
The interview cue
“Group millions of stores into pods (each a self-contained stack hosting a tenant subset; big merchants get dedicated pods); a routing layer maps domain → tenant → pod; tenant isolation is enforced at the framework level (or by schema/DB separation); per-tenant quotas + pod isolation contain noisy neighbors; themes + an app platform give customization on shared code.” The multi-tenancy/pod model + isolation is the distinguishing answer; implementation next.