Hybrid-cloud vs all-cloud storage
Keep some data on-premises while using the cloud, or go fully cloud — a trade between control, compliance, and latency versus elasticity and lower operational burden.
The question
Where does your data physically live?
- All-cloud — everything in a cloud provider (S3, cloud databases). You rent storage and offload operations.
- Hybrid — a mix: some data on on-premises (or private) infrastructure you own, some in the public cloud, connected into one system.
All-cloud
- Wins: elasticity — effectively unlimited capacity on demand, pay for what you use; low operational burden — the provider handles hardware, durability, replication, backups; built-in global distribution and managed services; fast to start, no capital outlay.
- Costs: less control over exact location and hardware; egress fees and long-term storage costs that grow with data volume and can surprise at scale; vendor lock-in; compliance or data-residency rules that may forbid certain data from leaving a jurisdiction; reliance on the provider’s availability.
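To make the egress point concrete, here is a minimal cost sketch. The unit prices, the `CloudPricing` type, and the `monthly_cost` helper are hypothetical placeholders for illustration, not any provider's actual rates or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudPricing:
    """Hypothetical unit prices; real provider rates differ and change."""
    storage_per_gb_month: float  # USD per GB stored per month
    egress_per_gb: float         # USD per GB transferred out of the cloud


def monthly_cost(stored_gb: float, egress_gb: float, pricing: CloudPricing) -> float:
    """Return storage cost plus egress cost for one month, in USD."""
    if stored_gb < 0 or egress_gb < 0:
        raise ValueError("volumes must be non-negative")
    return stored_gb * pricing.storage_per_gb_month + egress_gb * pricing.egress_per_gb


# Placeholder rates, chosen only to show the shape of the bill.
pricing = CloudPricing(storage_per_gb_month=0.02, egress_per_gb=0.09)

# 100 TB stored: reading 10% of it out per month vs. reading all of it out.
light = monthly_cost(100_000, 10_000, pricing)
heavy = monthly_cost(100_000, 100_000, pricing)
```

With these placeholder rates, storage costs about $2,000 a month in both cases, while egress adds about $900 in the light case and about $9,000 in the heavy one. The read pattern, not the stored volume, dominates the bill.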
Hybrid
- Wins: control and compliance — keep regulated or sensitive data (financial, health, government) on infrastructure you own or in a specific jurisdiction; lower latency for systems that must sit near on-prem apps or factory/edge hardware; reuse existing data-center investment; avoid full lock-in; cloud-burst for spikes while keeping a steady baseline on-prem.
- Costs: complexity — two environments to integrate, secure, and keep consistent; networking between them (bandwidth, latency, VPN/direct-connect); you still operate the on-prem half; data synchronization and a coherent security model are genuinely hard.
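The cloud-burst idea from the Wins bullet can be sketched as a simple split: fill on-prem capacity first and send only the overflow to the cloud. The function name `split_load` and the unit of "demand" are assumptions for illustration. A real system would also have to handle the synchronization and networking costs listed above.

```python
def split_load(demand: int, onprem_capacity: int) -> tuple[int, int]:
    """Return (onprem_share, cloud_share) for the current demand.

    The steady baseline is served on-prem; anything above the on-prem
    capacity "bursts" to the cloud.
    """
    if demand < 0 or onprem_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    onprem_share = min(demand, onprem_capacity)
    cloud_share = demand - onprem_share
    return onprem_share, cloud_share


# Normal day: everything fits on-prem, nothing is sent to the cloud.
normal = split_load(demand=80, onprem_capacity=100)

# Spike: on-prem is saturated and the remainder goes to the cloud.
spike = split_load(demand=250, onprem_capacity=100)
```

Here `normal` keeps all 80 units on-prem, while `spike` serves 100 units on-prem and bursts the remaining 150 to the cloud.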
The side-by-side
| | All-cloud | Hybrid |
|---|---|---|
| Elasticity | Highest | Cloud half only |
| Ops burden | Lowest | You run the on-prem half |
| Control / residency | Limited to the provider's regions and controls | Strong (keep data where you must) |
| Latency to on-prem apps | Higher | Low (data stays local) |
| Complexity | Lower | Higher (two worlds) |
| Lock-in | Higher | Lower |
What usually drives the choice
The deciding factors are usually compliance, latency, and existing investment rather than the technical merits of either option:
- Strict data-residency/regulatory rules, or sensitive data → hybrid (keep that subset on-prem/in-region).
- A big existing data center, or low-latency ties to on-prem systems → hybrid.
- A greenfield, globally-distributed, spiky workload with no residency constraints → all-cloud for elasticity and simplicity.
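The three rules above can be written as a small decision helper. This is a sketch under simplifying assumptions: the `Workload` fields and the `recommend` function are hypothetical names, and real decisions weigh cost and team skills as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical summary of the constraints that drive the choice."""
    residency_or_sensitive_data: bool  # strict regulatory or residency rules
    large_existing_datacenter: bool    # significant on-prem investment to reuse
    low_latency_to_onprem: bool        # must sit near on-prem or edge systems


def recommend(w: Workload) -> tuple[str, list[str]]:
    """Return ("hybrid" | "all-cloud", reasons) following the rules above."""
    reasons = []
    if w.residency_or_sensitive_data:
        reasons.append("keep the regulated subset on-prem or in-region")
    if w.large_existing_datacenter:
        reasons.append("reuse the existing data-center investment")
    if w.low_latency_to_onprem:
        reasons.append("keep data close to on-prem systems")
    if reasons:
        return "hybrid", reasons
    # No constraint applies: take elasticity and simplicity.
    return "all-cloud", ["no residency, latency, or investment constraint"]


greenfield = recommend(Workload(False, False, False))
hospital = recommend(Workload(True, False, True))
```

Returning the reasons alongside the choice mirrors how the decision is usually justified: the constraint comes first, the architecture follows from it.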
The interview cue
This comes up in enterprise, fintech, healthcare, and government designs. “Patient records stay on-prem for HIPAA/residency and low-latency access from hospital systems; the analytics and ML pipeline runs all-cloud for elastic compute, reading a de-identified copy. The two are bridged over a private link with a clear data-classification and sync policy.” Driving the split by compliance and latency, and acknowledging the integration complexity, is the mature answer.
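A data-classification policy like the one in that answer might look roughly like this. The classification levels, the `IDENTIFYING_FIELDS` set, and both function names are hypothetical. Dropping a few fields is only a stand-in for de-identification, which in practice requires a process that meets the applicable regulation.

```python
from enum import Enum


class Classification(Enum):
    """Hypothetical data-classification levels."""
    REGULATED = "regulated"  # e.g. patient records
    INTERNAL = "internal"
    PUBLIC = "public"


# Assumed list of directly identifying fields; a real policy would be
# defined by compliance requirements, not hard-coded like this.
IDENTIFYING_FIELDS = frozenset({"name", "ssn", "address"})


def placement(classification: Classification) -> str:
    """Regulated data stays on-prem; everything else may live in the cloud."""
    return "on-prem" if classification is Classification.REGULATED else "cloud"


def deidentify(record: dict) -> dict:
    """Return a copy without identifying fields, for the cloud analytics side."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}


record = {"name": "Jane Doe", "ssn": "000-00-0000", "diagnosis_code": "J45"}
home = placement(Classification.REGULATED)  # where the full record lives
analytics_copy = deidentify(record)         # what the cloud pipeline reads
```

The full record never leaves the on-prem side. Only `analytics_copy`, which here retains just `diagnosis_code`, would cross the private link.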