System design course
Ch.3 · Trade-offs that define a design · concept · 6 min read

Hybrid-cloud vs all-cloud storage

Keep some data on-premises while using the cloud, or move everything to the cloud: a trade-off between control, compliance, and latency on one side and elasticity and lower operational burden on the other.


The question

Where does your data physically live?

  • All-cloud: everything lives with a cloud provider (object storage such as S3, managed cloud databases). You rent storage and offload operations.
  • Hybrid: a mix. Some data sits on on-premises (or private) infrastructure you own, some in the public cloud, and the two are connected into one system.

All-cloud

  • Wins: elasticity — effectively unlimited capacity on demand, pay for what you use; low operational burden — the provider handles hardware, durability, replication, backups; built-in global distribution and managed services; fast to start, no capital outlay.
  • Costs: less control over exact location and hardware; egress fees and long-term cost can surprise at scale; vendor lock-in; compliance/data residency constraints may forbid certain data leaving a jurisdiction; reliance on the provider’s availability.
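The point about egress fees is easiest to see with arithmetic. The sketch below is a minimal illustration, not a pricing model: the function name `monthly_cloud_cost` and every price and volume in it are hypothetical, chosen only to show how a read-heavy workload can make egress the dominant line item.

```python
def monthly_cloud_cost(stored_gb, egress_gb, storage_price_per_gb, egress_price_per_gb):
    """Split a month's bill into storage and egress components.

    All prices are per GB per month and are supplied by the caller;
    nothing here reflects any real provider's price list.
    """
    storage_cost = stored_gb * storage_price_per_gb
    egress_cost = egress_gb * egress_price_per_gb
    return {
        "storage": storage_cost,
        "egress": egress_cost,
        "total": storage_cost + egress_cost,
    }


# Hypothetical workload: 10 TB stored, but 50 TB read out of the cloud each month.
example = monthly_cloud_cost(
    stored_gb=10_000,
    egress_gb=50_000,
    storage_price_per_gb=0.02,  # assumed price, for illustration only
    egress_price_per_gb=0.08,   # assumed price, for illustration only
)
print(example)
```

With these assumed numbers the egress component is many times the storage component, which is the kind of surprise the bullet above warns about. Real bills depend on the provider, tier, and traffic pattern.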

Hybrid

  • Wins: control and compliance — keep regulated or sensitive data (financial, health, government) on infrastructure you own or in a specific jurisdiction; lower latency for systems that must sit near on-prem apps or factory/edge hardware; reuse existing data-center investment; avoid full lock-in; cloud-burst for spikes while keeping a steady baseline on-prem.
  • Costs: complexity — two environments to integrate, secure, and keep consistent; networking between them (bandwidth, latency, VPN/direct-connect); you still operate the on-prem half; data synchronization and a coherent security model are genuinely hard.
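The cloud-burst idea mentioned under Wins can be sketched in a few lines: fill the on-prem baseline first and send only the overflow to the cloud. This is a simplified illustration under stated assumptions. The function name `split_load` and the notion of a single "capacity unit" are hypothetical, and a real system would also have to handle the synchronization and networking costs listed above.

```python
def split_load(requested_units: int, on_prem_capacity: int) -> dict:
    """Place demand on-prem up to capacity, and burst the remainder to the cloud.

    Units are abstract (for example TB of storage or writes per second);
    the caller decides what one unit means.
    """
    if requested_units < 0 or on_prem_capacity < 0:
        raise ValueError("units and capacity must be non-negative")
    on_prem = min(requested_units, on_prem_capacity)
    return {"on_prem": on_prem, "cloud": requested_units - on_prem}


# Steady baseline: everything fits on-prem, nothing bursts.
print(split_load(requested_units=80, on_prem_capacity=100))
# Spike: on-prem is saturated and the overflow goes to the cloud.
print(split_load(requested_units=150, on_prem_capacity=100))
```

The design choice here is that on-prem capacity is treated as already paid for, so it is always filled first. The cloud half absorbs only what exceeds it.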

The side-by-side

Dimension               | All-cloud      | Hybrid
------------------------|----------------|----------------------------------
Elasticity              | Highest        | Cloud half only
Ops burden              | Lowest         | You run the on-prem half
Control / residency     | Provider-bound | Strong (keep data where you must)
Latency to on-prem apps | Higher         | Low (data stays local)
Complexity              | Lower          | Higher (two worlds)
Lock-in                 | Higher         | Lower

What usually drives the choice

The choice is most often driven by compliance, latency, and existing investment rather than by raw technical merit:

  • Strict data-residency/regulatory rules, or sensitive data → hybrid (keep that subset on-prem/in-region).
  • A big existing data center, or low-latency ties to on-prem systems → hybrid.
  • A greenfield, globally-distributed, spiky workload with no residency constraints → all-cloud for elasticity and simplicity.
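The three rules above can be written as a small decision helper. This is a sketch of the heuristic only, not a complete decision procedure: the `Workload` fields and the function `suggest_storage_model` are hypothetical names introduced for illustration, and a real decision would also weigh cost, team skills, and provider capabilities.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Each flag mirrors one driver from the list above.
    has_residency_rules: bool = False
    has_sensitive_data: bool = False
    has_existing_data_center: bool = False
    needs_low_latency_to_on_prem: bool = False


def suggest_storage_model(w: Workload) -> tuple[str, list[str]]:
    """Return a suggested model plus the reasons that triggered it."""
    reasons = []
    if w.has_residency_rules or w.has_sensitive_data:
        reasons.append("compliance: keep the regulated subset on-prem or in-region")
    if w.has_existing_data_center:
        reasons.append("existing investment: reuse the data center")
    if w.needs_low_latency_to_on_prem:
        reasons.append("latency: keep data near on-prem systems")
    if reasons:
        return "hybrid", reasons
    # No constraint pulls data on-prem, so elasticity and simplicity win.
    return "all-cloud", ["no residency, latency, or investment constraints"]


model, why = suggest_storage_model(Workload(has_residency_rules=True))
print(model, why)
```

Returning the reasons alongside the answer matters in practice: in a design review or interview, the justification carries more weight than the label.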

The interview cue

This comes up in enterprise, fintech, healthcare, and government designs. A strong answer sounds like: “Patient records stay on-prem to meet HIPAA and data-residency requirements and to give hospital systems low-latency access; the analytics and ML pipeline runs in the cloud for elastic compute, reading a de-identified copy. The two are bridged over a private link with a clear data-classification and sync policy.” Driving the split by compliance and latency, and acknowledging the integration complexity, is the mature answer.
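The "de-identified copy" and "sync policy" in that answer can be made concrete with a small sketch. It assumes an allowlist approach, where only fields explicitly marked safe for analytics leave the on-prem store. The field names, `ANALYTICS_FIELDS`, and both functions are hypothetical, and real de-identification under a regulation such as HIPAA involves far more than dropping columns.

```python
# Hypothetical allowlist: only these fields may be copied to the cloud.
ANALYTICS_FIELDS = {"age_band", "diagnosis_code", "visit_month"}


def de_identify(record: dict) -> dict:
    """Keep only allowlisted fields; anything unlisted stays on-prem."""
    return {k: v for k, v in record.items() if k in ANALYTICS_FIELDS}


def plan_sync(records: list[dict]) -> dict:
    """Full records stay on-prem; the cloud receives only de-identified copies."""
    return {
        "on_prem": list(records),
        "cloud": [de_identify(r) for r in records],
    }


record = {
    "name": "Jane Example",  # made-up sample data
    "patient_id": "p-001",
    "age_band": "40-49",
    "diagnosis_code": "E11",
}
plan = plan_sync([record])
print(plan["cloud"])
```

An allowlist fails closed: a field added to the on-prem schema later is not synced until someone classifies it, which is usually the safer default for regulated data.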