System design course
Ch. 3 · Trade-offs that define a design · concept · 7 min read

Serverless vs traditional servers

Run code in ephemeral, auto-scaling functions you don't manage, or own long-running servers — trading operational burden for control, cost shape, and cold starts.


Two operational models

  • Traditional servers: you provision long-running machines (VMs or containers) that you scale, patch, and keep alive. You own the infrastructure.
  • Serverless (FaaS): you upload functions; the platform runs them on demand, scales them automatically (including down to zero), and bills per invocation and execution time. You own only the code. Examples: AWS Lambda, Google Cloud Functions.

Note: “serverless” code still runs on servers; you just don’t manage them.
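
To make the two models concrete, here is a minimal sketch of the same logic packaged both ways. The `handler(event, context)` signature mirrors the style of AWS Lambda's Python handlers; the event shape, the `greet` helper, and the port are assumptions made up for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def greet(name: str) -> str:
    """Business logic shared by both deployment models (hypothetical)."""
    return f"hello, {name}"


# Serverless style: the platform invokes this once per event.
# The event shape ({"name": ...}) is an assumption for this sketch.
def handler(event: dict, context: object = None) -> dict:
    name = event.get("name", "world")
    return {"statusCode": 200, "body": json.dumps({"message": greet(name)})}


# Traditional style: a process you start yourself and keep alive.
class GreetHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        body = json.dumps({"message": greet("world")}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    # Blocks until the process is stopped; scaling, restarts, and
    # patching of the machine underneath are your responsibility.
    HTTPServer(("0.0.0.0", port), GreetHandler).serve_forever()
```

In the first form there is no process to start: you hand `handler` to the platform. In the second, calling `serve()` starts a process that you must keep running.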

Traditional servers

  • Wins: full control (runtime, OS, long-lived connections, special hardware); predictable performance (no cold starts, warm caches in memory); cost-effective at steady high load (a busy reserved machine beats per-request pricing); no execution-time limits; natural fit for stateful services (databases, WebSocket servers).
  • Costs: you operate it — provisioning, scaling, patching, monitoring, failover; you pay for idle capacity; scaling to spikes needs planning (autoscaling groups, warm pools).

Serverless

  • Wins: no servers to manage; scales automatically from zero to thousands of concurrent executions; pay only for actual use (great for spiky or low/uneven traffic); fast to ship; naturally event-driven (triggered by queues, uploads, HTTP).
  • Costs: cold starts (first call after idle pays init latency — bad for latency-critical paths); execution limits (time, memory, payload); statelessness (no in-memory state between calls — push state to a DB/cache, and connection pooling to databases gets awkward); vendor lock-in; and it can be more expensive at sustained high volume than a reserved server.
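
Cold starts and statelessness can be sketched with a common pattern: do expensive setup once per function instance, outside the per-request path, and treat anything kept in memory as a cache that may vanish. This is a simulation under stated assumptions; `_expensive_init` and its 50 ms delay are hypothetical stand-ins for loading config or opening a database connection.

```python
import time

# Module-level state survives only while the platform keeps this
# instance warm; a fresh instance starts again from None.
_client = None
_init_count = 0


def _expensive_init() -> dict:
    """Hypothetical stand-in for loading config or opening a DB connection."""
    global _init_count
    _init_count += 1
    time.sleep(0.05)  # simulated initialization latency
    return {"ready": True}


def handler(event: dict, context: object = None) -> dict:
    global _client
    cold = _client is None
    if cold:
        # Paid once per instance: this is the cold-start penalty.
        _client = _expensive_init()
    # Per-user or per-request state should not be kept here;
    # read and write it through an external DB or cache instead.
    return {"cold_start": cold, "inits": _init_count}
```

The first call on a fresh instance pays the init cost; later calls on the same warm instance reuse `_client`. Nothing guarantees the next request lands on the same instance, which is why durable state belongs in an external store.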

The side-by-side

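|             | Traditional servers                | Serverless                        |
|-------------|------------------------------------|-----------------------------------|
| You manage  | OS, scaling, capacity              | Just the code                     |
| Scaling     | Manual / autoscaling               | Automatic, to zero                |
| Billing     | For provisioned time               | Per invocation/duration           |
| Cold starts | None (warm)                        | Yes                               |
| State       | Can be stateful                    | Stateless                         |
| Best at     | Steady load, stateful, low latency | Spiky/event-driven, variable load |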

The cost crossover

The deciding factor is often load shape. Serverless wins for low, spiky, or unpredictable traffic (you pay nothing for compute while idle). Traditional servers win for steady, high traffic (a constantly busy reserved instance is cheaper than millions of per-request charges). There’s a crossover point, and naming it in an interview (“below ~X steady QPS, serverless is cheaper and simpler; above it, reserved servers win”) is a strong signal.
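
The crossover can be estimated with back-of-the-envelope arithmetic. The sketch below assumes a simple pricing model (a per-request fee plus a per-GB-second compute fee) and a single reserved server that can absorb the load. Every price and parameter is an illustrative placeholder, not a quote from any provider.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000


def cost_per_request(price_per_million: float, price_per_gb_second: float,
                     avg_duration_s: float, memory_gb: float) -> float:
    """Per-request serverless cost: request fee plus compute fee."""
    return price_per_million / 1e6 + price_per_gb_second * avg_duration_s * memory_gb


def serverless_monthly_cost(rps: float, per_request: float) -> float:
    """Monthly bill at a steady request rate (requests per second)."""
    return rps * SECONDS_PER_MONTH * per_request


def breakeven_rps(server_monthly_cost: float, per_request: float) -> float:
    """Steady request rate at which serverless costs the same as the server."""
    return server_monthly_cost / (per_request * SECONDS_PER_MONTH)


# Placeholder inputs, chosen only to make the arithmetic easy to follow.
per_req = cost_per_request(
    price_per_million=0.20,       # hypothetical fee per million requests
    price_per_gb_second=0.00002,  # hypothetical compute fee
    avg_duration_s=0.1,
    memory_gb=0.5,
)
crossover = breakeven_rps(server_monthly_cost=60.0, per_request=per_req)
```

With these placeholder numbers the crossover lands at roughly 19 steady requests per second: below that, the per-request bill is smaller than the $60 server; above it, the server is cheaper. Real estimates should also account for server capacity limits, redundancy, and operational time, which this sketch ignores.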

The pragmatic hybrid

Most real systems mix them: long-running services for the core, stateful, latency-sensitive path (APIs, databases, real-time connections), and serverless for bursty, event-driven, or glue work (image thumbnailing on upload, scheduled jobs, webhook handlers, infrequent endpoints).
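
The split described above can be written down as a rough rule of thumb. The `Workload` fields and the `choose_runtime` helper are hypothetical names for this sketch; a real decision would weigh cost and team constraints as well.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    stateful: bool            # holds connections or in-memory state
    latency_sensitive: bool   # cold starts would hurt users
    steady_high_load: bool    # busy most of the time


def choose_runtime(w: Workload) -> str:
    """Heuristic from the text: stateful, latency-sensitive, or steadily
    busy work goes to long-running services; the rest goes to functions."""
    if w.stateful or w.latency_sensitive or w.steady_high_load:
        return "long-running service"
    return "serverless function"


workloads = [
    Workload("websocket gateway", stateful=True, latency_sensitive=True, steady_high_load=True),
    Workload("thumbnail on upload", stateful=False, latency_sensitive=False, steady_high_load=False),
    Workload("nightly report", stateful=False, latency_sensitive=False, steady_high_load=False),
]
placement = {w.name: choose_runtime(w) for w in workloads}
```

Here `placement` maps the WebSocket gateway to a long-running service and the thumbnail and report jobs to serverless functions, matching the hybrid described above.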

The interview cue

“The core API and the WebSocket layer run on autoscaled containers: steady load, stateful connections, no cold starts. Thumbnail generation and the nightly report run on serverless functions triggered by an upload event and a cron schedule: spiky work where paying per invocation and scaling automatically beats keeping boxes warm.” Choosing by load shape and statefulness, and calling out cold starts on latency-sensitive paths, is what the interviewer is listening for.