Stateful vs stateless architecture

Whether a service remembers anything between requests decides how easily it scales, fails over, and load-balances, and that is usually the argument for pushing state down and out.

The distinction

Stateless — the service keeps no per-client memory between requests. Every request carries everything needed to handle it; any instance can serve any request.
Stateful — the service remembers something across requests (a session, a connection, in-memory data), so a given client’s requests must reach the instance that holds its state.

The state still exists in a stateless architecture — it just lives in a shared store (database, cache) instead of inside the service instance.
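To make the distinction concrete, here is a minimal sketch of one client's requests being spread over two instances in rotation. The class names, the visit counter, and the plain dict standing in for the shared store are all illustrative assumptions, not part of any real framework.

```python
from itertools import cycle


class StatefulInstance:
    """Keeps per-client counters in its own memory."""

    def __init__(self) -> None:
        self.visits: dict[str, int] = {}

    def handle(self, client_id: str) -> int:
        self.visits[client_id] = self.visits.get(client_id, 0) + 1
        return self.visits[client_id]


class StatelessInstance:
    """Holds no per-client data; reads and writes a shared store instead."""

    def __init__(self, store: dict[str, int]) -> None:
        self.store = store  # stand-in for a shared database or cache

    def handle(self, client_id: str) -> int:
        self.store[client_id] = self.store.get(client_id, 0) + 1
        return self.store[client_id]


def round_robin(instances, client_id: str, n_requests: int) -> list[int]:
    """Send n_requests from one client to the instances in rotation."""
    rotation = cycle(instances)
    return [next(rotation).handle(client_id) for _ in range(n_requests)]


# Two stateful instances each see only half of the client's requests.
stateful_counts = round_robin([StatefulInstance(), StatefulInstance()], "alice", 4)

# Two stateless instances share one store, so the count stays coherent.
shared_store: dict[str, int] = {}
stateless_counts = round_robin(
    [StatelessInstance(shared_store), StatelessInstance(shared_store)], "alice", 4
)

print(stateful_counts)   # [1, 1, 2, 2]
print(stateless_counts)  # [1, 2, 3, 4]
```

The stateful pair gives the client an inconsistent answer unless every request is routed to the same instance; the stateless pair gives the same answer regardless of routing, because the count lives outside both instances.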

Why stateless is the default for application tiers

Trivial horizontal scaling — add/remove instances freely; the load balancer sends any request anywhere (plain round robin, no stickiness).
Fault tolerance — an instance dying loses no client state; requests in flight on it fail and can be retried, and any other instance serves the next one.
Simple deploys — rolling restarts and autoscaling “just work.”
Cost: every request must re-fetch or re-receive its context, and you need a fast shared store (a cache) to hold the state that used to be local.

This is why the standard pattern is: stateless app servers in front, state pushed down into databases and a shared cache. Put the user’s session in Redis, not in a server’s memory, and any server can handle them.
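A rough sketch of that pattern follows. To stay runnable without a Redis server, it uses a hypothetical in-memory `SessionStore` with a time-to-live; `AppServer`, `login`, and `whoami` are likewise made-up names. With a real Redis client such as redis-py, the analogous calls would be along the lines of `setex` and `get` on a serialized session, but that wiring is an assumption left out here.

```python
import time
import uuid
from typing import Optional


class SessionStore:
    """In-memory stand-in for a shared store such as Redis (hypothetical helper)."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[float, dict]] = {}

    def put(self, session_id: str, session: dict, ttl_seconds: float) -> None:
        # Store a copy together with its expiry time.
        self._data[session_id] = (time.monotonic() + ttl_seconds, dict(session))

    def get(self, session_id: str) -> Optional[dict]:
        entry = self._data.get(session_id)
        if entry is None:
            return None
        expires_at, session = entry
        if time.monotonic() >= expires_at:
            del self._data[session_id]  # expired: behave as if it never existed
            return None
        return dict(session)


class AppServer:
    """A stateless app server: it holds a store handle, never a session."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def login(self, user: str) -> str:
        session_id = uuid.uuid4().hex
        self.store.put(session_id, {"user": user}, ttl_seconds=1800)
        return session_id

    def whoami(self, session_id: str) -> str:
        session = self.store.get(session_id)
        if session is None:
            return f"{self.name}: not logged in"
        return f"{self.name}: hello {session['user']}"


store = SessionStore()
app_1 = AppServer("app-1", store)
app_2 = AppServer("app-2", store)

session_id = app_1.login("alice")
app_1 = None  # simulate app-1 dying after the login

reply = app_2.whoami(session_id)
print(reply)  # app-2: hello alice
```

The login happened on one server and the follow-up request landed on another, yet the user is still recognized, because neither server ever owned the session.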

Where statefulness is unavoidable

Some things are inherently stateful — you don’t eliminate it, you contain it:

Databases and caches — their whole job is to hold state. You make them scalable with replication, sharding, and consistent hashing (Chapter 2).
Real-time connection servers (WebSocket/long-poll) — a live connection is state bound to one node. Scale with a pub/sub backplane so any node can deliver to any connection, and use sticky routing to keep each client on the node that holds its connection.
Stateful stream processing (windowed aggregations) — keep local state but checkpoint it durably so it survives restarts.
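The pub/sub backplane idea for connection servers can be sketched as below. This is a single-process illustration under stated assumptions: `Backplane` stands in for a real pub/sub system, `GatewayNode` for a WebSocket server, and a plain list per user stands in for the live socket. All names and the `user:<name>` channel scheme are hypothetical.

```python
from collections import defaultdict
from typing import Callable


class Backplane:
    """In-process stand-in for a pub/sub system shared by all gateway nodes."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver to every subscriber; return how many received the message."""
        callbacks = list(self._subscribers.get(channel, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)


class GatewayNode:
    """Holds live connections locally and reaches other nodes via the backplane."""

    def __init__(self, name: str, backplane: Backplane) -> None:
        self.name = name
        self.backplane = backplane
        self.inboxes: dict[str, list[str]] = {}  # stand-in for open sockets

    def connect(self, user: str) -> None:
        # The connection is state bound to this node, so this node subscribes
        # to the user's channel on the user's behalf.
        inbox: list[str] = []
        self.inboxes[user] = inbox
        self.backplane.subscribe(f"user:{user}", inbox.append)

    def send(self, to_user: str, message: str) -> int:
        # The sender does not need to know which node holds the recipient.
        return self.backplane.publish(f"user:{to_user}", message)


bus = Backplane()
gw_1 = GatewayNode("gw-1", bus)
gw_2 = GatewayNode("gw-2", bus)

gw_1.connect("alice")                    # alice's socket lives on gw-1
delivered = gw_2.send("alice", "hello")  # the message enters via gw-2

print(delivered)             # 1
print(gw_1.inboxes["alice"])  # ['hello']
```

The connection state stays pinned to one node, which is the contained statefulness; the backplane is what lets the rest of the system ignore where that node is.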

The side-by-side

	Stateless	Stateful
Per-client memory	None (in shared store)	Held in the instance
Load balancing	Any instance (round robin)	Sticky / affinity routing
Scaling	Add/remove freely	Harder; rebalance state
Instance failure	No data lost	Lose/transfer that state
Typical home	App/API tier	DB, cache, real-time, stream

The sticky-session anti-pattern

Storing sessions in server memory forces sticky sessions (the LB must pin each user to one server). It works until that server dies (sessions gone) or you need to scale down (can’t drain cleanly). The fix is almost always: externalize the session to a shared store and make the tier stateless. Recognizing and removing this coupling is a classic senior signal.
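The failure mode can be reproduced in a few lines. This sketch assumes a deliberately simple affinity scheme, hashing the user id modulo the server count; real load balancers more often pin users with a cookie, so treat `StickyBalancer` and `Server` as hypothetical illustrations rather than how any particular product works.

```python
import zlib


class Server:
    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: dict[str, dict] = {}  # in-memory sessions: the anti-pattern


class StickyBalancer:
    """Pins each user to one server by hashing the user id (illustrative scheme)."""

    def __init__(self, servers: list[Server]) -> None:
        self.servers = list(servers)

    def route(self, user: str) -> Server:
        index = zlib.crc32(user.encode()) % len(self.servers)
        return self.servers[index]

    def remove(self, server: Server) -> None:
        # Covers both a crash and a scale-down: the server leaves the pool.
        self.servers.remove(server)


lb = StickyBalancer([Server("app-1"), Server("app-2"), Server("app-3")])
users = ["alice", "bob", "carol", "dave"]

# Every user logs in; the session lands in the memory of their pinned server.
for user in users:
    lb.route(user).sessions[user] = {"cart": []}

# The server holding alice's session goes away.
home = lb.route("alice")
lb.remove(home)

# Alice is now routed to a server that has never seen her session.
new_home = lb.route("alice")
orphaned = [u for u in users if u not in lb.route(u).sessions]

print(new_home is home)        # False
print("alice" in orphaned)     # True
```

With modulo hashing, removing one server can also re-pin users whose original server is still healthy, so `orphaned` may contain more than the departed server's users. Moving the sessions into a shared store makes the routing decision irrelevant.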

The interview cue

“I’ll keep the API tier stateless — sessions and per-user state go in Redis — so it scales horizontally behind a plain load balancer and tolerates instance death. The genuinely stateful parts (the database, and the WebSocket gateway) I isolate and scale deliberately: the DB via sharding/replication, the gateway via a pub/sub backplane.” Pushing state down/out and explicitly handling the bits that must be stateful is the design move.