System design course
Ch. 4 · Designing real systems · concept · 8 min read

Designing an authentication service

Build sign-up, login, and session management for millions of users — password storage, tokens vs server sessions, and how every other service verifies identity.


The problem

Design an authentication & session service: register users, log them in, and let every other service verify “who is this request from?” — securely, at scale, and with a good story for logout, expiry, and compromise. This underpins essentially every system; getting the token model and password storage right is the focus.

Step 1 — Requirements

Functional: sign up, log in/out, verify a request’s identity, refresh sessions, password reset, and ideally OAuth/social login and MFA.

Non-functional: security first (no plaintext passwords, resistance to breaches and token theft), low latency (verification is on every request), high availability (auth down = everything down), and horizontal scale.

Step 2 — Storing passwords (get this right)

Never store passwords reversibly. Hash with a slow, salted, adaptive function (bcrypt, scrypt, or Argon2), using a per-user salt (so identical passwords hash differently and rainbow tables fail) and a tuned work factor (so brute force is expensive). State this explicitly; it’s a top signal.

store: user_id, email, salt, password_hash = argon2(password, salt)
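
A minimal sketch of the storage rule above, assuming Python and its standard-library `hashlib.scrypt` (scrypt is one of the three functions named). The cost parameters and the helper names `hash_password` and `verify_password` are illustrative assumptions; a production service would more likely use a maintained Argon2 or bcrypt library and tune the work factor on its own hardware.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Illustrative scrypt cost parameters (assumptions; tune for your hardware).
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, password_hash) using a fresh random per-user salt."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(
        password.encode("utf-8"), salt=salt,
        n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(
        password.encode("utf-8"), salt=salt,
        n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return hmac.compare_digest(candidate, expected)
```

Because the salt is random per call, hashing the same password twice yields two different records, which is what defeats precomputed rainbow tables.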

Add login rate limiting and lockout/CAPTCHA to throttle credential stuffing.
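
One way to picture the throttle is a sliding window of failed attempts per key. The sketch below is an in-memory illustration only: the class name `LoginThrottle`, the limits, and the choice of key (an email, an IP address, or both) are assumptions, and a real deployment would keep this state in a shared store so every auth instance sees the same counts.

```python
import time
from collections import deque


class LoginThrottle:
    """Lock a key after too many failed logins inside a sliding window."""

    def __init__(self, max_failures=5, window_seconds=300, clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.clock = clock            # injectable so tests can control time
        self._failures = {}           # key -> deque of failure timestamps

    def _recent(self, key):
        """Drop failures older than the window and return what is left."""
        attempts = self._failures.get(key)
        if attempts is None:
            return deque()
        cutoff = self.clock() - self.window_seconds
        while attempts and attempts[0] <= cutoff:
            attempts.popleft()
        return attempts

    def is_locked(self, key) -> bool:
        return len(self._recent(key)) >= self.max_failures

    def record_failure(self, key) -> None:
        self._failures.setdefault(key, deque()).append(self.clock())

    def record_success(self, key) -> None:
        # A successful login clears the failure history for that key.
        self._failures.pop(key, None)
```

The login handler would call `is_locked` before checking the password, then `record_failure` or `record_success` depending on the result; a locked key could be answered with a CAPTCHA challenge instead of a hard rejection.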

Step 3 — Sessions: the central trade-off

After login, the client gets a credential to present on later requests. Two models:

  • Server-side sessions — store a session record server-side (in Redis), give the client an opaque session id cookie. Pro: easy to revoke (delete the record), small cookie. Con: a session store lookup on every request (stateful).
  • Stateless tokens (JWT) — a signed token containing claims (user id, expiry, roles). The server verifies the signature without a lookup. Pro: no per-request store hit, scales statelessly. Con: hard to revoke before expiry, and bigger.

The standard resolution: a short-lived access token (JWT, ~15 min) plus a long-lived refresh token (opaque, stored server-side so it can be revoked). You get stateless verification for most calls plus a revocation point.

Step 4 — The flows

register → hash password → store user
login    → verify hash → issue access token (JWT) + refresh token (httpOnly cookie)
request  → service verifies JWT signature locally (no DB hit)
expiry   → client uses refresh token → auth issues a new access token
logout   → revoke the refresh token (delete server-side); access token expires soon
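
The refresh and logout steps above hinge on a server-side record that can be deleted. A minimal in-memory sketch follows, assuming a hypothetical `RefreshTokenStore` class and a 30-day lifetime; storing only a hash of the token is an extra precaution added here, not something the flow itself requires, and a real system would back this with a database or Redis.

```python
import hashlib
import secrets
import time
from typing import Optional


class RefreshTokenStore:
    """In-memory stand-in for the server-side refresh-token table."""

    def __init__(self, ttl_seconds: int = 30 * 24 * 3600, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._records = {}  # sha256(token) -> (user_id, expires_at)

    @staticmethod
    def _key(token: str) -> str:
        # Keep only a hash so a leaked table does not expose usable tokens.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id: str) -> str:
        """login: create an opaque random token and remember it server-side."""
        token = secrets.token_urlsafe(32)
        self._records[self._key(token)] = (user_id, self.clock() + self.ttl_seconds)
        return token

    def redeem(self, token: str) -> Optional[str]:
        """expiry: return the user id if the token is known and unexpired."""
        record = self._records.get(self._key(token))
        if record is None or record[1] <= self.clock():
            return None
        return record[0]

    def revoke(self, token: str) -> None:
        """logout: delete the record so the token can no longer be redeemed."""
        self._records.pop(self._key(token), None)
```

On refresh, the auth service would call `redeem` and, if it returns a user id, mint a fresh access token for that user; after `revoke`, the already-issued access token still works until its own expiry passes.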

Step 5 — How other services verify

Services validate the JWT signature using the auth service’s public key (asymmetric signing) — so they verify locally without calling auth on every request. Often the API gateway authenticates once at the edge and forwards a trusted identity inward.
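
The gateway pattern can be sketched as a small function that verifies once and forwards an identity header. Everything named here is hypothetical: the header `X-Authenticated-User`, the function `authenticate_at_edge`, and the `verify_token` callable, which stands in for a local public-key signature and expiry check and returns the claims or `None`.

```python
from typing import Callable, Dict, Mapping, Optional

IDENTITY_HEADER = "X-Authenticated-User"  # hypothetical internal header name


def authenticate_at_edge(
    headers: Mapping[str, str],
    verify_token: Callable[[str], Optional[dict]],
) -> Optional[Dict[str, str]]:
    """Return the headers to forward inward, or None to reject the request."""
    # Drop any client-supplied identity header so it cannot be spoofed.
    forwarded = {
        name: value
        for name, value in headers.items()
        if name.lower() != IDENTITY_HEADER.lower()
    }

    # Header names are case-insensitive, so look up Authorization in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token:
        return None

    claims = verify_token(token)  # local check, no call to the auth service
    if claims is None or "sub" not in claims:
        return None

    forwarded[IDENTITY_HEADER] = str(claims["sub"])
    return forwarded
```

Stripping the inbound identity header before setting it matters: internal services trust that header, so the gateway has to be the only component able to write it.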

Step 6 — Beyond passwords

  • OAuth 2.0 / OIDC — “log in with Google”: you redirect to the provider, get an authorization code, exchange it for tokens — never handling their password.
  • MFA — a second factor (TOTP app, SMS code, passkey) after the password.
  • Password reset — email a single-use, expiring, signed token; never the password.
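
As an illustration of the TOTP factor mentioned above, the following sketch computes time-based codes in the style of RFC 6238 (HMAC-SHA1, 30-second steps) with the Python standard library. The function names and the one-step drift window are assumptions; a production system would also rate-limit attempts and reject reuse of an already accepted code.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute the time-based one-time code for the given Unix time."""
    at = time.time() if at is None else at
    counter = int(at // step)                       # number of elapsed time steps
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, at: Optional[float] = None,
                step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    at = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret, at + drift * step, step).encode(),
                            submitted.encode())
        for drift in range(-window, window + 1)
    )
```

With the RFC 6238 test secret `b"12345678901234567890"` and time 59, `totp(..., digits=8)` yields `94287082`, matching the published SHA-1 test vector. The drift window trades a little security for tolerance of clock skew between the phone and the server.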

Trade-offs to raise

  • JWT (stateless, scalable, hard to revoke) vs sessions (revocable, stateful) — the access+refresh split balances both.
  • Token lifetime — short access tokens limit theft damage but force more refreshes.
  • Security vs UX — MFA and short sessions add friction; tune to risk.

The interview cue

“Passwords hashed with Argon2/bcrypt + per-user salt; login issues a short JWT access token + revocable refresh token; services verify the JWT signature locally (public key) so auth isn’t on every hot path; logout revokes the refresh token; plus OAuth, MFA, and rate-limited login.” Password hashing + the access/refresh token model + local verification is the core; implementation next.