API gateway vs direct service exposure
Route every client call through one managed front door, or let clients hit services directly — centralization and control versus latency and simplicity.
The question
With many backend services, how do clients reach them? Two stances:
- API gateway — all client traffic flows through one managed entry point that routes to services.
- Direct exposure — clients call each service directly (each has its own public endpoint).
API gateway — centralize
One front door handles cross-cutting concerns once, so services stay focused.
- Wins: centralized auth, rate limiting, logging, and TLS termination; clients see one stable surface while you refactor services behind it; request aggregation (one client call fans out to several service calls and the results are combined); internal topology stays hidden.
- Costs: an extra network hop (latency); a potential bottleneck and single point of failure (must be scaled and made redundant); one more thing to operate; risk of a “god gateway” accumulating business logic.
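To make the "handle cross-cutting concerns once" idea concrete, here is a minimal in-process sketch rather than a production gateway. Everything in it is an assumption for illustration: the `Gateway` and `TokenBucket` classes, the API key `key-123`, the route prefixes, and the three stand-in backends, which a real gateway would reach over the network. The token bucket is just one of several possible rate-limiting schemes.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[dict], dict]  # hypothetical stand-in for a backend service call


class TokenBucket:
    """Allows bursts of `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float, now: float) -> None:
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Gateway:
    """One entry point: authenticate, rate-limit, log, then route."""

    def __init__(self, routes: Dict[str, Handler], api_keys: Dict[str, str],
                 capacity: float = 2, rate: float = 1.0) -> None:
        self.routes = routes        # path prefix -> backend handler
        self.api_keys = api_keys    # API key -> client id
        self.capacity = capacity
        self.rate = rate
        self.buckets: Dict[str, TokenBucket] = {}
        self.log: List[Tuple[str, int]] = []

    def handle(self, path: str, api_key: str, request: dict, now: float) -> Tuple[int, dict]:
        status, body = self._dispatch(path, api_key, request, now)
        self.log.append((path, status))  # centralized access log
        return status, body

    def _dispatch(self, path: str, api_key: str, request: dict, now: float) -> Tuple[int, dict]:
        client = self.api_keys.get(api_key)
        if client is None:
            return 401, {"error": "unauthorized"}
        if client not in self.buckets:
            self.buckets[client] = TokenBucket(self.capacity, self.rate, now)
        if not self.buckets[client].allow(now):
            return 429, {"error": "rate limit exceeded"}
        for prefix, handler in self.routes.items():
            if path.startswith(prefix):
                return 200, handler(request)
        return 404, {"error": "no route"}


# Hypothetical backends; a real gateway would forward over the network.
def users(request: dict) -> dict:
    return {"user": {"id": request.get("user_id"), "name": "Ada"}}


def orders(request: dict) -> dict:
    return {"orders": [{"id": 1, "user_id": request.get("user_id")}]}


def dashboard(request: dict) -> dict:
    # Request aggregation: one client call, two backend calls, one combined body.
    return {**users(request), **orders(request)}


gateway = Gateway(
    routes={"/users": users, "/orders": orders, "/dashboard": dashboard},
    api_keys={"key-123": "mobile-app"},
)
```

The backends contain no auth, quota, or logging code, which is the point of the pattern. The `dashboard` handler is about as much logic as a gateway should carry; anything closer to business rules is how a "god gateway" starts.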
Direct exposure — decentralize
Clients talk to services directly, with no intermediary.
- Wins: lower latency (no extra hop); fewer moving parts; no shared bottleneck; each service evolves and scales independently.
- Costs: every service must implement its own auth, rate limiting, TLS, and logging (duplication and drift); clients are coupled to internal structure and many endpoints; larger attack surface; cross-cutting policy is hard to enforce uniformly.
The trade-off in one line
Centralized control and a clean client surface (gateway) vs lower latency and simplicity (direct). As the number of services and clients grows, the duplication and coupling of direct exposure usually outweigh the saved hop, which is why many microservice systems put a gateway at the edge.
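The growth argument can be put in rough numbers. This is a deliberately crude model, not a measurement: it assumes every service needs every cross-cutting policy (auth, rate limiting, TLS, logging) and that a gateway implements each policy exactly once. The function name and the figures are illustrative.

```python
def policy_implementations(services: int, policies: int, gateway: bool) -> int:
    """Copies of cross-cutting policy code that must be written and kept in sync."""
    if services < 1 or policies < 0:
        raise ValueError("need at least one service and a non-negative policy count")
    # With a gateway, each policy lives in one place; without it, in every service.
    return policies if gateway else services * policies


# Assumed example: 12 services, 4 policies (auth, rate limiting, TLS, logging).
direct = policy_implementations(12, 4, gateway=False)  # 48 copies to keep consistent
edge = policy_implementations(12, 4, gateway=True)     # 4 copies, plus one extra hop
```

The direct-exposure count grows with every new service, while the gateway's cost (one extra hop per request) stays constant per call.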
Nuance: internal vs external
A common hybrid: a gateway for external/public clients (you need the auth, quotas, and stable surface), but direct service-to-service calls internally (often via a service mesh or service discovery), where the extra hop isn’t worth it and the services sit inside the same trust boundary.
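A small sketch of that split, assuming a hypothetical service registry and gateway hostname (`REGISTRY`, `GATEWAY_URL`, and the `Caller` type are invented for illustration and do not come from any particular mesh or discovery product):

```python
from dataclasses import dataclass

# Hypothetical addresses; in practice these come from service discovery or the mesh.
REGISTRY = {
    "orders": "http://orders.internal:8080",
    "users": "http://users.internal:8080",
}
GATEWAY_URL = "https://api.example.com"


@dataclass(frozen=True)
class Caller:
    name: str
    internal: bool  # True for service-to-service traffic inside the trust boundary


def resolve(caller: Caller, service: str, path: str) -> str:
    """Return the URL a caller should use to reach `service`."""
    if service not in REGISTRY:
        raise KeyError(f"unknown service: {service}")
    if caller.internal:
        # Internal callers skip the gateway hop and go straight to the service.
        return f"{REGISTRY[service]}{path}"
    # External callers only ever see the gateway's stable surface.
    return f"{GATEWAY_URL}/{service}{path}"


mobile = Caller("mobile-app", internal=False)
billing = Caller("billing-service", internal=True)
external_url = resolve(mobile, "orders", "/42")   # "https://api.example.com/orders/42"
internal_url = resolve(billing, "orders", "/42")  # "http://orders.internal:8080/42"
```

External clients never learn the internal hostnames, so services can move or split without breaking the public contract.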
The interview cue
“Public clients go through an API gateway for centralized auth, rate limiting, and a stable contract; internally, services call each other directly through the mesh to avoid the extra hop. The gateway is a single point of failure, so I’d run it redundantly behind a load balancer.” Showing that you’d centralize at the edge but not internally, and that you see the gateway’s bottleneck and failure risk, is the senior read.
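The redundancy claim can be backed with a quick estimate. The sketch below assumes gateway instances fail independently and that the gateway tier is up whenever at least one instance is up. It ignores the load balancer's own availability and correlated failures such as a bad deploy, so treat the result as an optimistic upper bound. The function name and the 99% figure are illustrative.

```python
def combined_availability(instance_availability: float, instances: int) -> float:
    """Probability that at least one of `instances` independent replicas is up."""
    if not 0.0 <= instance_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if instances < 1:
        raise ValueError("need at least one instance")
    # The tier is down only when every replica is down at the same time.
    return 1.0 - (1.0 - instance_availability) ** instances


single = combined_availability(0.99, 1)  # roughly 0.99
pair = combined_availability(0.99, 2)    # roughly 0.9999
```

Under these assumptions, a second instance takes a 99% gateway to about 99.99%, which is the quantitative version of "run it redundantly".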