System design course
Ch. 4 · Designing real systems · concept · 9 min read

Designing a payment system

Move money correctly — idempotency so nobody is double-charged, a double-entry ledger as the source of truth, PSP integration, and reconciliation.


The problem

Design a payment system (like Stripe’s core, or a marketplace’s payments): accept a payment, move money via banks/card networks, and record it correctly — never double-charging, never losing a transaction, always reconciling to the penny. Correctness is everything; this problem is about consistency, idempotency, and auditability, not raw scale.

Step 1 — Requirements

Functional: charge a customer (card/bank); track payment status; issue refunds; pay out to sellers; handle async results from banks; produce reports.

Non-functional: effectively exactly-once money movement (no double-charge, no lost payment; in practice, retries made safe by idempotency), strong consistency and durability (CP: a committed payment is never lost), auditability (every cent traceable), security/PCI compliance, and availability as high as those correctness constraints allow.

Step 2 — Idempotency (the most important property)

Networks retry, users double-click, services crash mid-request. Without protection, a retried “charge $100” could charge twice. Every payment operation carries a client-supplied idempotency key; the system records it and returns the original result on any retry:

charge(idempotency_key, amount, ...) →
  if key seen → return the SAME stored result (no second charge)
  else → process once, store (key → result)

This is the #1 thing to say. Idempotency keys + a stored result table make money operations safe to retry.
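
To make the key → result table concrete, here is a minimal in-memory sketch in Python. The names (IdempotentCharger, ChargeResult) are hypothetical, and the lock plus dict stand in for what would normally be a durable table with a unique constraint on the key. Rejecting a reused key with a different amount is an extra assumption, not something the pseudocode above requires.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class ChargeResult:
    charge_id: str
    amount_cents: int
    status: str


class IdempotentCharger:
    """In-memory stand-in for a durable idempotency table (hypothetical)."""

    def __init__(self) -> None:
        self._results: dict[str, ChargeResult] = {}  # idempotency_key -> result
        self._lock = threading.Lock()
        self._next_id = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> ChargeResult:
        # The lock makes "check, then store" atomic, so two concurrent retries
        # cannot both process the charge. A database would use a unique
        # constraint on the key for the same purpose.
        with self._lock:
            stored = self._results.get(idempotency_key)
            if stored is not None:
                # Assumption: the same key with a different amount is a client bug.
                if stored.amount_cents != amount_cents:
                    raise ValueError("idempotency key reused with different parameters")
                return stored  # retry: return the SAME stored result

            # First time this key is seen: process once and store the result.
            self._next_id += 1
            result = ChargeResult(f"ch_{self._next_id}", amount_cents, "succeeded")
            self._results[idempotency_key] = result
            return result
```

Calling charge("k1", 10000) twice returns the same ChargeResult both times, and only one charge id is ever allocated for that key.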

Step 3 — The double-entry ledger (source of truth)

Money is recorded in a double-entry ledger: every transaction consists of balanced debits and credits across accounts, so the books always sum to zero. This is the accounting-grade source of truth: immutable, append-only, auditable.

charge $100: DEBIT customer_funds 100, CREDIT merchant_payable 100   (sums to 0)
  • Append-only — you never mutate a balance; you append entries. Balance = sum of entries (or a maintained materialized balance).
  • Atomic — both sides of an entry commit together (a transaction), so the books can never be half-updated.
  • Auditable — every movement is a permanent record you can reconcile.
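
The three properties above can be sketched as a tiny in-memory ledger. This is an illustration under stated assumptions, not a production design: amounts are integer cents, debits are stored as positive and credits as negative (one common convention, chosen here for simplicity), and the Ledger and Entry names are hypothetical. Appending a whole transaction in one step stands in for a database transaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    account: str
    amount_cents: int  # assumed convention: positive = debit, negative = credit


class Ledger:
    def __init__(self) -> None:
        # Append-only journal of (txn_id, entries); nothing is ever updated in place.
        self._journal: list[tuple[str, tuple[Entry, ...]]] = []

    def post(self, txn_id: str, entries: list[Entry]) -> None:
        # Reject anything that does not balance, so the books always sum to zero.
        if sum(e.amount_cents for e in entries) != 0:
            raise ValueError(f"transaction {txn_id} is not balanced")
        # All entries of the transaction are appended together or not at all.
        self._journal.append((txn_id, tuple(entries)))

    def balance(self, account: str) -> int:
        # Balance = sum of entries for the account.
        return sum(
            e.amount_cents
            for _, entries in self._journal
            for e in entries
            if e.account == account
        )

    def total(self) -> int:
        # Sum over every entry in the books; must be zero at all times.
        return sum(e.amount_cents for _, entries in self._journal for e in entries)


ledger = Ledger()
ledger.post("txn_1", [
    Entry("customer_funds", 10_000),     # DEBIT customer_funds 100
    Entry("merchant_payable", -10_000),  # CREDIT merchant_payable 100
])
```

Summing entries on every read is fine for a sketch; a real system would more likely keep the maintained materialized balance mentioned above.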

Step 4 — Authorize vs capture; async results

Card payments are two-phase: authorize (hold funds) then capture (take them) — and results from banks/networks are often asynchronous (webhooks/callbacks). Model a payment as a state machine: created → authorized → captured → settled (or failed/refunded), advanced by events idempotently. Don’t assume synchronous success.
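
A minimal sketch of that state machine follows. The set of allowed transitions, the event names, and the Payment class are assumptions for illustration; a real provider defines its own event vocabulary. Deduplicating on an event id is what makes a redelivered webhook harmless.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    AUTHORIZED = "authorized"
    CAPTURED = "captured"
    SETTLED = "settled"
    FAILED = "failed"
    REFUNDED = "refunded"


# (current state, event type) -> next state. Assumed transitions only.
TRANSITIONS = {
    (State.CREATED, "authorized"): State.AUTHORIZED,
    (State.CREATED, "failed"): State.FAILED,
    (State.AUTHORIZED, "captured"): State.CAPTURED,
    (State.AUTHORIZED, "failed"): State.FAILED,
    (State.CAPTURED, "settled"): State.SETTLED,
    (State.CAPTURED, "refunded"): State.REFUNDED,
    (State.SETTLED, "refunded"): State.REFUNDED,
}


class Payment:
    def __init__(self, payment_id: str) -> None:
        self.payment_id = payment_id
        self.state = State.CREATED
        self._seen_events: set[str] = set()

    def apply(self, event_id: str, event_type: str) -> bool:
        """Apply an event once. Returns False if it was a duplicate delivery."""
        if event_id in self._seen_events:
            return False  # same webhook delivered again: ignore it
        next_state = TRANSITIONS.get((self.state, event_type))
        if next_state is None:
            raise ValueError(
                f"illegal transition from {self.state.value} on {event_type!r}"
            )
        self._seen_events.add(event_id)
        self.state = next_state
        return True
```

Raising on an illegal transition is one possible policy; since events can arrive out of order, a system might instead park the event and retry it later.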

Step 5 — PSP / network integration

Integrate payment service providers / card networks via adapters (the Strategy pattern), each wrapped with retries, idempotency keys, and webhook handling for async confirmation. The external world is unreliable, so the internal ledger state is your truth and you reconcile against the provider.
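
One way to picture the adapter layer: a common interface per provider, plus a retry wrapper that reuses the same idempotency key on every attempt. Everything named here (PspAdapter, FlakyFakePsp, TransientPspError, authorize_with_retry) is hypothetical, and the fake provider exists only to make the sketch runnable; no real PSP API is being modeled.

```python
import time
from abc import ABC, abstractmethod


class TransientPspError(Exception):
    """A failure worth retrying (timeout, 5xx, connection reset)."""


class PspAdapter(ABC):
    """Common interface; each provider gets its own implementation."""

    @abstractmethod
    def authorize(self, idempotency_key: str, amount_cents: int) -> str:
        """Return the provider's authorization reference."""


class FlakyFakePsp(PspAdapter):
    """Test double that fails a fixed number of times before succeeding."""

    def __init__(self, failures_before_success: int) -> None:
        self.failures_left = failures_before_success
        self.calls: list[str] = []  # idempotency keys seen, one per attempt

    def authorize(self, idempotency_key: str, amount_cents: int) -> str:
        self.calls.append(idempotency_key)
        if self.failures_left > 0:
            self.failures_left -= 1
            raise TransientPspError("simulated timeout")
        return f"auth_{idempotency_key}"


def authorize_with_retry(
    psp: PspAdapter,
    idempotency_key: str,
    amount_cents: int,
    max_attempts: int = 3,
    base_delay_s: float = 0.0,
) -> str:
    for attempt in range(1, max_attempts + 1):
        try:
            # The SAME key goes out on every attempt, so a provider that
            # honors idempotency keys cannot authorize twice.
            return psp.authorize(idempotency_key, amount_cents)
        except TransientPspError:
            if attempt == max_attempts:
                raise  # give up; the caller leaves the payment pending
            time.sleep(base_delay_s * 2 ** (attempt - 1))  # exponential backoff
    raise ValueError("max_attempts must be at least 1")
```

The retry is only safe because the key is stable across attempts. After the final failure the outcome is unknown rather than failed, which is why reconciliation against the provider is still needed.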

Step 6 — Reconciliation

Continuously compare your ledger against the bank/PSP’s records (settlement files, statements) to catch discrepancies (missing, duplicated, or mismatched amounts) and alert/repair. This is how real payment systems stay correct despite flaky externals — mention it; it’s a strong signal.
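
As a rough sketch of that comparison, the function below takes ledger rows and settlement rows as (payment_id, amount_cents) pairs and reports the discrepancy types named above. The row shape, the reconcile name, and the report keys are assumptions; real settlement files carry far more fields (currency, fees, dates) and need parsing first.

```python
from collections import Counter


def reconcile(
    ledger_rows: list[tuple[str, int]],
    settlement_rows: list[tuple[str, int]],
) -> dict:
    """Compare (payment_id, amount_cents) rows from the ledger and the PSP."""
    ledger = dict(ledger_rows)
    settlement = dict(settlement_rows)  # for duplicate ids, the last row wins
    settlement_counts = Counter(pid for pid, _ in settlement_rows)

    common = ledger.keys() & settlement.keys()
    return {
        # We recorded it, the provider did not settle it.
        "missing_in_settlement": sorted(ledger.keys() - settlement.keys()),
        # The provider settled something we have no record of.
        "missing_in_ledger": sorted(settlement.keys() - ledger.keys()),
        # The same payment appears more than once in the settlement file.
        "duplicated_in_settlement": sorted(
            pid for pid, n in settlement_counts.items() if n > 1
        ),
        # Both sides know the payment but disagree on the amount.
        "mismatched": {
            pid: (ledger[pid], settlement[pid])
            for pid in sorted(common)
            if ledger[pid] != settlement[pid]
        },
    }


report = reconcile(
    ledger_rows=[("pay_1", 10_000), ("pay_2", 2_500), ("pay_3", 700)],
    settlement_rows=[("pay_1", 10_000), ("pay_2", 2_400), ("pay_4", 900), ("pay_4", 900)],
)
```

Here report flags pay_3 as missing from settlement, pay_4 as both unknown to the ledger and duplicated, and pay_2 as a mismatch of 2500 versus 2400 cents. What to do with each finding (alert, auto-repair, manual review) is a policy decision left out of the sketch.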

Step 7 — Architecture

client → payment API (idempotency key) → payment state machine
       → ledger (double-entry, transactional) ⇄ PSP/bank adapters (async webhooks)
       → reconciliation (ledger vs settlement) ; payouts ; reporting

Trade-offs to raise

  • Strong consistency (CP) over availability — for money, refuse rather than risk a wrong/duplicate charge.
  • Sync vs async — bank results are async; design for eventual confirmation + state machine, not blocking.
  • Exactly-once via idempotency + ledger vs naive at-least-once (double-charge risk).

The interview cue

“Every operation carries an idempotency key (retry-safe — no double charge); money lives in an append-only double-entry ledger (atomic balanced entries, the auditable source of truth); a payment is a state machine (authorize → capture → settle) advanced by async bank/PSP events; and we reconcile the ledger against settlement files. It’s CP — correctness over availability.” Idempotency + double-entry ledger + reconciliation is the defining answer; implementation next.