Building UPI (Unified Payments Interface)
Implement the switch's debit-then-credit saga with a unique reference, idempotent bank legs, and timeout-driven reversal.
VPA resolution
A payment starts by resolving the payee’s virtual address to a bank/account (the account number is never exposed to the payer):
def resolve(vpa):
    rec = vpa_directory.get(vpa)   # alice@bank → {bank, account_token}
    if not rec:
        raise InvalidVPA()
    return rec
The switch saga: debit, then credit, reverse on failure
The switch coordinates the two banks with a single transaction reference and durable state, so it can resume/resolve after any timeout:
def pay(payer_vpa, payee_vpa, amount, idempotency_key):
    ref = txn_log.create(ref=uuid(), payer=payer_vpa, payee=payee_vpa,
                         amount=amount, state="initiated", key=idempotency_key)
    payee = resolve(payee_vpa)
    payer = resolve(payer_vpa)

    # Leg 1: debit the payer. bank_call is assumed to raise on timeout, so an
    # unanswered call never falls through to the branches below.
    debit = bank_call(payer.bank, "DEBIT", ref, payer.account_token, amount)
    if debit.status != "SUCCESS":
        txn_log.set(ref, "failed")
        return FAILED
    txn_log.set(ref, "debited")

    # Leg 2: credit the payee. If this call times out, the state stays
    # "debited" and the background reconciler settles the transaction.
    credit = bank_call(payee.bank, "CREDIT", ref, payee.account_token, amount)
    if credit.status == "SUCCESS":
        txn_log.set(ref, "success")
        return SUCCESS
    reverse_debit(ref, payer, amount)   # compensate: refund the payer
    txn_log.set(ref, "reversed")
    return FAILED
The durable txn_log (keyed by ref) is what lets the switch recover: after a crash or
timeout it knows whether the debit happened and what to do.
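Recovery only works if the log moves forward through legal states and a repeated client request maps to the same transaction. The following is a minimal in-memory sketch of that idea, not a definitive implementation: the `TxnLog` class, its `state` method, and the `ALLOWED` transition table are hypothetical names introduced for illustration, and a real switch would back the log with a replicated database.

```python
import uuid

# Legal state transitions for one payment (illustrative assumption).
ALLOWED = {
    "initiated": {"debited", "failed"},
    "debited": {"success", "reversed"},
    "success": set(),
    "reversed": set(),
    "failed": set(),
}


class TxnLog:
    """Hypothetical in-memory stand-in for the switch's durable log."""

    def __init__(self):
        self.by_ref = {}   # ref -> record
        self.by_key = {}   # idempotency key -> ref

    def create(self, payer, payee, amount, key):
        # A repeated request with the same idempotency key reuses the ref.
        if key in self.by_key:
            return self.by_key[key]
        ref = str(uuid.uuid4())
        self.by_ref[ref] = {"payer": payer, "payee": payee,
                            "amount": amount, "state": "initiated"}
        self.by_key[key] = ref
        return ref

    def set(self, ref, new_state):
        rec = self.by_ref[ref]
        if new_state == rec["state"]:
            return  # replayed write after a retry: nothing to do
        if new_state not in ALLOWED[rec["state"]]:
            raise ValueError(f"illegal transition {rec['state']} -> {new_state}")
        rec["state"] = new_state

    def state(self, ref):
        return self.by_ref[ref]["state"]


log = TxnLog()
ref = log.create("alice@bank", "bob@bank", 500, key="req-1")
same_ref = log.create("alice@bank", "bob@bank", 500, key="req-1")  # client retry
log.set(ref, "debited")
log.set(ref, "success")
```

Rejecting illegal transitions means a late or duplicated message cannot, for example, flip a settled payment back to "reversed".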
Idempotent bank legs
Each bank deduplicates on the ref, so retries (frequent, because of timeouts) never move money twice:
def bank_handle(op, ref, account, amount):   # inside a bank
    with txn():   # dedup check and money movement commit atomically
        if ledger.applied(ref, op):          # already did this leg?
            return ledger.result(ref, op)    # return the same outcome
        if op == "DEBIT":
            ok = conditional_debit(account, amount)   # no overdraft (wallet pattern)
            result = "SUCCESS" if ok else "INSUFFICIENT"
        else:   # CREDIT
            credit(account, amount)
            result = "SUCCESS"
        if result == "SUCCESS":
            post_double_entry(ref, op, account, amount)   # the bank's own ledger
        ledger.mark_applied(ref, op, result)
    return result
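To see why deduplication on (ref, op) makes retries harmless, here is a self-contained sketch with an in-memory bank. The `Bank` class and its fields are hypothetical; the lock only stands in for the database transaction a real bank would use to commit the dedup record and the balance change together.

```python
import threading


class Bank:
    """Hypothetical in-memory bank that dedups each leg on (ref, op)."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.applied = {}              # (ref, op) -> recorded result
        self.lock = threading.Lock()   # stands in for a DB transaction

    def handle(self, op, ref, account, amount):
        with self.lock:
            key = (ref, op)
            if key in self.applied:          # retry: replay the stored result
                return self.applied[key]
            if op == "DEBIT":
                if self.balances[account] >= amount:
                    self.balances[account] -= amount
                    result = "SUCCESS"
                else:
                    result = "INSUFFICIENT"  # balance left untouched
            elif op == "CREDIT":
                self.balances[account] += amount
                result = "SUCCESS"
            else:
                raise ValueError(f"unknown op {op!r}")
            self.applied[key] = result
            return result


bank = Bank({"acct-alice": 1000})
first = bank.handle("DEBIT", "ref-1", "acct-alice", 300)
retry = bank.handle("DEBIT", "ref-1", "acct-alice", 300)   # same ref, resent after a timeout
too_much = bank.handle("DEBIT", "ref-2", "acct-alice", 5000)
```

The retry returns the stored outcome without touching the balance, and a failed debit is recorded too, so a later retry of the same ref cannot succeed by accident after funds arrive.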
Timeout-driven reversal
The dangerous case: the switch debits the payer but never hears back about the credit (timeout). It must resolve the outcome definitively (query status, then credit or reverse) and never leave money missing:
def resolve_pending():   # background reconciler
    for tx in txn_log.stuck(state="debited", older_than="30s"):
        credit = bank_call(payee_bank(tx), "STATUS", tx.ref)
        if credit.status == "SUCCESS":
            txn_log.set(tx.ref, "success")
        else:
            reverse_debit(tx.ref, payer(tx), tx.amount)   # idempotent refund
            txn_log.set(tx.ref, "reversed")
reverse_debit itself carries the ref and is idempotent, so re-running the reconciler is
safe.
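A runnable sketch of that property, under stated assumptions: the dictionaries below are hypothetical stand-ins for the payee bank's status endpoint, the payer's account, and the switch's log, and the "NOT_FOUND" status value is invented for illustration. The age filter (older than 30s) is left out to keep the example short.

```python
# Hypothetical stubs: credit status at the payee bank, keyed by ref.
payee_status = {"ref-a": "SUCCESS", "ref-b": "NOT_FOUND"}
payer_balance = {"acct-alice": 400}   # two debits of 300 already taken from 1000
reversed_refs = set()                 # refs that have already been refunded

txns = {
    "ref-a": {"account": "acct-alice", "amount": 300, "state": "debited"},
    "ref-b": {"account": "acct-alice", "amount": 300, "state": "debited"},
}


def reverse_debit(ref, account, amount):
    """Refund the payer at most once per ref."""
    if ref in reversed_refs:
        return
    payer_balance[account] += amount
    reversed_refs.add(ref)


def resolve_pending():
    for ref, tx in txns.items():
        if tx["state"] != "debited":
            continue
        if payee_status[ref] == "SUCCESS":
            tx["state"] = "success"    # credit landed; only the reply was lost
        else:
            reverse_debit(ref, tx["account"], tx["amount"])
            tx["state"] = "reversed"


resolve_pending()
resolve_pending()                           # second pass finds nothing in "debited"
reverse_debit("ref-b", "acct-alice", 300)   # a stray duplicate refund is a no-op
```

Only ref-b is refunded, and it is refunded exactly once no matter how often the reconciler or the refund call runs.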
Reconciliation
The switch and each bank periodically compare transaction logs by ref; any mismatch
(debited-but-not-credited, duplicate) is flagged and repaired (credit the missing leg or
reverse). This is the ultimate correctness guarantee across independent institutions.
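The comparison can be sketched as a join on ref. This is an illustrative sketch only: the `reconcile` function, the input shapes, and the "MISSING" marker are assumptions, and real settlement files carry far more fields than a status.

```python
from collections import Counter


def reconcile(switch_log, bank_entries):
    """Flag refs where the switch and a bank disagree.

    switch_log: dict of ref -> status as recorded by the switch.
    bank_entries: list of (ref, status) rows from the bank's ledger.
    """
    counts = Counter(ref for ref, _ in bank_entries)
    bank_log = dict(bank_entries)
    issues = {}
    for ref in sorted(switch_log.keys() | bank_log.keys()):
        s = switch_log.get(ref, "MISSING")
        b = bank_log.get(ref, "MISSING")
        if counts[ref] > 1:
            issues[ref] = "duplicate at bank"
        elif s != b:
            issues[ref] = f"switch={s} bank={b}"
    return issues


switch_credits = {"r1": "SUCCESS", "r2": "SUCCESS", "r3": "SUCCESS"}
bank_credits = [("r1", "SUCCESS"), ("r3", "SUCCESS"),
                ("r3", "SUCCESS"), ("r4", "SUCCESS")]
issues = reconcile(switch_credits, bank_credits)
```

Here r2 is a credit the switch believes happened but the bank never posted, r3 was posted twice, and r4 is a credit the bank holds that the switch has no record of; each flagged ref then goes to the repair step described above.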
Scale and failure handling
- Switch → stateless coordination + durable txn log, horizontally scaled, multi-region, HA (it’s the critical path for the whole network).
- Retries/timeouts → idempotency by ref everywhere; reconciler resolves stuck txns.
- Debit ok, credit fails → automatic reversal (payer refunded).
- Bank down → leg fails or times out → reverse or retry; reconcile.
- Consistency over availability (CP) → never declare success until both legs are confirmed; reverse on doubt.
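The retry rule in the list above depends on every attempt carrying the same ref. A minimal sketch of such a wrapper follows; `call_with_retry`, the `send` transport callable, and the "UNKNOWN" result are hypothetical names for illustration, and the backoff values are arbitrary.

```python
import time


def call_with_retry(send, op, ref, account, amount, attempts=3, base_delay=0.01):
    """Retry a bank leg on timeout, always with the same ref.

    `send` is a hypothetical transport function that raises TimeoutError
    when no reply arrives. Returns the bank's result, or "UNKNOWN" if every
    attempt timed out, in which case a reconciler has to settle the outcome.
    """
    for attempt in range(attempts):
        try:
            return send(op, ref, account, amount)
        except TimeoutError:
            time.sleep(base_delay * 2 ** attempt)   # exponential backoff
    return "UNKNOWN"


calls = []


def flaky_send(op, ref, account, amount):
    """Test double: times out twice, then answers."""
    calls.append(ref)
    if len(calls) < 3:
        raise TimeoutError("no reply from bank")
    return "SUCCESS"


result = call_with_retry(flaky_send, "CREDIT", "ref-9", "acct-bob", 300)
```

Because the bank deduplicates on the ref, it does not matter whether the first two attempts were lost on the way in or on the way out; at most one credit is applied.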
The takeaway
Concrete signals: a switch-coordinated debit→credit saga keyed by a unique reference, idempotent bank legs (dedup on ref, no double-move), timeout-driven reversal, and reconciliation across institutions. It’s the digital-wallet ledger extended to a cross-bank distributed transaction — saga + idempotency + reversal is how money moves correctly between systems you don’t control.