Game Theory
Two-player optimal play — win/lose states and minimax as memoized DP, score games, Nim and Grundy numbers, and finding the closed-form pattern.
Optimal play as a recurrence
In turn-based, perfect-information games, each state is a win or loss for the player to move. The defining rule:
A state is winning if some move leads to a losing state for the opponent; it’s losing if every move leads to a winning state for them.
That’s minimax, and because states recur it memoizes like DP:
from functools import cache
def can_win(state):
@cache
def win(s):
return any(not win(t) for t in moves(s)) # exists a move that leaves opponent losing
return win(state)
Score games (minimax with values)
When the outcome is a score, the mover maximizes their net advantage while the opponent minimizes it. A clean formulation tracks the score difference the current player can secure:
@cache
def best(i, j): # max (my score − opponent score) on nums[i..j]
if i > j: return 0
take_left = nums[i] - best(i + 1, j) # opponent then plays optimally on the rest
take_right = nums[j] - best(i, j - 1)
return max(take_left, take_right)
# first player wins iff best(0, n-1) >= 0
This solves “Stone Game,” “Predict the Winner,” “Stone Game II/III” (with a window of choices) — all interval DP underneath (see the DP primers).
Nim and Grundy numbers (the closed forms)
Many impartial games have closed-form answers — spotting one lets you skip the DP:
- Nim — with several piles, the player to move loses iff the XOR of all pile sizes is 0 (the “Nim-sum”). One-line answer.
- Sprague–Grundy theorem — any impartial game state has a Grundy number (
mexof the Grundy values of reachable states); a position is losing iff its Grundy value is 0, and a sum of independent games XORs their Grundy numbers. The general theory behind Nim. - Parity games — many stone-removal/coin games reduce to parity (who faces an even/odd count) — e.g. “Nim Game” (you lose iff n % 4 == 0).
When you recognize Nim/Grundy/parity, the answer is O(1) instead of an exponential search.
State design
Define the state as exactly what affects future play, no more — indices (i, j) for a row of
items, a bitmask of the remaining set for small n (see the bit-manipulation primer), counts, or
whose turn it is (often implicit in the score-difference formulation). A tight state is what makes
memoization feasible.
Finding the pattern
For “who wins” puzzles, compute small cases by hand (n = 0, 1, 2, 3, …) and look for a
win/lose pattern or periodicity before committing to DP — many reduce to a simple parity or modular
rule (the Nim Game’s n % 4, divisor games’ parity).
Pitfalls
- Forgetting the opponent also plays optimally — every recursive call must assume best play by whoever moves next (the sign flip in the score formulation).
- Bloated state — including irrelevant detail kills memoization; keep only what changes future options.
- Missing the closed form — grinding DP when a Nim/parity rule gives O(1).
Choosing in an interview
- “Can the first player force a win?” → win/lose minimax (memoized).
- Score/optimal-value game → minimax score-difference DP (often interval DP).
- Piles / impartial removal game → look for Nim-sum / Grundy / parity closed forms.
Try tiny cases first; if a pattern emerges it’s O(1), otherwise it’s memoized minimax.