Design (Data Structures)
Compose primitives to hit required time bounds — the hash-map+DLL, hash-map+array, two-heap, and bucket combos behind LRU, LFU, median, and getRandom.
What “design” problems ask
You implement a class with specific operations at target complexities (often O(1) or O(log n) each). The skill is picking and combining primitives so every operation meets its bound — and keeping the structures in sync on every mutation. These show up constantly (LRU/LFU cache, median stream, getRandom) and reward a clear plan over clever code.
The method
- List each operation and its required complexity.
- Map each to a primitive: random access → array; keyed lookup → hash map; ordering → heap/BST; recency/position → linked list; prefix → trie.
- Combine them, and on every mutation update all structures so they stay consistent.
The combos that recur
-
Hash map + doubly-linked list → O(1) LRU Cache. Map key→node; on access move the node to the front; evict the tail when full. The single most common design problem.
class LRUCache: def __init__(self, cap): self.cap, self.map = cap, {} self.head, self.tail = Node(), Node() # sentinels self.head.next, self.tail.prev = self.tail, self.head def get(self, key): if key not in self.map: return -1 node = self.map[key]; self._remove(node); self._add_front(node) return node.val def put(self, key, val): if key in self.map: self._remove(self.map[key]) node = Node(key, val); self.map[key] = node; self._add_front(node) if len(self.map) > self.cap: lru = self.tail.prev; self._remove(lru); del self.map[lru.key] -
Hash map + array → O(1) insert/remove/getRandom (“Insert Delete GetRandom”): array for O(1) random access, map of value→index, swap-with-last to delete in O(1).
-
Two heaps (max-heap of the low half, min-heap of the high half) → O(log n) insert, O(1) median (“Find Median from Data Stream”).
-
Hash map + bucket list of doubly-linked lists → O(1) LFU Cache (frequency buckets, evict the least-frequent then least-recent).
-
Stack(s) → O(1) Min Stack (parallel min stack); Queue via two stacks (amortized O(1)).
-
Trie → prefix-keyed lookups (Implement Trie, autocomplete; see the tries primer).
-
Heap / buckets / hash map → top-K & frequency (Design Twitter feed merge, Top-K frequent).
Iterators and streaming
“Design an iterator” / data-stream problems want lazy next()/hasNext() or running
aggregates with bounded memory:
- Flatten nested list / BST iterator — keep a stack of the current frontier, advance lazily.
- Peeking iterator — buffer one element ahead.
- Moving average / data stream — a fixed-size deque or running sum; median → two heaps.
Getting the invariants right
The bugs in design problems are desync bugs — the map and the list disagree after an edit.
Write down the invariant (“map[key] always points to the node currently in the list”) and verify
each method preserves it. Use sentinel head/tail nodes so DLL edits need no null checks.
Choosing in an interview
- O(1) recency/eviction → hash map + DLL.
- O(1) random member + delete → hash map + array (swap-with-last).
- Streaming median / order statistics → two heaps (or a balanced BST).
- Frequency/top-K → buckets / heap + map.
- Prefix queries → trie.
Lead with the op→complexity→primitive table; the implementation falls out, and stating the invariant is the senior signal.