Skip to content
All reading material
Foundations·18 min read

Graphs

Modeling, DFS/BFS, components, cycle detection (directed and undirected), topological sort, bipartite checks, and shortest paths in unweighted graphs.


Modeling

A graph is nodes + edges. Pick a representation:

  • Adjacency listdict[node] → [neighbors]. Best for sparse graphs (most problems); O(V + E) space.
  • Adjacency matrixM[u][v] = 1/weight. O(V²) space; good for dense graphs or O(1) edge lookups.
  • Edge list[(u, v, w)]. Used by Kruskal’s MST and Bellman–Ford.
  • Implicit graphs — a grid (cells connect to 4/8 neighbors — “islands”) or a state space (each configuration a node, each move an edge — puzzles, word ladder). Recognizing an implicit graph is half the battle.

Also clarify: directed vs undirected, weighted vs unweighted, cyclic vs acyclic — each changes the algorithm.

DFS and BFS — the workhorses

Both visit every reachable node once in O(V + E); a visited set prevents revisiting (infinite loops on cycles):

def num_islands(grid):
    rows, cols, seen, count = len(grid), len(grid[0]), set(), 0
    def dfs(r, c):
        if not (0 <= r < rows and 0 <= c < cols) or grid[r][c] == "0" or (r, c) in seen:
            return
        seen.add((r, c))
        for dr, dc in ((1,0),(-1,0),(0,1),(0,-1)):
            dfs(r + dr, c + dc)
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "1" and (r, c) not in seen:
                dfs(r, c); count += 1
    return count
  • DFS (recursion or explicit stack) — connectivity, components, cycle detection, topological sort, path existence, flood fill.
  • BFS (queue) — shortest path in an unweighted graph (fewest edges), level-by-level exploration.

Connected components

Run DFS/BFS from every unvisited node; each launch is one component. Counting launches = number of components (“number of islands,” “number of provinces”). For dynamic connectivity, use union-find (its own primer).

Cycle detection (directed vs undirected differ)

  • Undirected — during DFS, a visited neighbor that isn’t the parent means a cycle. (Or use union-find: an edge joining two already-connected nodes closes a cycle.)
  • Directed — track nodes on the current recursion stack (3-color: white/gray/black); an edge to a gray (in-progress) node is a back edge → cycle. Essential for “Course Schedule.”

Topological sort (DAGs)

Order nodes so every edge points forward — for dependency/ordering problems:

  • Kahn’s (BFS) — repeatedly remove nodes with in-degree 0, decrementing neighbors. If you can’t remove all nodes, there’s a cycle (no valid order).

    def topo_order(graph, indeg):
        q = deque([n for n in graph if indeg[n] == 0])
        order = []
        while q:
            n = q.popleft(); order.append(n)
            for m in graph[n]:
                indeg[m] -= 1
                if indeg[m] == 0: q.append(m)
        return order if len(order) == len(graph) else []   # [] → cycle
  • DFS-based — post-order, then reverse.

Bipartite check (2-coloring)

Color nodes alternately via BFS/DFS; if an edge connects two same-colored nodes, it’s not bipartite. Models “can these be split into two groups” / “is this graph 2-colorable.”

Shortest paths in unweighted graphs

  • BFS from the source gives fewest-edges distance.
  • Multi-source BFS — seed the queue with all sources at distance 0 (“rotting oranges,” nearest-exit).
  • 0-1 BFS — edges of weight 0/1: use a deque (push 0-weight to front, 1-weight to back) for O(V + E).
  • Bidirectional BFS — search from both ends and meet in the middle (word ladder) to cut the explored frontier.

For weighted shortest paths (Dijkstra, Bellman–Ford, Floyd–Warshall) and MST (Kruskal/Prim), see the advanced-graphs primer.

Common pitfalls

  • Marking visited too late — mark on enqueue (BFS) or entry (DFS) to avoid adding a node twice.
  • Undirected edges — add both directions to the adjacency list.
  • Recursion depth on large/deep graphs → iterative DFS or BFS.
  • Disconnected graphs — loop over all nodes as start points, don’t assume one component.

Choosing in an interview

  • Reachability / components / flood fill → DFS or BFS.
  • Fewest steps (unweighted) → BFS (multi-source / bidirectional / 0-1 as needed).
  • Ordering with dependencies → topological sort (+ cycle check).
  • Two-group / 2-color feasibility → bipartite check.
  • Dynamic “are these connected” → union-find.
  • Weighted shortest path / MST → advanced graphs.

State whether the graph is directed/weighted up front — it picks the algorithm for you.