Building Tinder
Implement geohash candidate queries, the swipe write sharded by user, O(1) match detection, and deck caching with already-swiped exclusion.
Geohash candidate query
Encode each user’s location into a geohash and index it; a nearby query scans the user’s cell plus its eight neighbors (to avoid border misses):
def nearby_candidates(user, radius_km):
cell = geohash.encode(user.lat, user.lng, precision=for_radius(radius_km))
cells = [cell] + geohash.neighbors(cell) # include adjacent cells (border fix)
candidates = []
for c in cells:
candidates += geo_index.range(c) # active users in this cell
return [u for u in candidates
if matches_prefs(user, u) and within(user, u, radius_km)]
Index updates on location change (geo_index.move(user_id, old_cell, new_cell)).
Building the deck (exclude already-swiped)
def build_deck(user, k=50):
swiped = swipes.targets_of(user.id) # set of ids already swiped
deck = [u for u in nearby_candidates(user, user.max_distance)
if u.id not in swiped][:k]
deck = rank(deck, user) # optional: desirability/interest ML
cache.set(f"deck:{user.id}", deck, ttl="15m") # instant swiping; refill in bg
return deck
Recording a swipe (sharded by swiper)
def swipe(swiper, target, direction): # direction = "right" | "left"
swipes.put(swiper, target, direction) # shard key = swiper → spreads writes
if direction == "right":
return check_match(swiper, target)
return None
Sharding by swiper means each user’s swipe history is co-located, so dedup/exclusion
lookups are single-shard, and the enormous write volume spreads evenly.
O(1) match detection
On a right swipe, check whether the target already right-swiped you — one point lookup:
def check_match(swiper, target):
if swipes.get(target, swiper) == "right": # they already liked you?
match_id = create_match(swiper, target) # store the pair, idempotent
notify(swiper, "match", target); notify(target, "match", swiper)
open_chat(match_id) # reuse the messaging system
return match_id
return None
No batch “find mutual likes” job — the match is detected the instant the second
right-swipe happens. (A likes_received index per user makes the reverse lookup direct
if the swipe store isn’t keyed for it.)
Storage
- Swipes in a wide-column/KV store sharded by swiper (write-optimized).
- Geo index sharded by region; dense cities are hotspots → finer cells there.
- Matches stored per pair; chat in the messaging system.
- Profiles/media → DB + object store + CDN (Instagram pattern).
Scale and failure handling
- Write-heavy swipes → write-optimized store, shard by swiper, async aggregate any counters (likes received).
- Hot geo cell (a festival, a city center) → finer geohash precision / dynamic cell splitting.
- Deck staleness (people move / go inactive) → short TTL + re-query; filter inactive users at serve time.
- Duplicate swipe → idempotent (one record per swiper→target).
- Match notification → reuse the notification service (push).
The takeaway
Concrete signals: geohash nearby queries (with neighbor cells for borders), swipes sharded by swiper for the write-heavy load, O(1) match detection via a reverse lookup at swipe time, and cached decks excluding already-swiped users. Geospatial recall + a clever write-time match check is the essence — and the geo index reappears in Yelp and Uber.