Skip to content

Instantly share code, notes, and snippets.

@nickwhite917
Created February 28, 2017 16:08
Show Gist options
  • Save nickwhite917/a158a2c82d5569ce46c6a52d1aeffd55 to your computer and use it in GitHub Desktop.
Save nickwhite917/a158a2c82d5569ce46c6a52d1aeffd55 to your computer and use it in GitHub Desktop.
Hash join implemented in Python.
from collections import defaultdict
def hashJoin(table1, index1, table2, index2):
h = defaultdict(list)
# hash phase
for s in table1:
h[s[index1]].append(s)
# join phase
return [(s, r) for r in table2 for s in h[r[index2]]]
table1 = [(27, "Jonah"),
(18, "Alan"),
(28, "Glory"),
(18, "Popeye"),
(28, "Alan")]
table2 = [("Jonah", "Whales"),
("Jonah", "Spiders"),
("Alan", "Ghosts"),
("Alan", "Zombies"),
("Glory", "Buffy")]
for row in hashJoin(table1, 1, table2, 0):
print(row)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment