@mortymacs
Forked from nickwhite917/Python_Hash_Join.py
Created October 11, 2018 06:25
Hash join implemented in Python.
from collections import defaultdict


def hashJoin(table1, index1, table2, index2):
    """Equi-join two lists of tuples on table1[index1] == table2[index2]."""
    h = defaultdict(list)
    # hash phase: bucket every row of table1 by its join key
    for s in table1:
        h[s[index1]].append(s)
    # join phase: probe the buckets with each row of table2
    return [(s, r) for r in table2 for s in h[r[index2]]]


table1 = [(27, "Jonah"),
          (18, "Alan"),
          (28, "Glory"),
          (18, "Popeye"),
          (28, "Alan")]
table2 = [("Jonah", "Whales"),
          ("Jonah", "Spiders"),
          ("Alan", "Ghosts"),
          ("Alan", "Zombies"),
          ("Glory", "Buffy")]

# Join on the name column: index 1 in table1, index 0 in table2.
for row in hashJoin(table1, 1, table2, 0):
    print(row)
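
The variant below is a sketch and not part of the original gist. It assumes both inputs are in-memory lists, builds the hash table on whichever input has fewer rows, and checks the result against a plain nested-loop join. The names `hash_join_smaller_build`, `nested_loop_join`, `people`, and `fears` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict


def hash_join_smaller_build(left, left_key, right, right_key):
    """Yield (left_row, right_row) pairs whose key columns are equal."""
    # Assumption: both inputs support len(); build on the smaller one.
    swap = len(right) < len(left)
    if swap:
        build, build_key, probe, probe_key = right, right_key, left, left_key
    else:
        build, build_key, probe, probe_key = left, left_key, right, right_key

    # Build phase: bucket the smaller input by its join key.
    buckets = defaultdict(list)
    for row in build:
        buckets[row[build_key]].append(row)

    # Probe phase: .get() avoids creating empty buckets for unmatched keys,
    # which indexing a defaultdict directly would do.
    for row in probe:
        for match in buckets.get(row[probe_key], ()):
            # Always emit the pair in (left_row, right_row) order.
            yield (row, match) if swap else (match, row)


def nested_loop_join(left, left_key, right, right_key):
    """Reference join that compares every pair of rows."""
    return [(l, r) for l in left for r in right if l[left_key] == r[right_key]]


people = [(27, "Jonah"), (18, "Alan"), (28, "Glory"), (18, "Popeye"), (28, "Alan")]
fears = [("Jonah", "Whales"), ("Jonah", "Spiders"), ("Alan", "Ghosts"),
         ("Alan", "Zombies"), ("Glory", "Buffy")]

hashed = list(hash_join_smaller_build(people, 1, fears, 0))
looped = nested_loop_join(people, 1, fears, 0)

# Row order may differ between the two strategies, so compare sorted results.
assert sorted(hashed) == sorted(looped)
print(len(hashed), "joined rows")
```

Building on the smaller input keeps the hash table small, and the nested-loop comparison is a cheap way to confirm that the two strategies agree on small test data.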