@colwilson
Created November 6, 2011 20:55
Skeleton MapReduce in Python
from multiprocessing import Pool


def mapFunction(value):
    # work on a single value and return a tuple
    return a_tuple


def partition(tuples):
    # marshal the tuples into a dictionary of lists of tuples
    return mapping


def reduceFunction(item):
    # work on one (key, list) item taken from mapping.items()
    return result


if __name__ == '__main__':
    # N is the number of worker processes
    with Pool(processes=N) as pool:
        tuples = pool.map(mapFunction, your_iterable_data)
        mapping = partition(tuples)
        results = pool.map(reduceFunction, mapping.items())
    # do something with the results
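
To make the skeleton concrete, here is a minimal word-count sketch that fills in the three placeholders. It is one possible instantiation, not the only way to use the pattern. The names `map_function`, `partition`, `reduce_function`, `word_count` and the sample lines are assumptions made for illustration. Note one deviation from the skeleton: the map step here returns a list of `(word, 1)` tuples per input line rather than a single tuple, so `partition` flattens the lists before grouping.

```python
from collections import defaultdict
from multiprocessing import Pool


def map_function(line):
    # Map step: emit one (word, 1) tuple for every word in the line.
    return [(word.lower(), 1) for word in line.split()]


def partition(mapped):
    # Group the emitted values by key: {word: [1, 1, ...]}.
    mapping = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            mapping[key].append(value)
    return mapping


def reduce_function(item):
    # Reduce step: each item is one (key, values) pair from mapping.items().
    key, values = item
    return key, sum(values)


def word_count(lines, processes=2):
    # Hypothetical driver: run map and reduce in the same worker pool.
    with Pool(processes=processes) as pool:
        mapped = pool.map(map_function, lines)
        mapping = partition(mapped)
        return dict(pool.map(reduce_function, list(mapping.items())))


if __name__ == '__main__':
    sample = ["the quick brown fox", "the lazy dog", "The end"]
    print(word_count(sample))
```

The functions passed to `pool.map` are defined at module level so that the worker processes can import them, and the `if __name__ == '__main__':` guard keeps the pool from being re-created when a worker imports the module.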