@harshvladha
Created December 9, 2016 19:03
Python MR Mapper
#!/usr/bin/env python3.4
import sys

# Read input from STDIN, one line at a time
for line in sys.stdin:
    # Remove leading and trailing whitespace
    line = line.strip()
    # Split the line into words
    words = line.split()
    # Loop over every word in the words list
    for word in words:
        # Write the word and its count to STDOUT, tab-delimited.
        # Every word gets count = 1 here; this output is the input to the reducer.
        print('{0}\t{1}'.format(word, 1))
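
The mapper's comments say its tab-delimited output is the input to a reducer, but the gist does not include one. Below is a minimal sketch of what a matching reducer could look like. It is not the author's code. The names `parse` and `reduce_counts` are hypothetical. The sketch assumes the reducer receives its lines already sorted by word, as a streaming MapReduce framework typically arranges between the map and reduce phases. The `sorted()` call on a small in-memory sample stands in for that step.

```python
from itertools import groupby


def parse(lines):
    """Yield (word, count) pairs from tab-delimited mapper output."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        word, _, count = line.partition("\t")
        yield word, int(count)


def reduce_counts(lines):
    """Sum the counts of each run of identical adjacent words.

    Assumption: lines are already sorted by word, so all entries
    for a given word are adjacent.
    """
    for word, group in groupby(parse(lines), key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


# Hypothetical sample: what the mapper would emit for the line "the cat the"
mapper_output = ["the\t1", "cat\t1", "the\t1"]

# sorted() simulates the sort that happens between mapper and reducer
for word, total in reduce_counts(sorted(mapper_output)):
    print("{0}\t{1}".format(word, total))
# prints:
# cat	1
# the	2
```

To use this as a standalone script, the sample list would be replaced by `sys.stdin`, mirroring how the mapper reads its input. Grouping adjacent keys with `groupby` keeps memory use constant per word, at the cost of relying on sorted input.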