Created
December 9, 2016 19:03
Python MR Mapper
#!/usr/bin/env python3.4
import sys

# read input from STDIN, one line at a time
for line in sys.stdin:
    # remove all leading and trailing whitespace
    line = line.strip()
    # split the line into words
    words = line.split()
    # loop over all the words in the words list
    for word in words:
        # output the word and its count to STDOUT
        # every word gets count = 1 here
        # this output is the input to the reducer
        # the two fields are tab-delimited
        print('{0}\t{1}'.format(word, 1))
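
The mapper's comments say its output is the input to a reducer, which is not part of this gist. A minimal sketch of what such a reducer might look like follows. It assumes the framework (or a local `sort`) delivers lines grouped by word, so equal words arrive on adjacent lines. The names `parse` and `reduce_counts` are hypothetical and chosen only for illustration.

```python
#!/usr/bin/env python3
import sys
from itertools import groupby


def parse(lines):
    """Yield (word, count) pairs from tab-delimited mapper output."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        word, _, count = line.partition('\t')
        try:
            yield word, int(count)
        except ValueError:
            # skip lines whose count field is missing or not an integer
            continue


def reduce_counts(lines):
    """Sum counts per word; assumes equal words are on adjacent lines."""
    for word, group in groupby(parse(lines), key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


if __name__ == '__main__':
    # emit the same tab-delimited format the mapper uses
    for word, total in reduce_counts(sys.stdin):
        print('{0}\t{1}'.format(word, total))
```

To try the pair locally without a cluster, something like `cat input.txt | ./mapper.py | sort | ./reducer.py` should work, where `sort` stands in for the shuffle step and the file names are placeholders.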