Hadoop Streaming API for Ruby
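# Write one key/value record to STDOUT, separated by `sep` (tab by default).
# Interpolation is used instead of String#<< so non-String values are
# converted with to_s (String#<< treats an Integer as a codepoint).
def emit(key, value, sep = "\t")
  STDOUT.puts("#{key}#{sep}#{value}")
end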
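# Read records from STDIN and yield each one as (key, value).
# `options` is a method name plus arguments sent to each line;
# the default splits on the first tab.
def map(*options)
  options = [:split, "\t", 2] if options.empty?
  STDIN.each_line do |line|
    line.strip!
    key, value = line.send(*options)
    yield key, value
  end
end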
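# Read key-sorted records from STDIN and yield each key once,
# together with the array of all values seen for that key.
def reduce(*options)
  options = [:split, "\t", 2] if options.empty?
  current_key = nil
  current_values = []
  STDIN.each_line do |line|
    line.strip!
    key, value = line.send(*options)
    if current_key != key && !current_key.nil?
      yield current_key, current_values
      # Start a fresh array rather than clearing in place, so a block
      # that kept the yielded array does not see it emptied later.
      current_values = []
    end
    current_key = key
    current_values << value
  end
  # Flush the final group.
  yield current_key, current_values unless current_key.nil?
end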
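
The helpers above follow the usual Hadoop Streaming flow: the mapper emits tab-separated records, the framework sorts them by key, and the reducer groups consecutive records that share a key. The sketch below is not part of the original gist. It replays that flow for a word count on an in-memory string, so the logic can be checked without a cluster. The names `word_count_mapper`, `word_count_reducer`, and `run_word_count` are hypothetical, and the functions take explicit IO arguments (an assumption made for testability) instead of reading STDIN.

```ruby
require 'stringio'

# Mapper logic: emit "word<TAB>1" for every whitespace-separated word.
def word_count_mapper(input, output)
  input.each_line do |line|
    line.split.each { |word| output.puts("#{word}\t1") }
  end
end

# Reducer logic: input must be sorted by key. Sum the counts of each
# run of equal keys and emit "word<TAB>total" when the key changes.
def word_count_reducer(input, output)
  current_key = nil
  total = 0
  input.each_line do |line|
    key, value = line.chomp.split("\t", 2)
    if current_key && current_key != key
      output.puts("#{current_key}\t#{total}")
      total = 0
    end
    current_key = key
    total += value.to_i
  end
  # Flush the final group.
  output.puts("#{current_key}\t#{total}") unless current_key.nil?
end

# Local stand-in for `mapper | sort | reducer`.
def run_word_count(text)
  mapped = StringIO.new
  word_count_mapper(StringIO.new(text), mapped)
  sorted = mapped.string.lines.sort.join
  reduced = StringIO.new
  word_count_reducer(StringIO.new(sorted), reduced)
  reduced.string
end
```

With the gist's own helpers, the same job would presumably be two short scripts: a mapper along the lines of `map { |line, _| line.to_s.split.each { |w| emit(w, 1) } }` and a reducer along the lines of `reduce { |word, counts| emit(word, counts.sum(&:to_i)) }`, which could be tried locally by piping a file through the mapper, `sort`, and the reducer.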