Created June 23, 2009 21:50
# Three more ways to cheat:

# Stack up all the values in a list then sum them at once:
require 'active_support/core_ext/enumerable'
class Reducer1 < Wukong::Streamer::ListReducer
  def finalize
    yield [ key, values.map(&:last).map(&:to_i).sum ]
  end
end
#
# ... this is common enough that it's already included
#
require 'wukong/streamer/count_keys'
class Reducer3 < Wukong::Streamer::CountKeys
end
#
# or really cheat
#
require 'wukong/streamer/count_keys'
class Reducer4 < Wukong::Streamer::Base
  def stream
    # The backtick subshell inherits this process's stdin, so `uniq -c`
    # reads the sorted records directly. It counts runs of identical
    # adjacent lines and prints the count before the key.
    puts `uniq -c`
  end
end
#
# Accumulate the sum record-by-record:
#
class Reducer2 < Wukong::Streamer::AccumulatingReducer
  attr_accessor :key_count
  def start!(*args)     self.key_count =  0 end
  def accumulate(*args) self.key_count += 1 end
  def finalize
    yield [ key, key_count ]
  end
end
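
The record-by-record reducer above relies on its input arriving sorted by key, so each key's records are adjacent. A minimal plain-Ruby sketch of that start / accumulate / finalize cycle, with no Wukong dependency, might look like the following. The helper name `count_sorted_keys` is hypothetical and is not part of Wukong; it only mirrors the shape of the hooks used in `Reducer2`.

```ruby
# Sketch only: mimics the start!/accumulate/finalize cycle in plain Ruby.
# `count_sorted_keys` is a hypothetical helper, not a Wukong API.
# Assumes the keys arrive already sorted, so equal keys are adjacent.
def count_sorted_keys(sorted_keys)
  current = nil
  count   = 0
  sorted_keys.each do |key|
    if key != current
      # Key changed: emit the finished group (like finalize) ...
      yield [current, count] unless current.nil?
      # ... then reset the state for the new key (like start!).
      current = key
      count   = 0
    end
    count += 1 # one more record for this key (like accumulate)
  end
  # Emit the last group once the input is exhausted.
  yield [current, count] unless current.nil?
end

results = []
count_sorted_keys(%w[a a b c c c]) { |pair| results << pair }
results.each { |key, n| puts "#{key}\t#{n}" }
```

Only one key's counter is held in memory at a time, which is the point of accumulating rather than stacking all the values in a list first.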
module WordCount
  class Mapper < Wukong::Streamer::LineStreamer
    # Split a string into its constituent words.
    def tokenize str
      str.downcase.
        strip.
        split(/\s+/).
        reject(&:blank?)
    end
    # Emit each word in each line.
    def process line
      tokenize(line).each{|word| yield [word, 1] }
    end
  end
  # Conceptually: reduce with uniq -c
end
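
To see what "reduce with uniq -c" amounts to end to end, here is a rough plain-Ruby sketch of the whole word count: tokenize each line, sort the emitted pairs by word, then sum each run of equal words. It runs without Wukong or ActiveSupport, so `empty?` stands in for `blank?`, and the top-level `tokenize` and `word_count` methods are hypothetical stand-ins rather than anything Wukong provides. It assumes Ruby 2.4 or later for `chunk_while` and `sum`.

```ruby
# Sketch only: plain-Ruby stand-in for the map -> sort -> reduce pipeline.
# `tokenize` and `word_count` here are hypothetical helpers, not Wukong APIs.

# Split a string into lowercase words (same steps as the Mapper's tokenize,
# with empty? in place of ActiveSupport's blank?).
def tokenize(str)
  str.downcase.strip.split(/\s+/).reject(&:empty?)
end

def word_count(lines)
  # Map phase: emit [word, 1] for every word in every line.
  pairs = lines.flat_map { |line| tokenize(line).map { |word| [word, 1] } }
  # Sort phase: bring equal words next to each other.
  sorted = pairs.sort_by(&:first)
  # Reduce phase: sum each run of adjacent equal words, like `uniq -c`.
  sorted.chunk_while { |a, b| a.first == b.first }
        .map { |group| [group.first.first, group.sum { |_, n| n }] }
end

word_counts = word_count(["The quick brown fox", "the lazy dog  THE end"])
word_counts.each { |word, n| puts "#{word}\t#{n}" }
```

The sort step matters: like `uniq -c`, the reduce phase only merges adjacent duplicates, so unsorted input would produce several partial counts for the same word.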