Made the relationship between WordsCleaner and Frequencies more cohesive. I would revisit the naming of WordsCleaner now, but I don't want to change the code so much that it fails to demonstrate the core point. Perhaps Tokeniser would be more descriptive, though.
class Phrase
  def initialize(words)
    words_cleaner = WordsCleaner.new(words)
    @frequency_counter = Frequencies.new(words_cleaner)
  end

  def word_count
    @frequency_counter.frequencies
  end
end

class WordsCleaner
  def initialize(words)
    @words = words
  end

  def clean
    alpha_numeric_only(@words).downcase.split
  end

  private
  def alpha_numeric_only words
    words.gsub(/[^a-z0-9]/i, ' ')
  end
end

class Frequencies
  def initialize(words_cleaner)
    @words_cleaner = words_cleaner
  end

  def frequencies
    tokens = @words_cleaner.clean
    tokens.group_by {|e| e }.each_with_object({}) do |(word, occurs), result|
      result[word] = occurs.length
    end
  end
end
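
The description floats renaming WordsCleaner to Tokeniser. A minimal sketch of how that might look is below, followed by a usage example. The class name Tokeniser comes from the description; the method name `tokens` and the sample sentence are assumptions made for illustration, and the behaviour is meant to match the code above rather than improve on it.

```ruby
# Hypothetical rename of WordsCleaner to Tokeniser. The method name `tokens`
# is an assumption; the gist itself calls this method `clean`.
class Tokeniser
  def initialize(words)
    @words = words
  end

  # Replace every non-alphanumeric character with a space, lowercase the
  # result, and split on whitespace.
  def tokens
    @words.gsub(/[^a-z0-9]/i, ' ').downcase.split
  end
end

class Frequencies
  def initialize(tokeniser)
    @tokeniser = tokeniser
  end

  # The tokens are requested here, so they never leave this class.
  def frequencies
    @tokeniser.tokens.group_by { |e| e }.each_with_object({}) do |(word, occurs), result|
      result[word] = occurs.length
    end
  end
end

class Phrase
  def initialize(words)
    @frequency_counter = Frequencies.new(Tokeniser.new(words))
  end

  def word_count
    @frequency_counter.frequencies
  end
end

# Usage: punctuation is ignored and counting is case-insensitive.
counts = Phrase.new("Go, go GO! Stop 2 times").word_count
p counts # counts["go"] is 3; "stop", "2" and "times" each occur once
```

Phrase only wires the objects together; Frequencies asks the tokeniser for its tokens when it needs them, so no intermediate word list is passed between methods.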
Yeah, I see the point now. Moving what was on lines 8-9 to line 33 keeps the data inside the Frequencies class.
I like that it also drops the method args that were needed.
Thanks!