@mkreyman
Created September 20, 2017 22:52
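# Adds a frequency counter to every Enumerable:
# returns a Hash mapping each distinct element to its number of occurrences.
module Enumerable
  def to_histogram
    each_with_object(Hash.new(0)) { |x, h| h[x] += 1 }
  end
end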
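# Collects words (added singly or as arrays) and reports how often each occurs.
class Bag
  def initialize
    @bag_of_objects = []
  end

  def add(object)
    @bag_of_objects << object
  end

  # All added objects as one flat array.
  def items
    @bag_of_objects.flatten
  end

  # Hash of item => number of occurrences.
  def count
    items.to_histogram
  end
end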
bag = Bag.new
book = ARGV[0].to_s
number_of_results = ARGV[1].to_i
# Based on Bible, Quran and "War and Peace" :)
blacklist = %w( the and of to that in he shall unto for i his a they be is him not them 1 it with all thou thy was which
my me said but ye their have will thee from as are when this out were upon by you up there hath then came had
into on her come one we before your s also day an so shalt if at let go us went no even do now behold numbers
saith therefore every these exodus because after our down or acts hast o make may over did what she who
deuteronomy name thine proverbs among away any put thereof forth give neither take am days brought leviticus
two according should whom know nor took thus bring mark word corinthians set more sent yet again judges like
way mine about see own hundred spake done many saw words how thing years himself law thousand off cast given
art three together than ever might those gave other seven through another would side romans first without high
revelation nehemiah themselves where under year until midst keep both right none wherefore left toward five
yea stood taken sight same been four cut whose end twenty being much spoken turned surely turn cometh why told
laid seen full very only fall whole month ten such wilt seek fell little lay concerning lest can send though
save some between therein smote morning dwelt nothing begat above esther known ecclesiastes tell departed bear
thyself part while cubits long gone walk near doth timothy six tree hosea manner certain slew call having sat
ground yourselves within bare hold cannot third whosoever cry began number receive kept thirty arose far
whether moreover built knew till second wherein could gather build here throughout passed shew able received
lo ephesians born forty lamentations fifty east find look chronicles genesis saying made say has al its whoever
most its does ing re whomever whatever besides anything unless comes each ment became er ers tion tions ed de
con com someone en mes too dis selves ex ness ha hud ple disbe ishment pro yours get yourself ones instead
once ad ter un ly ta ar ap ma ac raf mer az ens ble ya myself per pre fol lam es ty self cept ah ra hu ger
tor createspace paragraph t ll chapter bk10 bk1 bk11 bk2 kings psalms samuel jeremiah isaiah ezekiel things
luke john job matthew joshua peter hebrews solomon saul daniel everything)
# File.exists? is deprecated and was removed in Ruby 3.2; File.exist? is the supported form.
unless File.exist?(book)
  raise "\n\n\tPlease enter a valid input file path!" +
        "\n\tUsage: ruby word_count.rb <book_filename> <number_of_results>\n\n"
end
File.readlines(book).each do |line|
  words = line.split(/\W+/).map(&:downcase)
  bag.add(words)
end
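# Drop blacklisted words and tokens with no non-digit character (pure numbers, empty strings),
# then sort by descending frequency.
results = bag.count
             .reject { |word, _| blacklist.include?(word) || word !~ /\D/ }
             .sort_by { |_, count| -count }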
results = results.first(number_of_results) if number_of_results > 0
# Output file name: <book>_word_count[_<number_of_results>].txt
prefix = File.basename(book, '.txt') + '_'
suffix = number_of_results > 0 ? "_#{number_of_results}" : ''
File.open("#{prefix}word_count#{suffix}.txt", 'w') do |f|
  results.each do |word, count|
    f << "#{word}: #{count}\n"
  end
end