Created
September 20, 2017 22:52
| module Enumerable | |
| def to_histogram | |
| each_with_object(Hash.new(0)) { |x, h| h[x] += 1 } | |
| end | |
| end | |
| class Bag | |
| def initialize | |
| @bag_of_objects = [] | |
| end | |
| def add(object) | |
| @bag_of_objects << object | |
| end | |
| def items | |
| @bag_of_objects.flatten | |
| end | |
| def count | |
| items.to_histogram | |
| end | |
| end | |
| bag = Bag.new | |
| book = ARGV[0].to_s | |
| number_of_results = ARGV[1].to_i | |
| # Based on Bible, Quran and "War and Peace" :) | |
| blacklist = %w( the and of to that in he shall unto for i his a they be is him not them 1 it with all thou thy was which | |
| my me said but ye their have will thee from as are when this out were upon by you up there hath then came had | |
| into on her come one we before your s also day an so shalt if at let go us went no even do now behold numbers | |
| saith therefore every these exodus because after our down or acts hast o make may over did what she who | |
| deuteronomy name thine proverbs among away any put thereof forth give neither take am days brought leviticus | |
| two according should whom know nor took thus bring mark word corinthians set more sent yet again judges like | |
| way mine about see own hundred spake done many saw words how thing years himself law thousand off cast given | |
| art three together than ever might those gave other seven through another would side romans first without high | |
| revelation nehemiah themselves where under year until midst keep both right none wherefore left toward five | |
| yea stood taken sight same been four cut whose end twenty being much spoken turned surely turn cometh why told | |
| laid seen full very only fall whole month ten such wilt seek fell little lay concerning lest can send though | |
| save some between therein smote morning dwelt nothing begat above esther known ecclesiastes tell departed bear | |
| thyself part while cubits long gone walk near doth timothy six tree hosea manner certain slew call having sat | |
| ground yourselves within bare hold cannot third whosoever cry began number receive kept thirty arose far | |
| whether moreover built knew till second wherein could gather build here throughout passed shew able received | |
| lo ephesians born forty lamentations fifty east find look chronicles genesis saying made say has al its whoever | |
| most does ing re whomever whatever besides anything unless comes each ment became er ers tion tions ed de | |
| con com someone en mes too dis selves ex ness ha hud ple disbe ishment pro yours get yourself ones instead | |
| once ad ter un ly ta ar ap ma ac raf mer az ens ble ya myself per pre fol lam es ty self cept ah ra hu ger | |
| tor createspace paragraph t ll chapter bk10 bk1 bk11 bk2 kings psalms samuel jeremiah isaiah ezekiel things | |
| luke john job matthew joshua peter hebrews solomon saul daniel everything) | |
| unless File.exist?(book) | |
| abort "\n\n\tPlease enter a valid input file path!" + | |
| "\n\tUsage: ruby word_count.rb <book_filename> <number_of_results>\n\n" | |
| end | |
| File.readlines(book).each do |line| | |
| words = line.split(/\W+/).map(&:downcase) | |
| bag.add(words) | |
| end | |
| results = bag.count.delete_if { |word, _| blacklist.include?(word) || word !~ /\D/ }.sort_by { |word, count| -count } | |
| results = results.first(number_of_results) if number_of_results > 0 | |
| prefix = File.basename(book, '.txt') + '_' | |
| suffix = number_of_results > 0 ? "_#{number_of_results}" : '' | |
| File.open("#{prefix}word_count#{suffix}.txt", 'w') do |f| | |
| results.each do |word, count| | |
| f << "#{word}: #{count}\n" | |
| end | |
| end |
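The counting step above (split each line into lowercase words, drop blacklisted and purely numeric tokens, sort by frequency) can also be sketched without the `Bag` class or the `Enumerable` monkey patch. This is only an illustrative variant, not the script's own method: `top_words` is a hypothetical helper name, `Enumerable#tally` assumes Ruby 2.7 or newer, and the stop words are converted to a `Set` on the assumption that faster membership checks matter for book-sized input.

```ruby
require 'set'

# Hypothetical helper (not part of the original script): counts words in an
# array of lines, skipping stop words, empty tokens and purely numeric tokens.
def top_words(lines, stop_words, limit = 0)
  stop = stop_words.to_set # Set lookups avoid scanning the whole list per word
  counts = lines
    .flat_map { |line| line.downcase.split(/\W+/) }
    .reject { |word| word.empty? || stop.include?(word) || word !~ /\D/ }
    .tally # requires Ruby 2.7+
  sorted = counts.sort_by { |_, count| -count }
  limit > 0 ? sorted.first(limit) : sorted
end

sample = ["The quick fox and the lazy dog", "The dog saw 2 foxes, the fox ran"]
top_words(sample, %w(the and), 2).each { |word, count| puts "#{word}: #{count}" }
```

As in the original, a `limit` of 0 returns every word. Words with equal counts may come out in either order, since `sort_by` is not guaranteed to be stable.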