Created November 12, 2013 07:00
I used this to split every cell of a CSV of bank transactions into words and count them, to find the places where I made purchases most often, such as a certain gas station, or how many times I used an ATM.
require 'csv'

# Capitalize each word and lowercase the rest, so "ATM" and "Atm" count as one word.
class String
  def titleize
    split(/(\W)/).map(&:capitalize).join
  end
end

# Strip digits and punctuation from a cell, normalize case, and split it into words.
def separate_word(word)
  word.gsub(/[0-9\-\/\\\*\#()&'.]/, "").titleize.split(" ")
end

# Keep words seen at least `min` times and at least `length` characters long,
# sorted from most to least frequent.
def sort_word_limits(word, min, length)
  word.select { |k, v| v >= min && k.length >= length }.sort_by { |_k, v| v }.reverse
end

# Length of the longest word, used to right-align the output (0 if there are no words).
def word_size(word)
  word.keys.map(&:length).max || 0
end

abort("Usage: ruby #{$PROGRAM_NAME} FILE.csv") if ARGV.empty?
file_name = ARGV[0]

# CSV.read closes the file when done; compact drops empty cells, which parse as nil.
inputfile = CSV.read(file_name).flatten.compact.map { |cell| separate_word(cell) }

min_word_length = 4
min_count = 15

# Hash.new(0) makes every unseen word start at a count of zero.
word_count = Hash.new(0)
inputfile.each do |items|
  items.each { |item| word_count[item] += 1 }
end

big_word = word_size(word_count)
final_count = sort_word_limits(word_count, min_count, min_word_length)
final_count.each { |k, v| puts "#{k.rjust(big_word)} #{v}" }

######### Example Input from CSV ########
# Tremors Riverside Ca
# Shell Oil 61635192007 Cabazon Ca
# 2995 Iowa Ave. Stater 114Riverside Ca 3319
# ATM Withdrawal - 01/19 3060046 5797 North Victglobal Cashighland Ca 3319
# Non-Wells Fargo ATM Transaction Fee
# UCR Bookstore Riverside Ca
# Circle Bar Santamonica Ca
# ATM Withdrawal - 01/25 Scad4057 *Corona-01 B Of A Corona Ca 3319
# 1294 Universityarco Payporiverside Ca 3319
# Non-Wells Fargo ATM Transaction Fee
# Exxonmobil34 07865660 Riversid Ca
# 2650 N Main St L And L Mariverside Ca 3319
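
To see what the cleanup and counting steps do to real rows, here is a minimal self-contained sketch that runs them on four of the example lines without needing a CSV file. The names `STRIP`, `words_in`, and `count_words` are hypothetical and chosen for this illustration, and the thresholds are lowered to 2 occurrences and 4 characters so the tiny sample produces output. Only the regex and the capitalize step are taken from the script.

```ruby
# Same character class the script strips: digits and common punctuation.
STRIP = /[0-9\-\/\\\*\#()&'.]/

# Hypothetical helper: clean one cell and return its normalized words.
def words_in(cell)
  cell.gsub(STRIP, "").split(/(\W)/).map(&:capitalize).join.split(" ")
end

# Hypothetical helper: count every word across a list of cells.
def count_words(cells)
  cells.each_with_object(Hash.new(0)) do |cell, counts|
    words_in(cell).each { |word| counts[word] += 1 }
  end
end

rows = [
  "Shell Oil 61635192007 Cabazon Ca",
  "Non-Wells Fargo ATM Transaction Fee",
  "UCR Bookstore Riverside Ca",
  "Non-Wells Fargo ATM Transaction Fee"
]

counts = count_words(rows)

# Lowered thresholds for the sample: at least 2 occurrences, at least 4 letters.
counts.select { |word, n| n >= 2 && word.length >= 4 }
      .sort_by { |_word, n| -n }
      .each { |word, n| puts "#{word} #{n}" }
```

Two things are worth noticing. Because the hyphen is stripped before splitting, "Non-Wells" is counted as the single word "Nonwells". And short tokens such as "Ca", "Atm", and "Fee" are counted but then dropped by the minimum word length, so a lower length threshold is needed if ATM usage is what you want to see.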