require 'yaml'

class Store
  attr_accessor :name, :address, :city, :state, :zip, :phone_number
end

def list_of_stores
  files = Dir.glob('*.yml')
  stores = []
  files.each do |file|
    File.open(file, 'r') {|f| stores += YAML.load(f)}
  end
  return stores
end

def delete_duplicates(stores)
  seen = []
  marker = []
  counter = 0
  stores.each do |store|
    attr_array = [store.name[0..5], store.zip, store.phone_number, store.address[0..3]]
    if seen.include?(attr_array)
      marker << store
      counter += 1
    else
      seen << attr_array
    end
  end
  marker.each {|m| stores.delete(m)}
  return counter
end

def print_multi_stores(stores)
  seen = {}
  stores.each do |store|
    if seen.has_key?(store.name)
      seen[store.name][0] += 1
    else
      seen[store.name] = [1, store.state]
    end
  end
  seen = seen.sort_by {|k,v| v[0]}
  seen.reverse!
  puts ""
  puts "multi stores"
  puts "-------------"
  #sorting converts hash to array of arrays
  seen.each do |k,v|
    num, state = v
    puts num.to_s + "--" + state + "--" + k if num > 1
  end
  puts "-------------"
  puts ""
end

stores = list_of_stores
stores.sort_by! {|s| s.zip}
puts stores.length
puts "deleted " + delete_duplicates(stores).to_s + " duplicate stores"
print_multi_stores(stores)
I'm not sure; I guess it just comes down to what you need. I was just messing around with a different way of building up that hash and noticed it. I'm still playing with it. :-)
So, here are my mods: https://gist.github.com/923754
At first, I was focused on the block at line 45 in your code. Often in Ruby, you can eliminate that whole "if the key doesn't exist, initialize it, otherwise do something else" idiom by telling Ruby, ahead of time, what to do whenever it encounters a key it hasn't seen. In this case, I'm passing in a block that I want it to run whenever it encounters a new key. That block, in turn, creates yet another hash that will initialize any new key's value to 0. So calling seen['New Store']['GA'] += 1 will automagically create something like { 'New Store' => { 'GA' => 1 } } without us explicitly initializing either hash.
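For anyone who doesn't want to click through, here is a minimal sketch of that nested-default idea. It is not copied from the linked gist: it assumes the block is handed to Hash.new, and the store names and states are made-up sample data.

```ruby
# Outer hash: the block runs the first time a store name is looked up
# and stores a fresh inner hash under that name.
# Inner hash: Hash.new(0) makes any unseen state start at a count of 0.
seen = Hash.new { |hash, name| hash[name] = Hash.new(0) }

# Hypothetical sample data: [store name, state] pairs.
sample = [
  ['New Store', 'GA'],
  ['New Store', 'GA'],
  ['New Store', 'FL'],
  ['Corner Shop', 'GA']
]

# No has_key? check needed; both levels initialize themselves.
sample.each do |name, state|
  seen[name][state] += 1
end

p seen
```

One thing to watch for: simply reading seen['Typo'] also creates an empty inner hash under that key, so use seen.has_key? or seen.fetch when you only want to look.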
Dave Brady actually posted a screencast with more detail on this just the other day: http://www.heartmindcode.com/blog/2011/04/creating-ruby-hashes/ (There's also a follow-up with JEG2 on Dave's site that's worth watching.)
It ended up being a bit messier than I originally intended since I decided to keep a separate count for each state, as well.
I think you could use 1.9's uniq! with a block to simplify your delete_duplicates method: https://gist.github.com/932961
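Roughly what that could look like, as a sketch rather than the contents of the linked gist. SampleStore is a hypothetical stand-in for the Store class, and the sample records are invented; the block returns the same four-part key the original method builds by hand.

```ruby
# Hypothetical stand-in for Store, limited to the fields the key uses.
SampleStore = Struct.new(:name, :address, :zip, :phone_number)

# Removes duplicates in place and returns how many were dropped.
# uniq! keeps the first element for each distinct block result.
def delete_duplicates(stores)
  before = stores.length
  stores.uniq! { |s| [s.name[0..5], s.zip, s.phone_number, s.address[0..3]] }
  before - stores.length
end

stores = [
  SampleStore.new('Acme Hardware', '12 Main St', '30301', '555-0100'),
  SampleStore.new('Acme Hardware Inc', '12 Main Street', '30301', '555-0100'),
  SampleStore.new('Corner Shop', '9 Oak Ave', '30302', '555-0199')
]

removed = delete_duplicates(stores)
puts "deleted " + removed.to_s + " duplicate stores"
```

Note that uniq! returns nil when nothing was removed, which is why the count is computed from the lengths instead of from its return value.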
You're printing the state with the store count, but if you have a "franchise" in multiple states, you're only going to see the first state encountered. Is that what you meant to do?
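To make that concrete, here is the same tallying logic from print_multi_stores run on a made-up franchise ('Acme' is hypothetical) with one store in GA and two in FL:

```ruby
# Same if/else tally as in print_multi_stores, applied to
# hypothetical [name, state] pairs instead of Store objects.
seen = {}
[['Acme', 'GA'], ['Acme', 'FL'], ['Acme', 'FL']].each do |name, state|
  if seen.has_key?(name)
    seen[name][0] += 1          # only the count changes
  else
    seen[name] = [1, state]     # state is recorded once, on first sight
  end
end

count, state = seen['Acme']
puts count.to_s + "--" + state + "--Acme"
```

The count comes out as 3 but the state stays 'GA', even though most of the stores are in FL.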