@sarchertech
Created April 17, 2011 03:21
Quick script I wrote to find stores with the same name that may be franchises.
require 'yaml'

# Plain record for one store; instances are deserialized from the YAML files.
class Store
  attr_accessor :name, :address, :city, :state, :zip, :phone_number
end

# Load every Store from all *.yml files in the current directory.
def list_of_stores
  stores = []
  Dir.glob('*.yml').each do |file|
    # Note: on Psych 4+ (Ruby 3.1+), YAML.load rejects custom classes such as
    # Store by default; YAML.unsafe_load may be needed there.
    File.open(file, 'r') { |f| stores += YAML.load(f) }
  end
  stores
end

# Remove stores whose name prefix, zip, phone number, and address prefix match
# an earlier store. Mutates the array and returns the number of stores removed.
def delete_duplicates(stores)
  seen = []
  marker = []
  stores.each do |store|
    attr_array = [store.name[0..5], store.zip, store.phone_number, store.address[0..3]]
    if seen.include?(attr_array)
      marker << store
    else
      seen << attr_array
    end
  end
  marker.each { |m| stores.delete(m) }
  marker.length
end

# Print every store name that occurs more than once, most frequent first,
# along with the state of the first store seen under that name.
def print_multi_stores(stores)
  seen = {}
  stores.each do |store|
    if seen.has_key?(store.name)
      seen[store.name][0] += 1
    else
      seen[store.name] = [1, store.state]
    end
  end
  # Sorting converts the hash to an array of [name, [count, state]] pairs.
  seen = seen.sort_by { |_name, v| v[0] }
  seen.reverse!
  puts ""
  puts "multi stores"
  puts "-------------"
  seen.each do |name, (num, state)|
    puts "#{num}--#{state}--#{name}" if num > 1
  end
  puts "-------------"
  puts ""
end

stores = list_of_stores
stores.sort_by! { |s| s.zip }
puts stores.length
puts "deleted #{delete_duplicates(stores)} duplicate stores"
print_multi_stores(stores)
@coty

coty commented Apr 20, 2011

I think you could use 1.9's uniq! with a block to simplify your delete_duplicates method: https://gist.github.com/932961
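
A minimal sketch of what that suggestion could look like. It is not the contents of the linked gist. It assumes Ruby 1.9.2 or later, where `Array#uniq!` accepts a block, and it uses a hypothetical `Struct` stand-in for `Store` plus made-up sample data so that it runs on its own.

```ruby
# Hypothetical stand-in for the gist's Store class, so the sketch is self-contained.
Store = Struct.new(:name, :address, :city, :state, :zip, :phone_number)

# Same key as the original script: name prefix, zip, phone number, address prefix.
# Array#uniq! with a block keeps the first element seen for each distinct key.
def delete_duplicates(stores)
  before = stores.length
  stores.uniq! { |s| [s.name[0..5], s.zip, s.phone_number, s.address[0..3]] }
  before - stores.length # number of duplicates removed
end

# Made-up sample data, for illustration only.
SAMPLE_STORES = [
  Store.new('Acme Hardware', '12 Main St', 'Springfield', 'IL', '62701', '555-0100'),
  Store.new('Acme Hardware Inc', '12 Main Street', 'Springfield', 'IL', '62701', '555-0100'),
  Store.new('Acme Hardware', '9 Oak Ave', 'Peoria', 'IL', '61602', '555-0199')
]

stores = SAMPLE_STORES.dup
puts "deleted #{delete_duplicates(stores)} duplicate stores"
```

Because `uniq!` returns `nil` when nothing was removed, the sketch computes the count from the array length before and after the call, not from the return value.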

@sarchertech

sarchertech commented Apr 21, 2011 via email

