@ZogStriP
Last active August 29, 2015 14:21
Startup Weekend Hack
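require "mechanize"

BASE_URL = "http://www.societe.com"
# {APE} and {DEP} are placeholders, substituted per request in the loop below.
SEARCH_URL = "#{BASE_URL}/cgi-bin/liste?ape={APE}&dep={DEP}"

# Construction-related APE (NAF) activity codes to search for.
APE = %w{
  4120A 4120B
  4311Z 4312A 4312B 4313Z 4321A 4321B 4322A 4322B 4329A 4329B 4331Z 4332A 4332B 4332C 4333Z 4334Z 4339Z 4391A 4391B 4399A 4399B 4399C 4399D 4399E
}

# Record for one company. Declared here, but not yet populated by the loop below.
Artisan = Struct.new(:id, :name, :address, :postal_code, :city, :country, :ape_code, :created_at)

crawler = Mechanize.new

# For every department number and every APE code, fetch the result list
# and follow each company link.
(1..99).each do |dep|
  puts "DEP: #{dep}"
  APE.each do |ape|
    puts "\t#{ape}: "
    search_url = SEARCH_URL.gsub("{DEP}", dep.to_s).gsub("{APE}", ape)
    crawler.get(search_url).search("#liste > a.linkresult").each do |link|
      societe_url = BASE_URL + link[:href]
      # The company page is fetched, but its contents are not parsed yet.
      crawler.get(societe_url)
    end
  end
end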
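
The script builds each search URL by chaining `gsub` calls on a template and passes the department as a bare integer. A minimal sketch of an alternative is shown here: `build_search_url` is a hypothetical helper, not part of the original gist, and it assumes (untested against the site) that departments should be sent as two-digit codes such as `01`. It relies only on `URI.encode_www_form` from the Ruby standard library.

```ruby
require "uri"

# Hypothetical helper, not part of the original script.
# Assumption: the site expects zero-padded, two-digit department codes
# ("01".."09"), which is how French departments are usually written.
def build_search_url(ape, dep, base_url = "http://www.societe.com")
  # encode_www_form escapes the values and joins them with "&".
  query = URI.encode_www_form({ "ape" => ape, "dep" => format("%02d", dep) })
  "#{base_url}/cgi-bin/liste?#{query}"
end

puts build_search_url("4120A", 1)
# prints "http://www.societe.com/cgi-bin/liste?ape=4120A&dep=01"
```

If the site turns out to accept unpadded numbers, the `format` call can simply be replaced with `dep.to_s`, which matches what the original loop sends.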