Last active
August 29, 2015 14:21
Startup Weekend Hack
require "mechanize"

BASE_URL = "http://www.societe.com"
# {APE} and {DEP} are literal placeholders, substituted per request below.
SEARCH_URL = "#{BASE_URL}/cgi-bin/liste?ape={APE}&dep={DEP}"

# APE codes to query.
APE = %w{
  4120A 4120B
  4311Z 4312A 4312B 4313Z 4321A 4321B 4322A 4322B 4329A 4329B 4331Z 4332A 4332B 4332C 4333Z 4334Z 4339Z 4391A 4391B 4399A 4399B 4399C 4399D 4399E
}

# Record for one scraped company (defined here but not populated yet).
Artisan = Struct.new(:id, :name, :address, :postal_code, :city, :country, :ape_code, :created_at)

crawler = Mechanize.new

(1..99).each do |dep|
  puts "DEP: #{dep}"
  APE.each do |ape|
    puts "\t#{ape}: "
    search_url = SEARCH_URL.gsub("{DEP}", dep.to_s).gsub("{APE}", ape)
    # Follow every result link on the search page.
    crawler.get(search_url).search("#liste > a.linkresult").each do |link|
      societe_url = BASE_URL + link[:href]
      # The company page is fetched, but nothing is extracted from it yet.
      crawler.get(societe_url)
    end
  end
end