@simonreed
Created July 23, 2009 16:29
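require 'rubygems'
require 'hpricot'
require 'open-uri'
require 'yaml'

# Fetch the lastminute.com home page and collect the destination drop-down options.
# Note: on Ruby 3.0+, open-uri no longer patches Kernel#open; use URI.open there.
doc = Hpricot.parse(open("http://www.lastminute.com"))
destinations = doc.search("select#destination option")
puts "destinations.size #{destinations.size}"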

# Build a hash of cities keyed by display name, skipping placeholder options.
cities = {}
destinations.each do |link|
  unless link[:value].nil? || link[:value] == 'None'
    cities[link.inner_text] = { :name => link.inner_text, :IATA => link[:value] }
  end
end
puts "cities.size - #{cities.size}"
puts "cities.sort.first.inspect - #{cities.sort.first.inspect}"

# Flatten to an array of city hashes, ordered by city name.
clean_cities = cities.sort.map { |_name, city| city }
puts clean_cities.size

# Disabled experiment: look up the coordinates of the first city by following
# its Wikipedia page to the linked GeoHack page.
if false
  city = cities.sort.first
  puts city[0]
  wiki_page = "http://en.wikipedia.org/wiki/#{city[0]}"
  puts "Getting wiki page #{wiki_page}"
  wiki = Hpricot.parse(open(wiki_page))
  geohack_page = wiki.search('span.geo-dms').first.parent.parent[:href]
  puts "Getting geohack page #{geohack_page}"
  geohack = Hpricot.parse(open(geohack_page))
  latitude = geohack.search('span.latitude').first.inner_text
  longitude = geohack.search('span.longitude').first.inner_text
  puts latitude
  puts longitude
end

# Write the sorted city list to cities.yml.
File.open('cities.yml', 'w') { |f| f.write(YAML.dump(clean_cities)) }