@Mic92
Created February 5, 2012 16:03
scrape surfmusic
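#!/usr/bin/env ruby
# Scrapes radio station names and stream URLs from surfmusik.de
# and prints one SQL INSERT statement per station.
require 'nokogiri'
require 'open-uri'

# Follow every station link on a country page and print its name and stream URL.
def scrape(country)
  # URI.open is used because Kernel#open no longer fetches URLs on Ruby 3.0+.
  doc = Nokogiri::HTML(URI.open("http://www.surfmusik.de/land/#{country}.html"))
  doc.css("td.home1 a.navil").each do |link|
    # Only follow links that point at a station page.
    next unless link['href'] =~ /\/radio\//

    title = ""
    url = ""
    site = Nokogiri::HTML(URI.open(link['href']))
    # Use a distinct block variable so the assignment reaches the outer `title`.
    site.css("title").each do |title_node|
      title = title_node.content
    end
    site.css("a").each do |stream_link|
      url = stream_link['href'] if stream_link.content =~ /Externer Player Stream/
    end
    # Double any single quotes so values containing an apostrophe stay valid SQL.
    name = title.to_s.gsub("'", "''")
    stream = url.to_s.gsub("'", "''")
    puts "INSERT INTO radiostations (name, url) VALUES ('#{name}', '#{stream}');"
  end
end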

# Walk a continent index page and scrape every country linked from it.
def getCountryList(urlEnding)
  doc = Nokogiri::HTML(URI.open("http://www.surfmusik.de/#{urlEnding}.htm"))
  doc.css("td.home a.navil").each do |land_link|
    # Country pages look like land/<name>.html; capture <name>.
    next unless land_link["href"] =~ /land\/(\w*)\.html/
    scrape($1)
  end
end

# Continent index pages, in the order they are scraped.
%w[euro afrika amerika asien ozean staaten].each do |region|
  getCountryList(region)
end
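
The country filter in `getCountryList` depends entirely on the pattern `/land\/(\w*)\.html/`, so it may help to see what that pattern accepts. The sketch below is a rough illustration only: `country_slug` and `SAMPLE_HREFS` are hypothetical names, and the sample hrefs are made up rather than taken from the live site. It uses the standard library alone, so it runs without network access or Nokogiri.

```ruby
# Hypothetical hrefs, invented for illustration; not taken from surfmusik.de.
SAMPLE_HREFS = [
  "land/deutschland.html",                   # relative country link
  "http://www.surfmusik.de/land/japan.html", # absolute country link
  "radio/example.html",                      # station link, not a country
  "land/sri-lanka.html",                     # hyphen is not matched by \w
  "land/.html",                              # \w* also matches an empty name
  nil                                        # anchor without an href
]

# Return the country name captured by the same pattern getCountryList uses,
# or nil when the href is missing or does not point at a country page.
def country_slug(href)
  match = /land\/(\w*)\.html/.match(href.to_s)
  match && match[1]
end

SAMPLE_HREFS.each do |href|
  puts "#{href.inspect} -> #{country_slug(href).inspect}"
end
```

Two cases stand out: a name containing a hyphen is skipped, and an empty name is accepted because `\w*` matches zero characters. If either matters for the pages being scraped, the pattern would need adjusting, for example to `[\w-]+`; whether the site actually uses such names is not something this sketch verifies.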