@RX14
Created February 14, 2015 14:49
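require "bundler"
require "open-uri"
require "json"
require "benchmark"

# Load the gems listed in the Gemfile. Nokogiri is never required explicitly,
# so it has to come from here.
Bundler.require
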
# Build a hash for one download link: the link's visible text and its href.
def parse_link_xml(link_xml)
  {
    type: link_xml.text,
    link: link_xml["href"]
  }
end

# Build a hash for one resolution block: resolution label, filename, and links.
def parse_resolution_xml(res_xml)
  {
    resolution: res_xml.at_css('a[href="#"] > text()').text,
    filename: res_xml.at_css("span.dl-label").text,
    links: res_xml.css("span.ind-link > a").map(&method(:parse_link_xml))
  }
end

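# Build a hash for one episode. The title text is expected to look like
# "(<prefix>) <show> - <episode number>".
def parse_episode(episode_xml)
  title = episode_xml.at_css("text()").text
  match = /^\(.+?\) (.*) - (\d*)$/.match(title)
  # Fail with a readable message instead of a NoMethodError on nil.
  raise ArgumentError, "unexpected episode title: #{title.inspect}" unless match

  {
    show: match[1],
    ep_num: match[2].to_i,
    id: episode_xml["id"],
    resolutions: episode_xml.css("div.resolution-block.linkful").map(&method(:parse_resolution_xml))
  }
end
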
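# Fetch the latest-releases page, parse every episode, and print the result as JSON.
# URI.open is used because Kernel#open no longer handles URLs as of Ruby 3.0.
doc = Nokogiri::HTML(URI.open("http://horriblesubs.info/lib/latest.php"))
episodes = doc.css("div.episode").map(&method(:parse_episode))
puts JSON.pretty_generate(episodes)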