@june29
Created May 9, 2009 11:20
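require "rubygems"
require "open-uri"
require "nokogiri"

BASE_URL   = "http://albertayu773.pixnet.net/album/set/14344750"
LINK_XPATH = "//div[@class='thumbList']/ul/li/div/span/a" # thumbnail links on an album page
IMG_XPATH  = "id('imageFrame')//img"                      # full-size image on a photo page
NEXT_XPATH = "//a[@class='pageNext']"                     # "next page" link
SLEEP_TIME = 3                                            # seconds to wait between photo requests

page = 1
loop do
  # URI.open replaces Kernel#open for URLs (required on Ruby 3.0+).
  html = Nokogiri::HTML(URI.open("#{BASE_URL}/#{page}"))

  html.xpath(LINK_XPATH).each do |link|
    href = link["href"] # attribute value as a String, not an Attr node
    next if href.nil?

    photo_page = Nokogiri::HTML(URI.open(href))
    img = photo_page.xpath(IMG_XPATH).first
    puts img["src"] if img # skip photo pages with no matching image
    sleep SLEEP_TIME
  end

  # Stop when the current album page has no "next" link.
  break if html.xpath(NEXT_XPATH).empty?

  page += 1
end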
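
The href values read from the thumbnail links are fetched as URLs. Whether the site emits them in absolute or relative form is not stated here, so the following is only a minimal, stdlib-only sketch of resolving either form against the album URL. The helper name `absolute_url` and the sample paths are hypothetical and not part of the original script.

```ruby
require "uri"

# Hypothetical helper (not in the original script): resolve an href found
# on a page against the URL of that page. Absolute hrefs pass through
# unchanged; relative ones are joined onto the page URL.
def absolute_url(page_url, href)
  URI.join(page_url, href.to_s).to_s
end

album = "http://albertayu773.pixnet.net/album/set/14344750"

# The two hrefs below are made-up examples, not paths taken from the site.
puts absolute_url(album, "/album/photo/1")
puts absolute_url(album, "http://example.com/a.jpg")
```

If relative links do turn up, the result of such a helper could be passed to the fetch call in place of the raw href.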