@mostlyfine
Created January 5, 2011 16:28
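require 'open-uri'
require 'nokogiri'

# Encoding the original script assumed for the index pages. `$KCODE` has no
# effect on Ruby 1.9+, so the encoding is passed to Nokogiri explicitly.
ENCODING = 'EUC-JP'
INDEX_URL = 'http://www.dmm.co.jp/digital/videoa/-/actress/=/keyword=%s/'

# One index page per kana row; print "id,name,image_path" for every link found.
%w(a ka sa ta na ha ma ya ra wa).each do |w|
  # URI.open replaces Kernel#open for URLs (required on Ruby 3.0+).
  html = URI.open(INDEX_URL % w).read
  content = Nokogiri::HTML.parse(html, nil, ENCODING)
  content.xpath("//div[@class='act-box']/ul/li/a").each do |link|
    name = link.text
    url = link['href'].to_s
    id = url[/id=([0-9]+)/, 1] # numeric id embedded in the link URL
    img = link.at_xpath('img') # nil when the link has no thumbnail
    image_path = img ? img['src'].to_s : ''
    puts "#{id},#{name},#{image_path}"
  end
end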
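
The script prints each record as a bare comma-joined line, which becomes ambiguous if a name ever contains a comma. The sketch below is one possible way to isolate the id extraction and emit properly quoted rows with Ruby's `csv` library. The helper names `extract_id` and `csv_row`, and the sample URL and values, are hypothetical and not part of the original gist.

```ruby
require 'csv'

# Hypothetical helper (not in the original script): return the numeric id
# embedded in a link URL, or nil when the URL has no "id=" segment.
def extract_id(url)
  url[/id=([0-9]+)/, 1]
end

# Hypothetical helper: build one CSV row, quoting fields only when needed.
def csv_row(id, name, image_path)
  CSV.generate_line([id, name, image_path])
end

# Made-up sample values, used for illustration only.
sample_url = 'http://example.com/list/=/article=actress/id=12345/'
puts csv_row(extract_id(sample_url), 'Doe, Jane', 'http://example.com/12345.jpg')
```

In the scraper loop, `puts csv_row(id, name, image_path)` could stand in for the interpolated string, assuming quoted CSV is acceptable to whatever consumes the output.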