@caingougou
Created February 7, 2013 08:28
Fetch the URLs listed in v2ex member profiles
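#!/usr/bin/env ruby
# Print the links found on v2ex member profile pages for user IDs 1 to 2000.
require "httpclient"
require "nokogiri"

# Send a browser-like User-Agent with every request.
hc = HTTPClient.new(:agent_name => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.4) Firefox/1.5.0.4")

(1..2000).each do |i|
  puts "#{i}:"
  # Follow redirects so the request ends up on the profile page itself.
  res = hc.get("http://www.v2ex.com/uid/#{i}", :follow_redirect => true)
  doc = Nokogiri::HTML(res.body)
  # Anchors directly inside the second <div> of the second <div> under #Main.
  doc.xpath('//*[@id="Main"]/div[2]/div[2]/a').each do |link|
    puts link['href']
  end
rescue HTTPClient::BadResponseError => e
  # A non-success response (for example an unused ID) should not stop the run.
  warn "  skipped: #{e.message}"
end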