@ssig33
Created October 7, 2010 13:21
# coding: utf-8
require "mechanize"

# Hatena account credentials (fill these in before running)
NAME = ""
PASS = ""

@alice = Mechanize.new
@alice.user_agent_alias = 'Mac Safari'

@ids  = []   # user ids queued for crawling
@done = []   # user ids already processed
# Log in to Hatena, then warm up the session on the portal pages
@alice.get URI.parse "http://l.hatena.ne.jp/"
login = @alice.get URI.parse "http://www.hatena.ne.jp/login?auto=0&backurl=http%3A%2F%2Fl.hatena.ne.jp%2F"
form = login.forms.first
form["name"] = NAME
form["password"] = PASS
@alice.submit(form)
@alice.get URI.parse "http://www.hatena.ne.jp/"
@alice.get URI.parse "http://l.hatena.ne.jp/"

@count = 1   # running count of crawled users, for progress output
# Visit a user's relation page and submit its first form, then collect the
# usernames linked from the user's /fr page into the crawl queue.
def crawl id
  puts "process #{id} #{@count}"
  @count += 1

  r = @alice.get URI.parse "http://n.hatena.ne.jp/#{id}/relation"
  form = r.forms.first
  @alice.submit(form)

  f = @alice.get URI.parse "http://n.hatena.ne.jp/#{id}/fr"
  links = f.search("//a[@class='username']")
  links.each do |l|
    @ids << l.attributes["href"].to_s.split("/").last
  end

  # Keep the queue free of ourselves, duplicates, and already-crawled ids
  @ids.delete NAME
  @ids.uniq!
  @done << id
  @ids.delete id
  @done.each { |d| @ids.delete d }
end
crawl "chels-axispowers35"
while true
@ids.each do |i|
begin
crawl i
rescue
end
end
end