@nkpoid
Last active December 13, 2015 06:43
Find 友利奈緒 (Tomori Nao)
require "open-uri"
require "json"
require "nokogiri"
# Sending all the requests at once gets you flagged as a robot and blocked, so it is a good idea to put a sleep or similar between requests
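res = []
(0..49).each do |i| # Collect pages 1 through 50 of the Google search results
  url = "https://www.google.co.jp/search?q=%E5%8F%8B%E5%88%A9%E5%A5%88%E7%B7%92+site:twitter.com&hl=ja&start=#{i * 10}"
  html = URI.open(url) # Kernel#open stopped accepting URLs in Ruby 3.0; open-uri provides URI.open
  doc = Nokogiri::HTML.parse(html)
  # Each <cite> element holds the displayed URL of one search result
  doc.xpath("//cite").each do |node|
    res << node.child.text
  end
  # sleep 1
end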
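users = res.collect { |url|
  # Capture the first path segment after twitter.com (Twitter usernames are at most 15 word characters)
  url =~ /https:\/\/twitter\.com\/(\w{1,15})/
  username = $1
  next if !username
  # Skip non-profile paths; anchored so that names merely containing "search" are kept
  next if username =~ /\A(?:hashtag|search)\z/
  username
}.compact.uniq
p users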
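
The filtering step and the pause between requests can be pulled out into small helpers that are testable without touching the network. The following is only a sketch: the names `RESERVED_PATHS`, `extract_usernames`, and `each_offset` are hypothetical and do not appear in the original script, and the one-second default delay is an assumption, not a known safe rate for Google.

```ruby
# Hypothetical helpers; these names are not part of the original script.

# First path segments on twitter.com that are not user profiles.
RESERVED_PATHS = %w[hashtag search].freeze

# Extract unique Twitter usernames from a list of result URLs.
# Entries that are nil, not twitter.com URLs, or reserved paths are dropped.
def extract_usernames(urls)
  urls.map { |url|
    # String#[] with a regexp and a group index returns the capture or nil
    name = url.to_s[%r{https://twitter\.com/(\w{1,15})}, 1]
    name unless name.nil? || RESERVED_PATHS.include?(name)
  }.compact.uniq
end

# Yield the `start` offset for each result page (0, 10, 20, ...),
# sleeping `delay` seconds before every request except the first.
# The default delay is an assumed value.
def each_offset(pages, delay: 1.0)
  pages.times do |i|
    sleep delay if i.positive? && delay.positive?
    yield i * 10
  end
end
```

With these in place, the main loop could become `each_offset(50) { |start| ... }`, and the final step could become `p extract_usernames(res)`.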