Created May 17, 2011 13:22
Dodgy script to fetch Twitter IDs from OAuth tokens using threads
# script/runner lib/get_twitter_user_ids.rb
require 'json'

class ErrorFourOhOne < StandardError; end

class ThreadWorker < Thread
  def initialize(twitter_auths, ids)
    @ids = ids
    @twitter_auths = twitter_auths
    super do
      # Array#pop is effectively atomic under MRI's GIL, which is all this
      # one-off script relies on
      get_id_for(@twitter_auths.pop) while @twitter_auths.present?
    end
  end

  def get_id_for(twitter_auth)
    return if twitter_auth.state != Twitter::STATE_ACCESS
    response = twitter_auth.access_token.get("/account/verify_credentials.json")
    raise ErrorFourOhOne if response.code.to_i == 401
    raise "VerifyResponse#{response.code.to_i}" unless response.code.to_i == 200
    result = JSON.parse(response.body)
    @ids[twitter_auth] = result['id']
  rescue ErrorFourOhOne
    # 401: the user has revoked access, so record the failure
    @ids[twitter_auth] = nil
  rescue Exception => e
    raise if e.is_a?(Interrupt)
    # Making the (slightly ropey) assumption that we have hit some sort of
    # rate limit, or timed out. Push the record back and try it again.
    @twitter_auths << twitter_auth
    puts "Sleeping on #{twitter_auth.id}"
    sleep(1)
  end
end

twitter_auths = Twitter::Auth.all(:conditions => 'twitter_id IS NULL AND state = "access"')
threads = []
ids = {}

50.times do
  threads << ThreadWorker.new(twitter_auths, ids)
end

# Flush results as the workers produce them; iterate over a snapshot of the
# keys so we can delete from the hash safely, and keep going until every
# worker has finished and the hash is drained.
loop do
  ids.keys.each do |auth|
    if (twitter_id = ids.delete(auth))
      auth.update_attribute(:twitter_id, twitter_id)
      puts "Twitter ID Added: #{twitter_id}"
    else
      puts "Fail: #{auth.id}"
    end
  end
  break if threads.none?(&:alive?) && ids.empty?
  sleep(0.5)
end
I only needed to run this once, while migrating new functionality, so performance and reliability weren't really a consideration. It worked through ~40,000 records in about 5 minutes.
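For what it's worth, here's a minimal sketch of the same drain-a-shared-collection worker pattern using Ruby's built-in Queue, which is thread-safe rather than leaning on MRI's GIL the way Array#pop does above. The item values and worker count are illustrative only, not part of the original script.

require 'thread'

queue   = Queue.new
results = Queue.new

# Hypothetical work items; in the script above these are Twitter::Auth records
(1..100).each { |n| queue << n }

workers = 10.times.map do
  Thread.new do
    loop do
      begin
        item = queue.pop(true) # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        break                  # queue drained, this worker is done
      end
      results << [item, item * 2] # stand-in for the verify_credentials call
    end
  end
end

workers.each(&:join)
puts "Processed #{results.size} items"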