Thomas Ebermann plotti
@plotti
plotti / gist:2972364
Created June 22, 2012 12:07
Solr results
Person 1 archaeologymag. Time per person: 7.783559.
Person 2 ArchaeologyDN. Time per person: 1.097274.
Person 3 archaeologynews. Time per person: 0.085687.
Person 4 NEAarchaeology. Time per person: 0.184254.
Person 5 CurrentArchaeo. Time per person: 0.116461.
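Timing lines in this format can be produced by wrapping each loop iteration in a timer. The following is only a sketch under that assumption: the helper name `time_per_item` is hypothetical, and a monotonic clock is used in place of `Time.now` to make the measurement robust against clock adjustments.

```ruby
# Hypothetical helper: runs the given block once per item and returns
# one report line per item in the "Person N name. Time per person: X." format.
def time_per_item(items)
  items.each_with_index.map do |item, index|
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield item # the per-person work (e.g. a search) goes here
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    format("Person %d %s. Time per person: %.6f.", index + 1, item, elapsed)
  end
end

# Usage: the block stands in for the real per-person query.
puts time_per_item(%w[archaeologymag ArchaeologyDN]) { |name| name.downcase }
```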
plotti / gist:2972340
Created June 22, 2012 12:00
Solr at search
def find_solr_at_connections
  users = {}
  persons.each do |person|
    users[person.id] = person.username
  end
  values = []
  i = 0
  persons.each do |person|
    i += 1
    t1 = Time.now
plotti / gist:2972327
Created June 22, 2012 11:57
Searchable feeds
# Indexing the text field with Solr
searchable do
  text :text
  integer :person_id
  time :published_at
  text :retweeters do
    retweet_ids.collect { |retweet| retweet[:id] }
  end
end
plotti / gist:2972316
Created June 22, 2012 11:56
Delayed Job helper
# Helper method that is used when creating delayed jobs.
def wait_for_jobs(jobname)
  continue = true
  while continue
    found_pending_jobs = 0
    Delayed::Job.all.each do |job|
      if job.handler.include? jobname
        found_pending_jobs += 1
      end
      if job.attempts >= 4
plotti / gist:2972028
Created June 22, 2012 10:47
Delayed Jobs
# A version that uses delayed jobs to split the work among many workers.
# Deprecated because querying the DB turned out to be the bottleneck.
# Same as find all valued connections, but intended to be faster.
def find_delayed_at_connections(friend = true, follower = false, category = false)
  usernames = persons.collect { |p| p.username }
  persons.each do |person|
    Delayed::Job.enqueue(AggregateAtConnectionsJob.new(person.id, self.id, usernames))
  end
  wait_for_jobs("AggregateAtConnectionsJob")
  self.return_delayed_at_connections
plotti / gist:2971911
Created June 22, 2012 10:27
Finding at-connections
def find_at_connections
  values = []
  usernames = persons.collect { |p| p.username }
  i = 0
  persons.each do |person|
    t1 = Time.now
    i += 1
    person.feed_entries.each do |tweet|
      #puts "#Analyzing tweet #{tweet.id}"
      usernames.each do |tmp_user|
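The preview above is cut off inside the innermost loop, so what it does with each tweet/username pair is not shown. As a rough illustration only, and assuming an at-connection means "a tweet by one person contains @username of another person in the group", the nested loop could be sketched as follows. The method name `count_at_connections` and the plain hash standing in for `persons` and `feed_entries` are hypothetical.

```ruby
# Hypothetical stand-in for the persons / feed_entries models:
# a hash mapping each username to an array of that user's tweet texts.
# Returns a hash of [author, mentioned_user] => number of tweets with that mention.
def count_at_connections(tweets_by_user)
  usernames = tweets_by_user.keys
  counts = Hash.new(0)
  tweets_by_user.each do |author, tweets|
    tweets.each do |text|
      # Extract every @handle once per tweet, compared case-insensitively.
      mentioned = text.scan(/@(\w+)/).flatten.map(&:downcase)
      usernames.each do |candidate|
        next if candidate == author
        counts[[author, candidate]] += 1 if mentioned.include?(candidate.downcase)
      end
    end
  end
  counts
end

tweets = {
  "alice" => ["hi @bob", "@bob @carol hello", "no mention"],
  "bob"   => ["@Alice thanks"],
  "carol" => []
}
count_at_connections(tweets).each do |(from, to), n|
  puts "#{from} -> #{to}: #{n}"
end
```

Scanning each tweet once for handles avoids running one substring search per username, which matters because the loop is quadratic in group size.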
plotti / gist:2916063
Created June 12, 2012 08:02
Comparing the rankings of the members
# Second step: output the final partitions according to the ranking of the persons in their groups
seen_persons = []
final_candidates = {}
seen_projects = []
@@communities.each do |community|
  project = Project.find(community)
  if merged[project.name].nil?
    project_members = members[project.name]
    project_name = project.name
  else
plotti / gist:2911079
Created June 11, 2012 16:29
Merged Categories
airlines_aviation
army_military_veteran
astronomy_physics
beauty_fashion_shopping
career_employment
charity_philanthropy
etsy_handmade
exercise_fitness
finance_economics
healthcare_medicine
plotti / gist:2910991
Created June 11, 2012 16:22
Smart interest groups matching
# Define how many list places should be considered
MAX = 200
# Threshold: the share of common members at which categories are merged (e.g. 0.2 = 20% of members are shared)
THRESHOLD = 0.2
outfile = CSV.open("data/partitions#{MAX}_#{THRESHOLD}.csv", "wb")
final_partition = CSV.open("data/final_partitions#{MAX}_#{THRESHOLD}.csv", "wb")
outfile << ["Name","Original Category", "Original Category Place", "Assigned Category", "Assigned Category Place", "Competing Categories", "Details"]
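The preview does not show how the threshold is applied. As a minimal sketch, assuming "shared members" is measured relative to the smaller of two member lists (an assumption, not something the gist states), candidate pairs for merging could be found like this. The names `overlap` and `mergeable_pairs` are hypothetical.

```ruby
require "set"

# Share of the smaller list's members that also appear in the other list.
# Measuring against the smaller list is an assumption made for this sketch.
def overlap(a, b)
  sa = a.to_set
  sb = b.to_set
  return 0.0 if sa.empty? || sb.empty?
  (sa & sb).size.to_f / [sa.size, sb.size].min
end

# Returns all category pairs whose member overlap reaches the threshold.
def mergeable_pairs(categories, threshold = 0.2)
  categories.keys.combination(2).select do |x, y|
    overlap(categories[x], categories[y]) >= threshold
  end
end

categories = {
  "astronomy" => %w[a b c d e],
  "physics"   => %w[d e f g h],
  "etsy"      => %w[x y z]
}
mergeable_pairs(categories).each { |x, y| puts "#{x}_#{y}" }
```

With these toy lists only astronomy and physics share members (2 of 5), so they form the single pair above the 0.2 threshold.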
plotti / gist:2910941
Created June 11, 2012 16:12
List collected for the keyword hollywood
Username,Followers,List Count,URI
aplusk,9777167,83,http://www.twitter.com/aplusk
katyperry,16107303,80,http://www.twitter.com/katyperry
ladygaga,20799180,79,http://www.twitter.com/ladygaga
TheEllenShow,9922576,75,http://www.twitter.com/TheEllenShow
RyanSeacrest,6176522,73,http://www.twitter.com/RyanSeacrest
tomhanks,3587554,73,http://www.twitter.com/tomhanks
ActuallyNPH,2908961,68,http://www.twitter.com/ActuallyNPH
ConanOBrien,5213126,63,http://www.twitter.com/ConanOBrien
Oprah,10431484,62,http://www.twitter.com/Oprah