William Flanagan wflanagan

@wflanagan
wflanagan / red_ex.rb
Created September 28, 2011 13:12
EM-based crawler
require 'simple_worker'
require 'eventmachine'
require 'em-http-request'
require 'nokogiri'
require 'aws'
require 'redis'
class RedEx < SimpleWorker::Base
  merge_gem 'em-redis'
rifice child
[1003950.944011] Killed process 1157 (bundle) total-vm:19344kB, anon-rss:4756kB, file-rss:0kB
[1003965.552027] Out of memory: Kill process 1223 (bundle) score 4 or sacrifice child
[1003965.552027] Killed process 1223 (bundle) total-vm:19372kB, anon-rss:5024kB, file-rss:40kB
[1004036.584031] Out of memory: Kill process 1245 (bundle) score 4 or sacrifice child
[1004036.584031] Killed process 1245 (bundle) total-vm:19292kB, anon-rss:5168kB, file-rss:44kB
[1004057.621714] Out of memory: Kill process 1410 (bundle) score 4 or sacrifice child
[1004057.621739] Killed process 1410 (bundle) total-vm:19344kB, anon-rss:4944kB, file-rss:0kB
[1004111.392018] Out of memory: Kill process 1450 (bundle) score 4 or sacrifice child
[1004111.392018] Killed process 1450 (bundle) total-vm:19320kB, anon-rss:4844kB, file-rss:184kB
/Users/wflanagan/sites/marketfu/config/initializers/typhoeus_response_document_patch.rb:92: [BUG] Segmentation fault
ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin11.2.0]
-- control frame ----------
c:0047 p:0013 s:0183 b:0182 l:000060 d:000181 BLOCK /Users/wflanagan/sites/marketfu/config/initializers/typhoeus_response_document_patch.rb:92
c:0046 p:0015 s:0172 b:0172 l:000162 d:000171 BLOCK /Users/wflanagan/.rvm/gems/ruby-1.9.2-p290/gems/nokogiri-1.4.6/lib/nokogiri/xml/node_set.rb:239
c:0045 p:---- s:0169 b:0169 l:000168 d:000168 FINISH
c:0044 p:---- s:0167 b:0167 l:000166 d:000166 CFUNC :upto
c:0043 p:0023 s:0163 b:0163 l:000162 d:000162 METHOD /Users/wflanagan/.rvm/gems/ruby-1.9.2-p290/gems/nokogiri-1.4.6/lib/nokogiri/xml/node_set.rb:238
c:0042 p:0129 s:0159 b:0159 l:000060 d:000060 METHOD /Users/wflanagan/sites/marketfu/config/initializers/typhoeus_response_document_patch.rb:90
wflanagan / gist:1442731
Created December 7, 2011 13:06
complete links function
def complete_links(opts = {})
  return @complete_links unless @complete_links.blank?
  link_list = if opts[:limit]
    links.slice(0..opts[:limit])
  else
    links
  end
  @complete_links = []
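A note on the `0..opts[:limit]` range in the preview above: two-dot ranges in Ruby include their end, so the slice returns up to `limit + 1` links. Whether that is intended is not clear from the truncated gist. A minimal sketch of the difference, using a made-up `links` array:

```ruby
# Hypothetical data; the real `links` method is not shown in the gist.
links = %w[a b c d e]
limit = 2

inclusive = links.slice(0..limit)  # two-dot range includes index `limit`
exclusive = links.first(limit)     # takes at most `limit` elements

puts inclusive.inspect
puts exclusive.inspect
```

If exactly `limit` links are wanted, `first(limit)` or the three-dot range `0...limit` would be the usual choice.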
wflanagan / gist:1442732
Created December 7, 2011 13:06
complete links function
def complete_links(opts = {})
  return @complete_links unless @complete_links.blank?
  link_list = if opts[:limit]
    links.slice(0..opts[:limit])
  else
    links
  end
  @complete_links = []
/Users/wflanagan/.rvm/gems/ruby-1.9.2-p290@marketfu/bundler/gems/nokogiri-bd52db9bc49e/lib/nokogiri/xml/node.rb:830: [BUG] Segmentation fault
ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin11.2.0]
-- control frame ----------
c:0031 p:---- s:0136 b:0136 l:000135 d:000135 CFUNC :native_write_to
c:0030 p:0250 s:0129 b:0129 l:000128 d:000128 METHOD /Users/wflanagan/.rvm/gems/ruby-1.9.2-p290@marketfu/bundler/gems/nokogiri-bd52db9bc49e/lib/nokogiri/xml/node.rb:830
c:0029 p:0183 s:0119 b:0119 l:000118 d:000118 METHOD /Users/wflanagan/.rvm/gems/ruby-1.9.2-p290@marketfu/bundler/gems/nokogiri-bd52db9bc49e/lib/nokogiri/xml/node.rb:752
c:0028 p:0054 s:0110 b:0110 l:000109 d:000109 METHOD /Users/wflanagan/.rvm/gems/ruby-1.9.2-p290@marketfu/bundler/gems/nokogiri-bd52db9bc49e/lib/nokogiri/html/document.rb:64
c:0027 p:0149 s:0106 b:0106 l:000105 d:000105 METHOD /Users/wflanagan/.rvm/gems/ruby-1.9.2-p290@marketfu/bundler/gems/nokogiri-bd52db9bc49e/lib/nokogiri/xml/node.rb:769
c:0026 p:0159 s:0102 b:0102 l:000101 d
wflanagan / term_extract.rb
Created January 14, 2012 15:45
term extraction
def term_extract
  p4 = phrases(4)
  p3 = phrases(3)
  p3_delete_list = []
  p4_keys = p4.keys
  p3_keys = p3.keys
  p4_keys.each do |pkey|
    p3_keys.each do |check_key|
      if pkey.include?(check_key)
        p3_delete_list << check_key
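The preview above is cut off, so what happens to `p3_delete_list` is not shown. A minimal sketch of the same idea, assuming the goal is to drop every 3-word phrase that already appears inside a 4-word phrase, and assuming `phrases(n)` returns a hash of phrase to count. The helper name `prune_contained` and the sample data are hypothetical:

```ruby
# Assumption: p3 and p4 map phrases to counts, as phrases(n) appears to.
def prune_contained(p3, p4)
  longer_phrases = p4.keys
  # Keep only the 3-word phrases that no 4-word phrase contains.
  p3.reject do |phrase, _count|
    longer_phrases.any? { |longer| longer.include?(phrase) }
  end
end

# Made-up sample data for illustration only.
p4 = { 'cheap paintball guns for' => 3 }
p3 = {
  'cheap paintball guns'     => 3,
  'paintball guns for'       => 3,
  'wedding hair accessories' => 2
}

remaining = prune_contained(p3, p4)
puts remaining.inspect
```

Like the gist, this uses a plain substring check, so a phrase can also match across a word boundary (for example `art of war` inside `heart of war now`); padding both strings with spaces before comparing would avoid that.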
wflanagan / entity.rb
Created January 15, 2012 13:16
I am having a problem adding a hash to an Existing Mongoid Document
class Entity
  include Mongoid::Document
  # normal fields
  field :scores, :type => Hash, :default => {}
  def fix_scores
    self.scores = {} if scores.nil?
    self.save
  end
wflanagan / console example.txt
Created January 15, 2012 14:17
Example of error message
>> a = Entity.first
=> #<Entity _id: 4eaee481f575482978000005, created_at: 2011-10-31 18:10:09 UTC, updated_at: 2012-01-11 21:38:45 UTC, name: "http://paintballgunsforsale.blogspot.com/", twitter: nil, facebook: nil, wordsmaster_ids: [], keywords: ["cheap paintball guns for sale", "bridal jewelry", "flower wedding favors", "wedding hair accessories"], forums: [], directory_ids: [], references: [], mentions: [], profiles_retrieved: true, profile_url: "NOURL", company: nil, title: nil, host_names: nil, data: {}, demographics: {}, geographics: {}, description: "No description", ignore_project_ids: [], found_at_url: "http://paintballgunsforsale.blogspot.com/", new_profiles: [{"service"=>"blogger.com", "user_id"=>"02503684639969694595", "score"=>0.0}, {"service"=>"twitter", "user_id"=>nil, "score"=>1.0}], keyword_scores: nil, presence_score: nil, profiles: [{"_id"=>"4eaee481f575482978000004", "created_at"=>2011-11-05 19:43:59 UTC, "data"=>{}, "paintballgunsforsale#blogspot#com"=>"http://paintballgunsforsale.blogsp
wflanagan / firebug_output.txt
Created January 16, 2012 14:14
How do I get the value of a clicked span in JQuery?
>>> $(".todo_points").click(function () { var tod...d.split("_"); console.info(this.text()); });
[span#todo_points_4f0f429ef96f7a1b8e000002.todo_points, span#todo_points_4f0f5e72f96f7a24f8000001.todo_points, span#todo_points_4f122bf9f96f7a6af8000035.todo_points]