@chorn
Created March 22, 2012 13:03
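#!/usr/bin/env ruby
# Print every text blurb on the sponsor page that mentions a target word.
require 'nokogiri'
require 'open-uri'
require 'ap' # awesome_print gem

# URI.open is required on Ruby 3.0+, where Kernel#open no longer fetches URLs.
doc = Nokogiri::HTML(URI.open('http://barcamproc.org/sponsor/'))
targets = ['money', 'fall']
found_nodes = []
found_blurbs = []

# Walk every element except the <html> and <body> wrappers.
doc.xpath('//*').each do |node|
  next if node.path == '/html' || node.path == '/html/body'

  # Case-insensitive match against each target; duplicates are removed below.
  targets.each do |target|
    if node.content =~ /#{target}/i
      found_nodes << node
    end
  end
end

# Collapse runs of whitespace so each blurb prints on one line.
found_nodes.uniq.each do |node|
  found_blurbs << node.content.strip.gsub(/\s+/, ' ') if node.content
end

ap found_blurbs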