@fronx
Created April 14, 2009 07:04
require 'rubygems'
require 'hpricot'
require 'open-uri'
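class String
  # Removes anything that looks like an HTML tag; entities are left as-is.
  def strip_tags
    gsub(/<[^<>]+>/, '')
  end
end

# Wraps an HTML document and extracts elements by CSS or XPath selector.
class WebPage
  attr_reader :content

  # The block must return the raw HTML as a String.
  def initialize
    @content = yield
  end

  def self.from_file(path)
    new { File.read(path) }
  end

  def self.from_url(url)
    # URI.open is needed on Ruby 3.0+, where open-uri no longer patches Kernel#open.
    new { URI.open(url) { |f| f.read } }
  end

  # Returns the matching elements as strings, with markup removed unless strip_tags is false.
  def select(selector, strip_tags = true)
    (page / selector).map { |i| strip_tags ? i.to_s.strip_tags : i.to_s }
  end

  # Parses the content once and memoizes the Hpricot document.
  def page
    @page ||= Hpricot(@content)
  end
end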
# sample:
#
# p WebPage.from_url("http://www.google.com/analytics/features.html").select("h3")
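
The sample above needs network access and the hpricot gem. As a minimal offline sketch of just the tag-stripping step, the snippet below repeats the gist's `strip_tags` regex and applies it to inline markup. The HTML fragments are hypothetical and are not taken from the page in the sample.

```ruby
# Offline sketch: the same regex as String#strip_tags in the gist,
# applied to hypothetical inline fragments instead of a fetched page.
class String
  def strip_tags
    gsub(/<[^<>]+>/, '')
  end
end

html = '<h3 class="title">Track <em>visitors</em></h3>' # hypothetical sample markup
stripped = html.strip_tags
puts stripped # Track visitors

# Only tags are removed; HTML entities are not decoded.
entity = '<p>Fish &amp; chips</p>'.strip_tags
puts entity # Fish &amp; chips
```

Because the regex only matches `<...>` runs that contain no other angle brackets, it is a quick text filter rather than an HTML parser; entities and any stray `<` or `>` in text pass through unchanged.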