@aurelian
Created January 3, 2011 10:11
nokogiri with pipes
#!/usr/bin/env ruby
#
# Nokoe: Nokogiri with pipes
#
# Requirements:
#
# gem install trollop
# gem install nokogiri
#
require 'rubygems'
require 'trollop'
require 'nokogiri'
require 'open-uri'
opts = Trollop::options do
  version "#{$0} - 1.0 (c) aurelian 2011-2012"
  banner <<-EOS
#{$0} - Nokogiri (http://nokogiri.org/) with pipes.
Usage: #{$0} <file> [options]
$ cat file.html | #{$0} --selector "//body" > foo.html
$ #{$0} --selector "nav#siteLinks" < file.html > foo.html
$ #{$0} --file "http://google.com" --selector "//body/a/*"
$ #{$0} --file "http://twitter.com" --irb # like `nokogiri http://twitter.com`
$ #{$0} http://twitter.com --irb # same as above, without using --file.
Options:
EOS
  opt :file, "filename/url to read in", :type => String
  opt :selector, "Nokogiri selector", :default => "/", :type => String
  opt :irb, "drops you to IRB, @doc holds the document", :default => false
  opt :encoding, "encoding", :default => "utf-8", :type => String
end
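# Prefer piped or redirected input. $stdin.stat.size is 0 for pipes on Linux,
# so test for a terminal instead and fall back to the file when stdin is empty.
text = $stdin.tty? ? "" : $stdin.read
if !text.empty?
  # Reattach stdin to the terminal so IRB can read from the keyboard.
  $stdin.reopen(File.open("/dev/tty", "r")) if opts[:irb]
elsif file = opts[:file] || ARGV.shift
  begin
    # URI.open (Ruby 2.5+) handles both URLs and local paths;
    # Kernel#open stopped accepting URLs in Ruby 3.0.
    text = URI.open(file).read
  rescue StandardError => error
    Trollop::die(:file, "#{file} must _exist_ dammit!\n\t#{error.message}")
  end
else
  Trollop::die :file, "cannot read incoming stream" # be creative.
end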
@doc = Nokogiri.parse(text, nil, opts[:encoding]).search(opts[:selector])
if opts[:irb]
  require 'irb'
  puts "-- starting irb session. @doc is your document."
  IRB.start
else
  $stdout << @doc
end
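
The script takes its input from stdin when something is piped or redirected in, and otherwise from the file or URL that was named. Below is a minimal, stdlib-only sketch of that selection logic, so it can be tried without trollop or nokogiri installed. `read_input` is a hypothetical helper, not part of the original script, and it assumes a local file path rather than a URL for the fallback.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper, not part of the original script.
# stdin: any IO-like object that responds to #tty? and #read
# path:  optional local file name to fall back to
def read_input(stdin, path = nil)
  # A terminal on stdin means nothing was piped or redirected in.
  piped = stdin.tty? ? "" : stdin.read.to_s
  return piped unless piped.empty?

  raise ArgumentError, "no piped input and no file given" if path.nil?
  File.read(path)
end

# Piped input wins, even when a file is also named.
puts read_input(StringIO.new("<p>piped</p>"), "ignored.html")

# Empty stdin falls back to the file.
Tempfile.create("nokoe") do |f|
  f.write("<p>from file</p>")
  f.flush
  puts read_input(StringIO.new(""), f.path)
end
```

Checking `tty?` rather than the size reported by `stat` is a design choice made here because pipes can report a size of 0 on some platforms; reading stdin and then testing for emptiness keeps the file fallback working when stdin is redirected from an empty source.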