@rockpapergoat
Created March 21, 2012 02:56
scraping with nokogiri
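#!/usr/bin/env ruby
require 'nokogiri'
# Parse the sample page; File.read avoids leaving an open file handle behind.
doc = Nokogiri::HTML(File.read("test.html"))
# Print the text of each <strong> directly inside a div whose class is "level".
doc.xpath('//div[@class="level"]/strong').each { |node| puts node.text }
# Print the whitespace-trimmed text of every div whose class is "level".
doc.xpath('//div[@class="level"]').each { |node| puts node.text.strip }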
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>
blorg
</title>
<meta name="generator" content="TextMate http://macromates.com/">
<meta name="author" content="nate"><!-- Date: 2012-03-20 -->
</head>
<body>
<div id="it_status">
<div class="level">
<strong>ALERT-1</strong>
</div>
<div class="level">
bob
</div>
</div>
</body>
</html>