Created
August 13, 2012 15:33
BBB Parser
require 'nokogiri'
require 'open-uri'
require 'pp'

# Product page to scrape.
url = "http://www.bedbathandbeyond.com/product.asp?sku=40354439&"

# URI.open is the open-uri entry point; Kernel#open no longer opens URLs as of Ruby 3.0.
doc = Nokogiri::HTML(URI.open(url))

data = {}
data[:product_title] = doc.css('h1.producttitle').first.children.to_s

# Product description with embedded <script> blocks removed (non-greedy, one block at a time).
data[:product_info] = doc.css('.ppprodinfo').first.children.to_s.gsub(/<script.*?>[\s\S]*?<\/script>/i, '')

# Collect the hidden form inputs (price, SKU, ...) into a symbol-keyed hash.
product_details_table = {}
doc.css('table#Table10 input[type="hidden"]').each do |i|
  name = i.attributes['name'].to_s
  value = i.attributes['value'].to_s
  product_details_table[name.to_sym] = value
end
data[:product_details_table] = product_details_table

pp data

# $ ruby main.rb
# {:product_title=>"5-Piece Ceramic Spoon Set with Acacia Tray",
#  :product_info=>
#   "\r\n\t\t\t\r\n\t\t\t<h1 class=\"producttitle\">5-Piece Ceramic Spoon Set with Acacia Tray</h1>\n<br><br>This 5-piece set includes one 10 1/5\" L acacia tray and four 3 1/2\" ceramic spoon ladles. Ladles make it easy to serve varying sauces to guests or even provide each guest with their own condiment bowl. Naturally stunning and totally fresh, this acacia wood serving set tops the table with easy modern style. Rich wood grain contrasts crisp white ceramic bowls that are perfect for serving different salsas, sundae toppings, condiments, dipping sauces and more. Tray features indentations to keep bowls safely in position for serving. Hand wash. Imported.\r\n",
#  :product_details_table=>
#   {:price0=>"24.99",
#    :STORE0=>"",
#    :RN=>"1003",
#    :retloc=>"product.asp",
#    :retarg=>"SKU=40354439",
#    :SKU=>"40354439",
#    :COL=>"-99",
#    :shipSurchAttrValueId=>"0",
#    :numSKULines=>"1",
#    :SKU0=>"40354439"}}
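
The `gsub` call that strips `<script>` blocks from the product info depends on how the tag body is matched. A greedy body such as `[\s\S]*` runs from the first opening tag to the last closing tag, so any markup sitting between two script blocks is removed along with them. The sketch below compares that with a non-greedy match. It uses only the standard library; the `strip_scripts` helper and the sample markup are made up for illustration and are not part of the gist.

```ruby
# Hypothetical helper (not in the original gist): removes each
# <script>...</script> block separately using a non-greedy body match.
def strip_scripts(html)
  html.gsub(/<script\b[^>]*>[\s\S]*?<\/script>/i, '')
end

# Made-up sample markup: real content sits between two script blocks.
sample = '<p>A</p><script>var x = 1;</script><p>B</p><script>var y = 2;</script><p>C</p>'

# Greedy body match: spans from the first <script> to the last </script>.
greedy = sample.gsub(/<script.*?>[\s\S]*<\/script>/i, '')

# Non-greedy body match: each script block is removed on its own.
non_greedy = strip_scripts(sample)

puts greedy      # the <p>B</p> element is lost
puts non_greedy  # all three <p> elements survive
```

If the page only ever contains one script block inside `.ppprodinfo`, both patterns give the same result; the non-greedy form is simply the safer default.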