@sandys
Created October 18, 2012 10:04
Convert an HTML table to CSV using Ruby
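# run using ```rvm jruby-1.6.7 do jruby "-J-Xmx2000m" "--1.9" tej.rb```
require 'rubygems'
require 'nokogiri'
require 'csv'

# Parse the HTML file; the block form closes the file handle automatically.
doc = File.open("/tmp/preview.html") { |f| Nokogiri::HTML(f) }

# Pass the options without braces: newer Ruby versions reject a positional options hash here.
# The block form closes the CSV file once all rows are written.
CSV.open("/tmp/output.csv", 'w', :col_sep => ",", :quote_char => '\'', :force_quotes => true) do |csv|
  # Note: this XPath only matches tables whose markup contains an explicit <tbody>.
  #doc.xpath('//table/tbody/tr').take(10).each do |row|
  doc.xpath('//table/tbody/tr').each do |row|
    # One CSV row per <tr>, one field per <td>.
    csv << row.xpath('td').map { |cell| cell.text }
  end
end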
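
The quoting options used above (`:quote_char => '\''` together with `:force_quotes => true`) change what ends up in the output file, so a small stand-alone sketch may help show their effect. It uses only the standard `csv` library; the sample row is made up for illustration and stands in for the cell texts of one parsed `<tr>`.

```ruby
require 'csv'

# Hypothetical cell texts, standing in for one parsed <tr>.
row = ["Widget", "it's 5, not 6", "42"]

# Same options as the script: comma separator, single-quote as the quote
# character, and every field quoted whether it needs it or not.
line = CSV.generate_line(row, :col_sep => ",", :quote_char => "'", :force_quotes => true)
puts line
# prints 'Widget','it''s 5, not 6','42'

# A reader must use the same quote_char to recover the original fields.
parsed = CSV.parse_line(line, :col_sep => ",", :quote_char => "'")
puts parsed.inspect
```

An apostrophe inside a cell is doubled on output, so any tool that later reads the file needs to be told that the quote character is `'` rather than the usual `"`.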
@isorsa
isorsa commented Jan 20, 2016

Thanks!

@mahendhar9
Many Thanks!

@shrishti01
Hi, can anyone provide a script that converts a full HTML file to a CSV file, including both tables and text?

@debazav
debazav commented May 18, 2018

THANKS!!

@phucnx190902
So good!!
