@finsterthecat
Created May 25, 2011 20:53
Simple REXML-SAX stream listener example
require 'rubygems'
require 'rexml/document'
require 'rexml/streamlistener'
include REXML
class NaicsSynonymListener
  include StreamListener

  # f: path of the SQL script to generate.
  def initialize(f)
    @outfile = File.open(f, 'w')
    # Template for one English and one French synonym row. $naics$, $desc_en$
    # and $desc_fr$ are placeholders that tag_end fills in. The values are
    # listed in the same order as the column list.
    @sql = <<~SQL
      insert into business_type_syn (business_type_syn_id, business_type_id, locale, description)
      values (business_type_syn.nextval,
      (select business_type_id from business_type where naics_code = '$naics$'),
      'EN',
      '$desc_en$');
      insert into business_type_syn (business_type_syn_id, business_type_id, locale, description)
      values (business_type_syn.nextval,
      (select business_type_id from business_type where naics_code = '$naics$'),
      'FR',
      '$desc_fr$');
    SQL
    @count = 0
  end

  # A new record element resets the collected fields and bumps the row counter.
  def tag_start(name, attrs)
    case name
    when "AlphaIndex_x002F_IndexAlpha"
      @naics = @desc_en = @desc_fr = nil
      @count += 1
    end
  end

  # Escape single quotes for use inside a SQL string literal.
  def converted_text
    @text.gsub("'", "''")
  end
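
  def tag_end(name)
    case name
    when "AlphaIndex_x002F_IndexAlpha"
      # Block-form gsub inserts each value literally, so a backslash in a
      # description is not interpreted as a backreference.
      statement = @sql.gsub('$naics$') { @naics }
                      .gsub('$desc_en$') { @desc_en }
                      .gsub('$desc_fr$') { @desc_fr }
      @outfile.puts statement
      # Commit in batches of 100 records.
      @outfile.puts "\ncommit;\n\n" if @count % 100 == 0
    when "NAICS2007Coding"
      @naics = @text
    when "ENGLISH_DESCRIPTION"
      @desc_en = converted_text
    when "FRENCH_DESCRIPTION"
      @desc_fr = converted_text
    end
  end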
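
  # Remember the most recent text node; tag_end assigns it to a field.
  def text(text)
    @text = text
  end

  # Flush and close the generated SQL file.
  def close
    @outfile.close
  end
end

if __FILE__ == $0
  listener = NaicsSynonymListener.new('../data/apply_naics_synonym_load.sql')
  begin
    File.open("../data/naics_synonym.xml") do |xml|
      Parsers::StreamParser.new(xml, listener).parse
    end
  ensure
    listener.close
  end
end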
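
To try the same tag_start / text / tag_end pattern without the NAICS data files, here is a minimal sketch that runs against an in-memory XML string. The `PairCollector` class, the `item`/`code`/`desc` element names, and the sample rows are all hypothetical and not part of the gist. The text buffer that accumulates across `text` callbacks is a defensive choice of this sketch, not something the listener above does.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: collects [code, description] pairs from <item> elements.
class PairCollector
  include REXML::StreamListener

  attr_reader :pairs

  def initialize
    @pairs = []
    @buffer = +''
  end

  # Every opening tag starts a fresh text buffer; <item> also starts a new record.
  def tag_start(name, _attrs)
    @buffer = +''
    @current = {} if name == 'item'
  end

  # Append rather than overwrite, in case text arrives in more than one piece.
  def text(text)
    @buffer << text
  end

  def tag_end(name)
    case name
    when 'code' then @current[:code] = @buffer.strip
    when 'desc' then @current[:desc] = @buffer.strip
    when 'item' then @pairs << [@current[:code], @current[:desc]]
    end
  end
end

# Illustrative sample input only.
xml = <<~XML
  <items>
    <item><code>111110</code><desc>Soybean farming</desc></item>
    <item><code>111120</code><desc>Oilseed (except soybean) farming</desc></item>
  </items>
XML

collector = PairCollector.new
REXML::Parsers::StreamParser.new(xml, collector).parse
collector.pairs.each { |code, desc| puts "#{code}: #{desc}" }
```

Collecting into an array instead of writing SQL keeps the sketch easy to inspect; swapping the `item` branch for a `puts` of a filled-in template gives the shape used in the listener above.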