@bootstraponline
Last active December 20, 2015 01:09
XML to txt
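# gem install nokogiri escape_utils
require 'nokogiri'
require 'escape_utils'

# `file` is assumed to be a File object for the XML input (file.path is used below)
text = +''
File.open(file) do |io|
  # stream the document and collect the value of every node that has one
  Nokogiri::XML::Reader(io).each do |n|
    text << ' ' << n.value << ' ' if n.value?
  end
end
# double unescape (handles doubly-escaped entities such as &amp;gt;)
text = EscapeUtils.unescape_html text
text = EscapeUtils.unescape_html text
# insert a newline before each run of '>' (e.g. >>>)
text.gsub!(/([^>])(>+)/, "\\1\n\\2")
text.gsub!(/ +/, ' ') # replace multi-space with single space
File.open(file.path + '.txt', 'w') { |f| f.write text }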
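The clean-up steps above (double unescape, newline before `>` runs, space collapsing) can be tried on an in-memory string without any XML input. This is only a sketch: `clean_text` and the sample string are hypothetical names introduced for illustration, and it swaps `EscapeUtils.unescape_html` for `CGI.unescapeHTML` from the standard library so that it runs without installing gems.

```ruby
require 'cgi'

# Hypothetical helper (not part of the gist): applies the same
# post-processing steps to a string instead of a parsed XML file.
def clean_text(raw)
  text = raw.dup
  # Unescape twice so doubly-escaped entities (&amp;gt;) become literal characters.
  # Assumption: CGI.unescapeHTML stands in for EscapeUtils.unescape_html here.
  2.times { text = CGI.unescapeHTML(text) }
  # Insert a newline before each run of '>' that follows a non-'>' character.
  text.gsub!(/([^>])(>+)/, "\\1\n\\2")
  # Collapse runs of spaces into a single space.
  text.gsub!(/ +/, ' ')
  text
end

sample = 'first  reply &amp;gt;&amp;gt;&amp;gt; quoted   text'
puts clean_text(sample)
```

Note that the space in front of the `>>>` run is kept, so the first output line ends with a trailing space; add a `strip` or an extra `gsub` if that matters for your data.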