Created May 7, 2020 00:04
How to fix "invalid byte sequence in UTF-8" errors when uploading data to ActionNetwork.org
outty_boy = File.open("output.csv", "w")
File.open("#{ARGV[0]}").each do |row|
  row.encode!("UTF-8", row.encoding, undef: :replace, invalid: :replace, replace: "")
  outty_boy << row
end
outty_boy.close
Sometimes when you're loading email data into ActionNetwork.org, the data contains invalid UTF-8 byte sequences for whatever reason. This Ruby script strips that nuisance data out of a single-column CSV file so the upload succeeds. It reads the file named by the first command-line argument and writes the cleaned rows to output.csv.
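To make "invalid byte sequence" concrete, here is a minimal sketch of the problem and one way to clean a single line. The sample address and the helper name `scrub_line` are hypothetical, and `String#scrub` (available since Ruby 2.1) is used here as an alternative to the `encode!` call in the script, not as a description of what the script does.

```ruby
# Hypothetical helper: treat the bytes as UTF-8 and drop any invalid sequences.
def scrub_line(line)
  # dup so a frozen or shared string is not modified in place
  utf8 = line.dup.force_encoding(Encoding::UTF_8)
  # String#scrub replaces each invalid byte sequence; "" removes it entirely
  utf8.scrub("")
end

# Made-up sample: "caf" followed by a lone 0xE9 byte (Latin-1 "é"),
# which is not a valid UTF-8 sequence on its own.
dirty = "caf\xE9@example.com"

puts dirty.dup.force_encoding(Encoding::UTF_8).valid_encoding?
puts scrub_line(dirty).valid_encoding?
puts scrub_line(dirty)
```

Checking `valid_encoding?` before and after is a quick way to confirm whether a given file actually contains the bytes that cause the upload error.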
I haven't adapted it to multi-column data, but the CSV standard library would likely take care of the heavy lifting.
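As a rough sketch of what that multi-column version might look like, the following reads the whole file as raw bytes, drops invalid UTF-8 sequences, and then lets `CSV` parse and rewrite the rows. The method name `clean_csv` and its two path arguments are assumptions for illustration, and the sketch assumes the file is small enough to hold in memory and is otherwise well-formed CSV.

```ruby
require "csv"

# Hypothetical helper: clean invalid UTF-8 out of a multi-column CSV file.
def clean_csv(input_path, output_path)
  # Read raw bytes so Ruby does not raise on invalid sequences while reading.
  raw = File.binread(input_path).force_encoding(Encoding::UTF_8)

  # Drop every invalid byte sequence (String#scrub, Ruby 2.1+).
  cleaned = raw.scrub("")

  # Parse the cleaned text and write each row back out; CSV handles
  # quoting of fields that contain commas or quotes.
  CSV.open(output_path, "w") do |out|
    CSV.parse(cleaned) do |row|
      out << row
    end
  end
end

# Example call, mirroring the original script's arguments:
#   clean_csv(ARGV[0], "output.csv")
```

Scrubbing before parsing keeps the CSV step simple, since the parser only ever sees valid UTF-8. If the file has broken quoting in addition to bad bytes, `CSV.parse` may still raise `CSV::MalformedCSVError`, which this sketch does not handle.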