@christophermlne
Created June 27, 2018 01:03
Ruby function to find a valid encoding for a text file
def open_text_file_with_valid_encoding!(file_path, encoding = 'UTF-8')
  # Tries to find a valid encoding for a text file, trying each candidate in
  # order, and returns the file contents converted to UTF-8.
  # ISO-8859-1 accepts any byte sequence, so it serves as the last resort.
  encodings = ['UTF-8', 'ISO-8859-1']
  contents = File.open(file_path, "r:#{encoding}") { |file| file.read }
  unless contents.valid_encoding?
    # index is nil when the caller passes an encoding outside the list.
    index = encodings.index(encoding)
    if index && (next_encoding = encodings[index + 1])
      puts "Warning: Encoding #{encoding} for #{file_path} is not valid."
      # Return the retry's result; otherwise the invalid contents fall through.
      return open_text_file_with_valid_encoding!(file_path, next_encoding)
    else
      raise "No valid text encodings found for #{file_path}"
    end
  end
  contents
    .force_encoding(encoding)
    .encode!('UTF-8', invalid: :replace, replace: '')
end
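
The fallback idea can be tried without a file on disk. The following is a minimal sketch, assuming a hypothetical helper `decode_with_fallback` (not part of the gist) that applies the same `valid_encoding?` check to an in-memory byte string. The sample bytes and the default encoding list are illustrative assumptions.

```ruby
# Hypothetical helper (not part of the gist): applies the same fallback idea
# to an in-memory byte string instead of a file.
def decode_with_fallback(bytes, encodings = %w[UTF-8 ISO-8859-1])
  encodings.each do |name|
    # dup so the caller's string is not relabeled in place.
    candidate = bytes.dup.force_encoding(name)
    # valid_encoding? reports whether the bytes are well-formed in `name`.
    return candidate.encode('UTF-8') if candidate.valid_encoding?
  end
  raise ArgumentError, 'No valid text encoding found'
end

utf8_bytes   = "caf\u00E9".b  # e-acute stored as the two bytes C3 A9
latin1_bytes = "caf\xE9".b    # e-acute stored as the single byte E9

puts decode_with_fallback(utf8_bytes)    # accepted on the first attempt
puts decode_with_fallback(latin1_bytes)  # rejected as UTF-8, decoded as ISO-8859-1
```

Because ISO-8859-1 maps every byte to a character, placing it last means the loop always finds a match for this list. The trade-off is that text in some other single-byte encoding would be decoded without error but possibly with the wrong characters.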