@fabrizioc1
Created November 21, 2011 17:41
Fixing invalid UTF-8 characters
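# first method
# Requires iconv (standard library in Ruby 1.8/1.9, a separate gem since Ruby 2.0).
# is_utf8? is not core Ruby; it is assumed to be defined elsewhere.
require 'iconv'

class String
  def enforce_utf8(from = nil)
    # Already valid UTF-8: return unchanged. Otherwise convert from the given encoding.
    self.is_utf8? ? self : Iconv.iconv('UTF-8', from, self).first
  rescue
    # Conversion failed (or no source encoding was given): keep only code points below 127.
    converter = Iconv.new('UTF-8//IGNORE//TRANSLIT', 'ASCII//IGNORE//TRANSLIT')
    converter.iconv(self).unpack('U*').select { |cp| cp < 127 }.pack('U*')
  end
end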
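Iconv was deprecated in Ruby 1.9.3 and is no longer part of the standard library as of Ruby 2.0, so the first method may not run unchanged on a current interpreter. The following is a rough sketch of the same idea using only core String methods. The name `enforce_utf8_sketch` and its standalone-function form are assumptions made for illustration, not part of the original gist, and the fallback keeps all 7-bit ASCII bytes rather than reproducing the Iconv behaviour exactly.

```ruby
# Hypothetical stand-in for enforce_utf8 that avoids Iconv (assumes Ruby 2.0+).
def enforce_utf8_sketch(str, from = nil)
  # Tag a copy as UTF-8 so valid_encoding? checks the raw bytes as UTF-8.
  utf8 = str.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  if from
    begin
      # Reinterpret the bytes in the declared source encoding, then transcode.
      return str.dup.force_encoding(from).encode(Encoding::UTF_8)
    rescue EncodingError, ArgumentError
      # Unknown encoding name or unconvertible bytes: fall through to the fallback.
    end
  end

  # Last resort: keep only 7-bit ASCII bytes, which are always valid UTF-8.
  str.b.each_byte.select { |b| b < 128 }.pack('C*').force_encoding(Encoding::UTF_8)
end
```

For example, `enforce_utf8_sketch(latin1_bytes, 'ISO-8859-1')` would transcode Latin-1 input, while calling it without `from` on the same bytes would fall back to stripping the non-ASCII bytes.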
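# second method
# Convert UTF-8 to UTF-8 with //IGNORE so that invalid byte sequences are dropped.
# The trailing space is appended and then removed again, presumably to work around
# iconv builds that do not discard an invalid byte at the very end of the input.
ic = Iconv.new('UTF-8//IGNORE', 'UTF-8')
valid_string = ic.iconv(untrusted_string + ' ')[0..-2]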
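On Ruby 2.1 and later, the second method can probably be replaced by the core `String#scrub` method, which needs neither Iconv nor the trailing-space trick. The sketch below assumes the input bytes are meant to be UTF-8; the helper name `drop_invalid_utf8` is hypothetical and used only for illustration.

```ruby
# Hypothetical helper: remove (or replace) invalid UTF-8 byte sequences.
def drop_invalid_utf8(untrusted_string, replacement = '')
  # Tag a copy as UTF-8 first; scrub leaves binary-tagged strings unchanged.
  untrusted_string.dup.force_encoding(Encoding::UTF_8).scrub(replacement)
end

valid_string  = drop_invalid_utf8("abc\xFFdef")            # invalid byte removed
marked_string = drop_invalid_utf8("abc\xFFdef", "\uFFFD")  # invalid byte replaced by U+FFFD
```

Passing a visible replacement such as U+FFFD instead of an empty string makes it easier to notice later that data was lost during cleanup.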