@andynu
Created August 24, 2016 18:14
Encoding error details
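str = "ó bhí mé óg, thaitin leabhair liom"
bytes = str.bytes  # raw UTF-8 bytes of the original text
# Mojibake characters seen in the broken output; find an encoding that yields all of them.
targets = ["贸", "铆", "茅"]
p :original, str
# dup first: force_encoding relabels the receiver in place and would otherwise alter str.
w = str.dup.force_encoding('ISO-8859-1')
#p :iso88591, w
#p :utf8,w.encode("UTF-8")
#p :utf16,w.encode("UTF-16")
# 'C*' packs each value as an unsigned byte (0..255).
p bytes.pack('C*').force_encoding('UTF-8')
# Reinterpret the same bytes under every encoding name Ruby knows and
# report the ones that reproduce the mojibake.
Encoding.name_list.each do |enc|
  # Dummy encodings and invalid byte sequences raise; treat those as "no match".
  encoded = bytes.pack('C*').force_encoding(enc).encode('UTF-8') rescue ""
  if targets.all? { |t| encoded.include?(t) }
    p [encoded, enc]
  end
end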
:original
"ó bhí mé óg, thaitin leabhair liom"
"ó bhí mé óg, thaitin leabhair liom"
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "GB2312"]
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "GB18030"]
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "GBK"]
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "EUC-CN"]
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "eucCN"]
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "CP936"]
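The output above suggests the text was written as UTF-8 and then read as GBK (or one of the other listed names). If so, the damage should be reversible: encode the mojibake back to GBK to recover the original bytes, then relabel them as UTF-8. A minimal sketch follows. The `repair_mojibake` helper is hypothetical and not part of the original script, and it assumes every garbled character survives the round trip:

```ruby
# Hypothetical helper (not in the original script): undo a "UTF-8 read as
# <wrong_encoding>" mix-up by re-encoding to recover the raw bytes, then
# relabelling those bytes as UTF-8.
def repair_mojibake(text, wrong_encoding)
  text.encode(wrong_encoding).force_encoding(Encoding::UTF_8)
end

mojibake = "贸 bh铆 m茅 贸g, thaitin leabhair liom"
repaired = repair_mojibake(mojibake, "GBK")

p repaired
p repaired.valid_encoding?
```

If the wrong encoding cannot represent some character in the garbled text, `encode` raises `Encoding::UndefinedConversionError`, which signals that the guess was wrong or that bytes were lost along the way.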