Created August 24, 2016 18:14
Encoding error details
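# The correctly encoded UTF-8 text (Irish).
str = "ó bhí mé óg, thaitin leabhair liom"
# Capture the raw UTF-8 bytes before any encoding label is changed.
bytes = str.bytes
# Garbled characters seen in the mojibake; the culprit encoding must produce all of them.
targets = ["贸","铆","茅","贸"].uniq
p :original, str
# dup first: force_encoding relabels the receiver in place and would otherwise alter str.
w = str.dup.force_encoding('ISO-8859-1')
#p :iso88591, w
#p :utf8,w.encode("UTF-8")
#p :utf16,w.encode("UTF-16")
# Sanity check: the bytes read back as UTF-8 give the original string.
p bytes.pack('C*').force_encoding('UTF-8')
# Reinterpret the same bytes in every encoding Ruby knows, transcode to UTF-8,
# and report the encodings whose result contains every target character.
Encoding.name_list.each do |enc|
  # Encodings that cannot decode or transcode these bytes raise; treat them as no match.
  encoded = bytes.pack('C*').force_encoding(enc).encode('UTF-8') rescue ""
  if targets.all?{|t| encoded.include?(t)}
    p [encoded, enc]
  end
end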

:original
"ó bhí mé óg, thaitin leabhair liom"
"ó bhí mé óg, thaitin leabhair liom"
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "GB2312"]
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "GB18030"]
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "GBK"]
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "EUC-CN"]
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "eucCN"]
["贸 bh铆 m茅 贸g, thaitin leabhair liom", "CP936"]
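
The search above only identifies which encodings turn the UTF-8 bytes into the garbled text. Once a culprit is known, the damage can usually be undone by running the steps in reverse: encode the garbled string back into the wrong encoding to recover the original bytes, then relabel those bytes as UTF-8. The following is a minimal sketch of that idea, not part of the original gist; the method name repair_mojibake and its parameters are hypothetical, and it assumes the garbled text survived intact (no characters were dropped or replaced along the way).

```ruby
# Hypothetical helper (not in the original gist): undo mojibake caused by
# UTF-8 bytes having been decoded with the wrong encoding.
def repair_mojibake(garbled, wrong_encoding)
  # Encoding back into the wrong encoding recovers the original byte sequence.
  raw = garbled.encode(wrong_encoding)
  # Those bytes were UTF-8 all along, so only the label needs to change.
  repaired = raw.force_encoding('UTF-8')
  # If the bytes are not valid UTF-8, the guess was wrong or data was lost.
  raise ArgumentError, "bytes are not valid UTF-8" unless repaired.valid_encoding?
  repaired
end

puts repair_mojibake("贸 bh铆 m茅 贸g, thaitin leabhair liom", "GBK")
```

Any of the encodings reported above should work as the second argument here, since they all decode these particular bytes to the same characters.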