@jarib
Created April 16, 2009 04:46
UTF-8
$ echo æåø | xxd
0000000: c3a6 c3a5 c3b8 0a .......
$ ruby -r stringio -e 's=StringIO.new; s.puts("æåø"); puts s.string' | xxd
0000000: c3a6 c3a5 c3b8 0a .......
StringIO#puts, file.encoding=MacRoman (default) - no idea what the encoding is here
$ jruby -r stringio -e 's=StringIO.new; s.puts("æåø"); puts s.string' | xxd
0000000: 3f3f c00a ??..
StringIO#puts, file.encoding=UTF-8 - the UTF-8 bytes of the original string have been misread as ISO-8859-1 and re-encoded as UTF-8 (double encoding; compare the iconv output below)
$ jruby -J-Dfile.encoding=utf8 -r stringio -e 's=StringIO.new; s.puts("æåø"); puts s.string' | xxd
0000000: c383 c2a6 c383 c2a5 c383 c2b8 0a .............
$ echo æåø | iconv -f iso-8859-1 -t utf-8 | xxd
0000000: c383 c2a6 c383 c2a5 c383 c2b8 0a .............
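The two matching dumps above are what double encoding looks like: each byte of the UTF-8 string is treated as an ISO-8859-1 character and then encoded to UTF-8 again. A minimal sketch of that effect, assuming Ruby 1.9 or later (String#force_encoding and String#encode did not exist in the 1.8-era rubies this gist was likely run on); the `hex` helper is hypothetical and only mimics xxd's grouping:

```ruby
# Hypothetical helper: hex dump of a string's bytes in xxd-style groups of two bytes.
def hex(str)
  str.unpack("H*").first.scan(/.{1,4}/).join(" ")
end

# The three characters ae-ligature, a-ring, o-slash, written as escapes so the
# literal is UTF-8 regardless of the source file's encoding.
original = "\u00e6\u00e5\u00f8"

# Relabel the UTF-8 bytes as ISO-8859-1 (no bytes change), then transcode to UTF-8.
double_encoded = original.dup.force_encoding("ISO-8859-1").encode("UTF-8")

puts hex(original)
puts hex(double_encoded)
```

The second line printed should line up with the jruby and iconv dumps, minus the trailing 0a newline.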
StringIO#<<, file.encoding=MacRoman (default) - output is MacRoman
$ jruby -r stringio -e 's=StringIO.new; s << "æåø"; puts s.string' | xxd
0000000: be8c bf0a ....
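The bytes be 8c bf are the MacRoman code points for æ, å and ø. A short sketch that reproduces them by transcoding explicitly, assuming Ruby 1.9 or later and that its "macRoman" transcoder is available (an assumption about the Ruby build, not something shown in the gist):

```ruby
# The same three characters as above, as UTF-8 via escapes.
utf8 = "\u00e6\u00e5\u00f8"

# Transcode from UTF-8 to MacRoman: one byte per character.
mac_roman = utf8.encode("macRoman")

# Hex string of the MacRoman bytes, for comparison with the xxd dump.
mac_roman_hex = mac_roman.unpack("H*").first
puts mac_roman_hex
```

Transcoding the result back with `mac_roman.encode("UTF-8")` should return the original string, which is a quick way to confirm that a dump is MacRoman rather than some other single-byte encoding.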
StringIO#<<, file.encoding=utf8 - output is correct UTF-8
$ jruby -J-Dfile.encoding=utf8 -r stringio -e 's=StringIO.new; s << "æåø"; puts s.string' | xxd
0000000: c3a6 c3a5 c3b8 0a .......