
@e2
Created April 23, 2010 16:00
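# encoding: utf-8

# Concatenate 'テ' encoded as +a+ with 'テ' encoded as +b+, then report the
# result as UTF-8, or "FAILED!" if Ruby rejects the concatenation.
def test(a, b)
  begin
    result = ('テ'.encode(a) << 'テ'.encode(b)).encode('utf-8')
  rescue Encoding::CompatibilityError
    result = "FAILED!"
  end
  p "Example: #{a} + #{b}: #{result}"
end

# These work: both operands share an encoding.
test('utf-8', 'utf-8')
test('euc-jp', 'euc-jp')

# These fail: the operands have incompatible encodings.
test('utf-8', 'euc-jp')
test('euc-jp', 'utf-8')
test('euc-jp', 'sjis')
test('sjis', 'euc-jp')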

Using multiple encodings in one application

Multiple encodings should not be used simultaneously in Rails.

Example problem:

We have two HTML partial templates, each with the correct Ruby 1.9 encoding magic comment:

  • one in Shift-JIS
  • one in EUC-JP

and a layout file with UTF-8 content-type.

Rendering both partials into this layout fails, and nothing fixes it transparently: Ruby 1.9 considers the encodings incompatible and raises Encoding::CompatibilityError on concatenation unless each string is first converted to a common encoding.

example.rb shows what works and what doesn’t.
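To check compatibility without triggering the exception, Ruby provides Encoding.compatible?. The following is a small sketch, separate from example.rb, and the variable names are only illustrative. It also shows that strings containing only ASCII characters can still be mixed across ASCII-compatible encodings, which is why the problem tends to surface only once non-ASCII content appears.

```ruby
# encoding: utf-8

# Illustrative strings; these names are not taken from example.rb.
sjis  = 'テ'.encode('Shift_JIS')
eucjp = 'テ'.encode('EUC-JP')
ascii = 'abc'.encode('EUC-JP') # ASCII-only content

# Encoding.compatible? returns the encoding a concatenation would have,
# or nil when the two strings cannot be concatenated.
p Encoding.compatible?(sjis, eucjp) # nil: both hold non-ASCII characters
p Encoding.compatible?(sjis, ascii) # an Encoding: ASCII-only strings mix freely

# The actual concatenation raises for the incompatible pair.
begin
  sjis + eucjp
rescue Encoding::CompatibilityError => e
  puts "raised: #{e.class}"
end
```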

There are two possible solutions:

  1. Encode strings to a common encoding just before concatenation.
  2. Do not allow strings in incompatible character sets in Rails (where concatenation can occur).

Solution 1:

- error occurs too far from the source

- performance hit when rendering templates

- environment setup problems are hidden

+ easy workaround in output_safety (for templates)
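Solution 1 could look roughly like the sketch below. concat_in_common_encoding is a hypothetical helper written for illustration; it is not a Rails API and not the actual output_safety code. It assumes every fragment is correctly labelled with its real encoding.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of Rails): transcode every fragment to one
# target encoding right before joining them.
def concat_in_common_encoding(fragments, target = Encoding::UTF_8)
  fragments.map { |s| s.encode(target) }.join
end

header = 'テ'.encode('Shift_JIS') # stands in for the Shift-JIS partial
body   = 'テ'.encode('EUC-JP')    # stands in for the EUC-JP partial

page = concat_in_common_encoding([header, body])
puts page.encoding
puts page
```

Every fragment is transcoded on every call, which is the performance hit listed above, and a mislabelled string is only noticed here, at concatenation time, rather than where it entered the application.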

Solution 2:

- more work than solution 1

+ proper solution

This is why I believe the policy should be to attempt to convert all strings from external sources to a common encoding, for example Encoding.default_internal (or Encoding.default_external if default_internal is nil).
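A minimal sketch of that policy follows. normalize_external is a hypothetical name, not an existing Rails or Ruby method, and the sketch assumes the caller knows the encoding the external data was written in. Encoding.default_internal is set explicitly here only so the example behaves the same in any environment; an application would configure it once at boot.

```ruby
# encoding: utf-8

# Set for this sketch only; normally configured once at application boot.
Encoding.default_internal = Encoding::UTF_8

# Hypothetical helper (not part of Rails): convert a string from an external
# source to the application's common encoding.
def normalize_external(str, source_encoding)
  target = Encoding.default_internal || Encoding.default_external
  # Label the raw bytes with their declared encoding, then transcode.
  str.dup.force_encoding(source_encoding).encode(target)
end

# Simulate bytes read from a file or socket with no encoding information.
raw  = 'テ'.encode('Shift_JIS').force_encoding('BINARY')
text = normalize_external(raw, 'Shift_JIS')

puts text.encoding
puts text
```

With this approach, bytes that are invalid in the declared source encoding raise Encoding::InvalidByteSequenceError at the point where the data enters the application, instead of surfacing later during template rendering.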
