@revskill10
Forked from phoet/char_converter.rb
Created February 25, 2013 08:00
# encoding: UTF-8
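# Iconv was removed from the standard library in Ruby 2.0;
# on newer Rubies it has to be installed as the iconv gem.
require 'iconv'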
require 'uri'
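module Support
  # Rack middleware that repairs Windows-1252 bytes and non-standard %uXXXX
  # escapes in selected env entries before the app sees them.
  class CharConverter
    # Non-standard %uXXXX escapes and the characters they are replaced with.
    BAD_GA_ENCODINGS = {
      "%u00DF" => "ß",
      "%u00E4" => "ä",
      "%u00F6" => "ö",
      "%u00FC" => "ü",
      # U+FFFD is the replacement character, so the original character is
      # already lost; "ä" is only a guess.
      "%uFFFD" => "ä"
    }
    CONVERTER = Iconv.new('UTF-8', 'windows-1252')

    def initialize(app)
      @app = app
    end

    def call(env)
      @app.call(self.class.kill_windows_encoding(env))
    end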

    # Rewrites the affected env entries in place and returns env.
    def self.kill_windows_encoding(env)
      ["QUERY_STRING", "REQUEST_URI", "HTTP_REFERER", "HTTP_COOKIE"].each do |key|
        next unless value = env[key]
        value = kill_windows_encoding_for_string(value)
        value = kill_ga_windows_encoding_for_string(value)
        env[key] = value
      end
      env
    end

    def self.kill_ga_windows_encoding_for_string(string)
      # bad cookie encodings break Rack, see https://github.com/rack/rack/issues/225
      BAD_GA_ENCODINGS.inject(string) { |s, (bad, good)| s.gsub(bad, good) }
    end

    # Note: URI.decode and URI.encode were removed in Ruby 3.0, so this
    # method only runs as written on older Rubies.
    def self.kill_windows_encoding_for_string(string)
      return nil unless string
      # decode URI from %E4 to \xE4
      decoded = URI.decode(string)
      # tag the decoded bytes as UTF-8 and check whether they are valid
      unless decoded.force_encoding('UTF-8').valid_encoding?
        begin
          # wild guess: invalid UTF-8 means the bytes are Windows-1252
          string = URI.encode(CONVERTER.iconv(decoded))
        rescue => e
          # conversion failed: keep the original string and just print the error
          puts e
        end
      end
      string
    end
  end
end
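
The class above depends on Iconv and on URI.decode / URI.encode, none of which ship with current Ruby. As a rough sketch of the same repair step using only core String methods, something like the following could work. The module name `EncodingRepairSketch` and the method `repair` are hypothetical and not part of the original gist. Unlike the original, this version decodes only the escapes for bytes >= 0x80, on the assumption that escaped ASCII such as %26 should stay escaped.

```ruby
# Hypothetical sketch, not part of the original gist.
module EncodingRepairSketch
  # Percent-escapes whose byte value is 0x80 or above.
  HIGH_ESCAPE = /%([89A-Fa-f]\h)/

  # Returns the string with Windows-1252 percent-escapes rewritten as UTF-8
  # percent-escapes, or the original string if the bytes are already valid UTF-8.
  def self.repair(string)
    return nil unless string

    # Work on a binary copy and decode only the high-byte escapes.
    decoded = string.b.gsub(HIGH_ESCAPE) { $1.hex.chr }
    return string if decoded.dup.force_encoding('UTF-8').valid_encoding?

    # Assumption (same guess as the original): invalid UTF-8 is Windows-1252.
    utf8 = decoded.encode('UTF-8', 'Windows-1252', invalid: :replace, undef: :replace)

    # Percent-encode every non-ASCII byte again.
    escaped = utf8.b.gsub(/[\x80-\xFF]/n) { |byte| format('%%%02X', byte.ord) }
    escaped.force_encoding('UTF-8')
  end
end

puts EncodingRepairSketch.repair("q=%E4")     # Windows-1252 "ä" becomes q=%C3%A4
puts EncodingRepairSketch.repair("q=%C3%A4")  # already UTF-8, returned unchanged
```

Inside the middleware, a method like this could stand in for `kill_windows_encoding_for_string`; the %uXXXX replacement step is independent of it and would stay as it is.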