Skip to content

Instantly share code, notes, and snippets.

@bf4
Created July 8, 2014 15:58
Show Gist options
  • Save bf4/d26259acfa29f3b9882b to your computer and use it in GitHub Desktop.
Save bf4/d26259acfa29f3b9882b to your computer and use it in GitHub Desktop.
invalid %-encoding error in application for malformed uri

Question re: should this be rack's job

# config.ru
run ->(env) {
  [ 200, { 'Content-Type' => 'text/html', }, [Rack::Request.new(env).params.inspect] ]
}

rackup

open http://localhost:9292/?utf8=%E2%9C%93&search=&page=2&commit=%u00ABscript%u00BBalert(209);%u00AB/script%u00BB

ArgumentError at /
invalid %-encoding (%u00ABscript%u00BBalert(209))

Ruby	$HOME/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/uri/common.rb: in decode_www_form_component, line 898
Web	GET localhost/

Would you argue that rackup is not an acceptable app server, then? If not, would you accept a PR to rescue the Argument error in rack, warn that it's the app server's job to fix the issue, and re-raise it?


edit: I forgot posted a similar comment in rack/rack#323 (comment)

require 'rack'
require 'logger'
class ExceptionApp
DEFAULT_CONTENT_TYPE = 'text/html'
DEFAULT_CHARSET = 'utf-8'
attr_reader :logger
def initialize(app, stdout=STDOUT)
@app = app
@logger = defined?(Rails.logger) ? Rails.logger : Logger.new(stdout)
end
def call(env)
begin
# calling env.dup here prevents bad things from happening
request = Rack::Request.new(env.dup)
# calling request.params is sufficient to trigger the error
# see https://github.com/rack/rack/issues/337#issuecomment-46453404
request.params
@app.call(env)
rescue ArgumentError => e
raise unless e.message =~ /invalid %-encoding/
message = "BAD REQUEST: Returning 400 due to #{e.message} from request with env #{request.inspect}"
logger.info message
content_type = env['HTTP_ACCEPT'] || DEFAULT_CONTENT_TYPE
status = 400
body = "Bad Request"
return [
status,
{
'Content-Type' => "#{content_type}; charset=#{DEFAULT_CHARSET}",
'Content-Length' => body.bytesize.to_s
},
[body]
]
end
end
end
require_relative 'exception_app'
describe ExceptionApp do
let(:app) do
ExceptionApp.new( ->(env){ [418, {}, ["I'm a teapot."]] }, '/dev/null')
end
let(:bad_param) { "commit=%u00ABscript%u00BBalert(209);%u00AB/script%u00BB" }
let(:env) do
{
"REQUEST_METHOD"=>"GET",
"QUERY_STRING"=> bad_param,
"REQUEST_URI"=>"/?#{bad_param}",
"PATH_INFO"=>"/",
"HTTP_ACCEPT"=>"application/json",
"rack.input" => ""
}
end
it "returns an early 'Bad Request' when given params with a bad encoding" do
status, headers, body = app.call(env)
expect(body.join).to eq("Bad Request")
expect(status).to eq(400)
expect(headers["Content-Type"]).to eq("application/json; charset=utf-8")
expect(headers["Content-Length"]).to eq("11")
end
it "does nothing when no encoding error if found" do
status, headers, body = app.call(env.merge("QUERY_STRING" => "", "REQUEST_URI" => "/"))
expect(body.join).to eq("I'm a teapot.")
expect(status).to eq(418)
expect(headers["Content-Type"]).to be_nil
expect(headers["Content-Length"]).to be_nil
end
end
@pioz
Copy link

pioz commented Jul 13, 2014

@bf4 Can you create a pull request with this fix to https://github.com/whitequark/rack-utf8_sanitizer ?

@rajesh38
Copy link

I found this solution which is more compact and working well:

class Utf8Sanitizer
  SANITIZE_ENV_KEYS = %w(
    HTTP_REFERER
    PATH_INFO
    REQUEST_URI
    REQUEST_PATH
    QUERY_STRING
  )

  def initialize(app)
    @app = app
  end

  def call(env)
    SANITIZE_ENV_KEYS.each do |key|
      string = env[key].to_s
      valid = URI.decode(string).force_encoding('UTF-8').valid_encoding?
      # Don't accept requests with invalid byte sequence
      return [ 400, { }, [ 'Bad request' ] ] unless valid
    end

    @app.call(env)
  end
end

link: http://dev.mensfeld.pl/2014/03/rack-argument-error-invalid-byte-sequence-in-utf-8/

Tell me which one is better.

@bschaeffer
Copy link

@bf4 What are the bad things env.dup prevents from happening?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment