In Ruby 2.2.0dev, URI.parse has been changed so that it uses RFC 3986. This changes the semantics of URIs in some subtle ways. Probably most importantly, it means that square brackets ("[" and "]") and a few other characters should be percent-encoded, primarily in the query string.
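To make "percent-encoded" concrete, here is a small sketch using URI.encode_www_form from the standard library. Note that this is form encoding rather than a general RFC 3986 escaper (it turns spaces into "+", for instance), and the parameter names are simply taken from the example URL used in this write-up.

```ruby
require 'uri'

# Build a query string from key/value pairs; the square brackets in the
# keys are emitted as %5B and %5D.
params = [["f[]", "status_id"], ["op[status_id]", "o"]]
encoded = URI.encode_www_form(params)

puts encoded
# → f%5B%5D=status_id&op%5Bstatus_id%5D=o
```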
Unfortunately the implementation on ruby-trunk doesn't provide any way to do that encoding. I raised a bug explaining this: https://bugs.ruby-lang.org/issues/9990 but here's a quick script to demonstrate the issue:
require 'uri'

url = "https://bugs.ruby-lang.org/projects/ruby-trunk/issues?set_filter=1&f[]=status_id&op[status_id]=o"
puts URI.encode(url)
URI.parse(URI.encode(url))
See https://gist.github.com/lengarvey/c1d17913f9ea95fd999c for the output of this code.
Currently, URI.escape still points to the DEFAULT_PARSER (which is no longer the default for most operations), which doesn't encode URIs with square brackets, and URI.parse won't accept those URIs because they aren't escaped properly.
The fixed parser above provides a naive, first-cut implementation of properly splitting, escaping and parsing URIs. First it attempts to use the existing RFC 3986 parsing implementation. If that fails, it performs a non-validating split of the URI, percent-encodes the query string and tries again. You can see a demonstration of this in irb_output.
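The parse, split, escape and retry flow described above can be sketched roughly as follows. This is not the code from the fixed parser: sketch_escape_query and lenient_parse are hypothetical names, and the set of characters left unescaped is my reading of the RFC 3986 query rule, so treat it as an assumption rather than a reference.

```ruby
require 'uri'

# Assumption: characters RFC 3986 allows in a query (unreserved, sub-delims,
# ":", "@", "/", "?"), plus "%" so that existing %XX escapes are left alone.
# A lone "%" is therefore NOT escaped by this sketch.
QUERY_UNSAFE = %r{[^A-Za-z0-9\-._~!$&'()*+,;=:@/?%]}

# Hypothetical helper: percent-encode every byte of each disallowed character.
def sketch_escape_query(query)
  query.gsub(QUERY_UNSAFE) do |char|
    char.bytes.map { |b| format('%%%02X', b) }.join
  end
end

# Hypothetical helper: try the stock parser first, then fall back.
def lenient_parse(url)
  URI.parse(url)
rescue URI::InvalidURIError
  # Non-validating split: the query is whatever sits between the first "?"
  # and the first "#".
  head, fragment = url.split('#', 2)
  base, query = head.split('?', 2)
  raise if query.nil? # nothing to escape, so re-raise the original error

  rebuilt = "#{base}?#{sketch_escape_query(query)}"
  rebuilt += "##{fragment}" if fragment
  URI.parse(rebuilt)
end

puts sketch_escape_query("f[]=status_id&op[status_id]=o")
# → f%5B%5D=status_id&op%5Bstatus_id%5D=o
```

Calling lenient_parse(url) with the example URL from the script returns a URI object either way. Whether the fallback branch actually runs depends on how strict the installed URI.parse is about the query string.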
I'm really bad at reading RFCs. They seem designed to be incomprehensible, so I'm certain I've missed some characters that should be percent-encoded. This code is just my first cut at fixing the issue in a way I think would work for most people. I'm not sure whether URI.escape could be pointed at #naive_escape, mostly because I don't understand the RFC in enough depth to tell, and I haven't written enough tests to be confident that it would work.