Problem solved by geniuses on IRC: check final comment for complete log. The cause of this bug is unexpected to say the least!
This is the code that isn't working:
Assume the input is "Hello world @123"
def memo_content(input, remote: false)
content = sanitize input
content = content.gsub(/@([a-z0-9A-Z]+)/) do |match|
binding.pry
link_to "@#{ $1 }", memo_path($1), class: "marker", remote: remote
end
content = simple_format content, { }, sanitize: false
end
When it gets to the pry binding, I check what the "match" and "$1" are holding:
[1] pry> match
=> "@W"
[2] pry> $1
=> nil
[3] pry> content
=> "Hello world @123"
However, when I remove the second line:
def memo_content(input, remote: false)
content = input.gsub(/@([a-z0-9A-Z]+)/) do |match|
binding.pry
link_to "@#{ $1 }", memo_path($1), class: "marker", remote: remote
...
end
It works jusst fine, and "$1" holds "123" as it should.
What exactly is happening here? Why does it say "$1" is nill when clearly there was a match?
God bless the ROR Freenode:
<wtf_gsub> Can someone help with this really strange gsub behavior? https://gist.github.com/CLUSTERfoo/6258984
you removed the line that said 'content = sanitize input' ?
wtf_gsub: what does sanitize(input) do?
what does sanitize do?
<wtf_gsub> to that string, nothing. http://api.rubyonrails.org/classes/ActionView/Helpers/SanitizeHelper.html#method-i-sanitize
<wtf_gsub> if I check input == content, I get "true"
<wtf_gsub> it also correctly matched "@123"
<wtf_gsub> it's just giving me nil when I ask for $1, which makes no sense because it's part of the regex definition
wtf_gsub: put your binding.pry at the top of the method and step through it all
is $1 available in that block?
$1 is available everywhere
are you sure?
<wtf_gsub> if I do the second example (in which that block does not change), $1 returns "123" as it should
omarqureshi: from the docs: "In the block form, the current match string is passed in as a parameter, and variables such as $1,$2, $ `, $&, and $ ' will be set appropriately. "
omarqureshi: yes -- it's a global
but will they be set IN the block or after ?
omarqureshi: $1 is always in relation to the last matched regexp, so it's there in. plus wtf_gsub says it works without the sanitize
https://gist.github.com/omarqureshi/15640e9b71205fd65f54
is weird
no puts
huh.
<wtf_gsub> omarqureshi: I mean, there is literally 0 difference between the first example and the second, other than the string going through "sanitize" first, as far as I can tell. So if it works in the second it should work in the first. I'm a t a loss :( -- is this a ruby bug?
omarqureshi: you didn't set a matching group in that
ah, im stupid
wtf_gsub: did you do as i suggested?
"foo".gsub(/(o)/) {|match| puts "LOL"; puts match} works as expected
urgh
so
omarqureshi: so does puts $1, so wtf_gsub's problem is very weird
indeed
<wtf_gsub> pontiki: if I put binding.pry on top and then try sanitize it just returns the same string. sanitize isn't doing anything to that particular string
wtf_gsub: there needs to be more to this, simple calling sanitize couldn't cause that behavior :/
wtf_gsub: which ruby version?
wtf_gsub: step all the way through, testing it at each point, see if there's something else going on
2.0
<wtf_gsub> pontiki: I know :( I've copied and pasted straight from my vim and from terminal
i can't see why it wouldn't work
but as it doesn't, see where it it going wrong
pontiki: try rubinius and see if it still happens, you can step through the gsub source :)
wtf_gsub: ^
<wtf_gsub> pontiki: ruby 2.0.0 p247
eh?
pontiki: I meant wtf_gsub but addressed you, wtf_gsub meant omarqureshi but addressed you
why W matches is weird anyway
wtf_gsub: play around some more in pry and gist the bash log
<wtf_gsub> DouweM: ok, here is the log (first comment) https://gist.github.com/CLUSTERfoo/6258984
wtf_gsub: :/
you can have as many routes as you'd like -- they don't tie directly to any model
The Cause is Discovered:
wtf_gsub: rails/rails#1555
here's my question: why do they have to be separate routes?
no, really
pontiki: I'm not aware of any other reason
wtf_gsub: I started thinking about what sanitize actually does to a string, and then I though of html_safe which turns it into a SafeBuffer
wtf_gsub: https://www.ruby-forum.com/topic/198458
pontiki: about the gsub thing, looks like the problem is in $1 not being as global as we'd like. apparently it only lives for two frames, see https://www.ruby-forum.com/topic/198458
pontiki: they're also thread-local, which should give away their not-actually-being-globals
<wtf_gsub> DoueM !! oh my
wtf_gsub: the rails thread is an interesting read. apparently there is no fix for this, and it breaks a lot of stuff with html_safe strings
<wtf_gsub> Douwem: :( uh oh
DouweM: where are they wrapped in that construct? i'd suppose they'd have gone out of scope after the sanitize assuming they're used there (likely)
wtf_gsub: solution in your case: .to_str before the gsub, html_safe after
but i'd think they'd still be fully scoped correctly inside the gsub! block
pontiki: it's because of sanitize calling .html_safe on the string, turning it into a SafeBuffer, thus introducing the wrap around super: rails/rails#1555
fascinating
AH!
it's a differnt type!
that's what i was hoping to find
well, not me, personally
pontiki: yeah, that was my lightbulb. I started thinking about what sanitize actually did, and then I thought of html_safe
:)
nodnodnod
well done
bows, accepts applause
pontiki: but the worst thing is that there is apparently no fix for this. it's literally impossible to subclass String, override gsub and have it work properly
<wtf_gsub> Douwem: where does the to_str go? I'm affraid I'm not savvy enough to fully understand what's happening :/
A Solution
wtf_gsub: well, if you cal gsub after html_safe (which happens inside sanitize), all hell will break loose. if you run to_str after html_safe however, you're safe again. so. run the gsub on sanitize(input).to_str
wtf_gsub: but because you still want an html_safe version (so Rails will allow you to print the HTML verbatim), you have to html_safe after the gsub to restore the html safety
wtf_gsub: so content.to_str.gsub(......).html_safe
<wtf_gsub> DouweM: omg! it works, I don't have much of an idea why, but I'll read more about it. Thank you so much!
wtf_gsub: read the ruby-forum thread :)