Created
May 30, 2011 00:17
sanitize replaces & with &amp; on whitelisted elements
my processor:
OK_FLASH = /^http:\/\/(?:www\.)?youtube\.com\/v\/|^http:\/\/(?:\w+\.)?grooveshark\.com\/songWidget.swf/
# how to easily replace @ reload without reloading full server?
T_YOUTUBE = lambda do |env|
  node = env[:node]
  node_name = env[:node_name]
  parent = node.parent
  return nil unless (node_name == 'param' || node_name == 'embed') &&
    parent.name.to_s.downcase == 'object'
  if node_name == 'param'
    # Quick XPath search to find the <param> node that contains the video URL.
    return nil unless movie_node = parent.search('param[@name="movie"]')[0]
    url = movie_node['value']
  else
    # Since this is an <embed>, the video URL is in the "src" attribute. No
    # extra work needed.
    url = node['src']
  end
  #d "testing url #{url}"
  return nil unless url =~ OK_FLASH
  #d "allowing #{url}"
  # strip autoplays. hmm, removing p=1 could be dangerous?
  node['value'] = node['value'].gsub(/autoplay=1|p=1/i,"").gsub(/&amp;/,"&") if node['value']
  node['src'] = node['src'].gsub(/autoplay=1|p=1/i,"").gsub(/&amp;/,"&") if node['src']
  #d "value, src", node['value'], node['src']
  # We're now certain that this is a YouTube embed, but we still need to run
  # it through a special Sanitize step to ensure that no unwanted elements or
  # attributes that don't belong in a YouTube embed can sneak in.
  # flashvars
  # egads, starting to look more sensible to replace allowed elements to [grooveshark 1221111] again.
  Sanitize.clean_node!(parent, {
    :elements => ['embed', 'object', 'param'],
    :attributes => {
      'embed' => ['allowfullscreen', 'allowscriptaccess', 'height', 'src', 'type', 'width','flashvars','type'],
      'object' => ['height', 'width'],
      'param' => ['name', 'value']
    }
  })
  # so even though embed is disallowed, this will whitelist this node. need to pass node to have changes reflected
  # Now that we're sure that this is a valid YouTube embed and that there are
  # no unwanted elements or attributes hidden inside it, we can tell Sanitize
  # to whitelist the current node (<param> or <embed>) and its parent
  # (<object>).
  {:node_whitelist => [node, parent]}
end if !defined? T_YOUTUBE
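The worry about `p=1` in the comment above is well founded: the pattern `/autoplay=1|p=1/i` also matches inside longer parameters such as `step=1` or `p=10`. A minimal sketch of a stricter approach follows, assuming the value is a plain `key=value&key=value` string like the flashvars in this gist (not a full URL). The names `strip_autoplay` and `AUTOPLAY_KEYS` are hypothetical and not part of Sanitize.

```ruby
require 'uri'

# Hypothetical list of parameter names to drop (an assumption taken from
# the regex in the transformer above).
AUTOPLAY_KEYS = %w[autoplay p].freeze

# Removes autoplay-style parameters by exact key match instead of a
# substring match, so "step=1" or "p=10" are left alone.
def strip_autoplay(query)
  pairs = URI.decode_www_form(query)
  kept = pairs.reject do |key, value|
    AUTOPLAY_KEYS.include?(key.downcase) && value == '1'
  end
  URI.encode_www_form(kept)
end
```

Called on the flashvars from this gist, which carry `p=0`, the string should come back unchanged, while `strip_autoplay('a=1&p=1&step=1')` drops only the `p=1` pair.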
embed:
<object width="250" height="40"> <param name="movie" value="http://grooveshark.com/songWidget.swf" /> <param name="wmode" value="window" /> <param name="allowScriptAccess" value="always" /> <param name="flashvars" value="hostname=cowbell.grooveshark.com&widgetID=25096656&style=metal&p=0" /> <embed src="http://grooveshark.com/songWidget.swf" type="application/x-shockwave-flash" width="250" height="40" flashvars="hostname=cowbell.grooveshark.com&widgetID=25096656&style=metal&p=0" allowScriptAccess="always" wmode="window" /></object>
results:
*mystring.rb:207 ---
- value, src
- http://grooveshark.com/songWidget.swf
-
*mystring.rb:207 ---
- value, src
- window
-
*mystring.rb:207 ---
- value, src
- always
-
*mystring.rb:207 ---
- value, src
- hostname=cowbell.grooveshark.com&widgetID=25096656&style=metal&p=0
-
*mystring.rb:207 ---
- value, src
-
- http://grooveshark.com/songWidget.swf
*mystring.rb:457 ---
- after cleaning
- |-
adsf
<object width="250" height="40"> <param name="movie" value="http://grooveshark.com/songWidget.swf">
<param name="wmode" value="window">
<param name="allowScriptAccess" value="always">
<param name="flashvars" value="hostname=cowbell.grooveshark.com&amp;widgetID=25096656&amp;style=metal&amp;p=0">
<embed src="http://grooveshark.com/songWidget.swf" type="application/x-shockwave-flash" width="250" height="40" flashvars="hostname=cowbell.grooveshark.com&amp;widgetID=25096656&amp;style=metal&amp;p=0" allowscriptaccess="always"></embed></object>
Somehow the &'s get replaced with &amp; in the final output, even though the 'value' attribute looks fine when I check it inside the filter and the current node is whitelisted.
Any tips welcome & thanks for a great project!
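For context on the behaviour described: writing `&` as `&amp;` inside an attribute value is standard HTML escaping, and an HTML parser decodes it back to a bare `&` when it reads the attribute. That is presumably why the value looks fine inside the filter but appears escaped in the serialized output. The round trip can be sketched with Ruby's standard `cgi` library standing in for the serializer and the parser. This is an assumption for illustration only; Sanitize itself relies on Nokogiri for parsing and serialization.

```ruby
require 'cgi'

flashvars = 'hostname=cowbell.grooveshark.com&widgetID=25096656&style=metal&p=0'

# What a serializer writes into the attribute: each "&" becomes "&amp;".
serialized = CGI.escapeHTML(flashvars)

# What a parser hands back when it reads the attribute value again.
decoded = CGI.unescapeHTML(serialized)

puts serialized
puts decoded == flashvars
```

If this reading is right, the escaped output is equivalent to the original markup once a browser parses it, and undoing the escaping with `gsub` inside the transformer would have no lasting effect because the serializer escapes the value again.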