@stevef
Created April 28, 2011 17:38
irb output using sanitize gem
{11-04-28 15:35}[jruby-1.5.6]fortress:~/Sandbox/ruby sf% cat clean_bad_html.rb
require 'rubygems'
require 'sanitize'
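# Elements allowed to survive sanitizing; every other tag is stripped.
white_list_elements = %w[
  a b i em strong dfn code q samp kbd var cite abbr
  acronym sub sup dl ul ol li blockquote p h1 h2 h3
  h4 h5 h6 pre table tr th td img br
]
# Attributes allowed per element; anything else (onclick, style, class) is dropped.
white_list_attributes = {
  'a'   => ['href', 'title'],
  'img' => ['alt', 'src']
}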
bad_html = <<-EOF
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <HTML> <HEAD>  <TITLE> New Document </TITLE>  <META NAME="Generator" CONTENT="EditPlus">  <META NAME="Author" CONTENT="">  <META NAME="Keywords" CONTENT="">  <META NAME="Description" CONTENT="">  <script src="linked_js.js" type="text/javascript"/> <STYLE TYPE="text/css"> .style1{font-size:18px} </STYLE> <link rel="StyleSheet" href="http://povod.tut.by/css/reset.css" type="text/css" media="all" /> </HEAD> <BODY>  <A HREF="#anchor" onclick="function_name()" onmouseover="function_name()" onfocus="function_name()">Sanitize</A> <SPAN Style="font-weight:bold">tool</SPAN> <SPAN CLASS="style1">test</SPAN><A HREF="relative.html">again</A>  <strong></strong>  <SCRIPT LANGUAGE="JavaScript">  <!-- function function_name () { alert(1); }  //-->  </SCRIPT>  <APPLET CODE="" WIDTH="" HEIGHT="">  </APPLET> <OBJECT ID="BtHtmlList Class" WIDTH="" HEIGHT="" CLASSID="CLSID:3E7FDC60-80A8-4563-A562-0DC01ECF077C"> </OBJECT> <OBJECT ID="GomPlayerX Control" WIDTH="" HEIGHT="" CLASSID="CLSID:632CC9D6-5602-4854-AFD2-6EFC59177DE5"> </OBJECT> <OBJECT ID="Macromedia Flash Factory Object" WIDTH="" HEIGHT="" CLASSID="CLSID:D27CDB70-AE6D-11cf-96B8-444553540000"> </OBJECT> <embed src="fkash1.swf"></embed> </BODY> </HTML>
EOF
puts Sanitize.clean(bad_html, :elements => white_list_elements, :attributes => white_list_attributes)
{11-04-28 15:36}[jruby-1.5.6]fortress:~/Sandbox/ruby sf% ruby clean_bad_html.rb
  New Document           .style1{font-size:18px}  <a href="#anchor">Sanitize</a> tool test<a href="relative.html">again</a>  <strong></strong>    &lt;!-- function function_name () { alert(1); }  //--&gt;
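
The output above keeps `href` on the `<a>` tags but drops `onclick`, `onmouseover`, and `onfocus`, and removes the `<span>` tags while keeping their text. The rule behind that can be sketched in plain Ruby. This is only an illustration of the allowlist lookup implied by the `:elements` and `:attributes` options, not how the sanitize gem is implemented; `allowed_element?`, `allowed_attribute?`, and `filter_attributes` are hypothetical helpers, and the element list is shortened for brevity.

```ruby
# Hypothetical helpers, NOT part of the sanitize gem: they only mimic the
# allowlist lookup implied by the :elements and :attributes options.
ALLOWED_ELEMENTS = %w[a b i em strong img br p].freeze # shortened list
ALLOWED_ATTRIBUTES = {
  'a'   => %w[href title],
  'img' => %w[alt src]
}.freeze

# True when the element itself may appear in the output (case-insensitive).
def allowed_element?(element)
  ALLOWED_ELEMENTS.include?(element.downcase)
end

# True when the attribute may stay on the given element.
def allowed_attribute?(element, attribute)
  return false unless allowed_element?(element)
  ALLOWED_ATTRIBUTES.fetch(element.downcase, []).include?(attribute.downcase)
end

# Keeps only the permitted attributes from a hash of name => value pairs.
def filter_attributes(element, attributes)
  attributes.select { |name, _value| allowed_attribute?(element, name) }
end
```

Calling `filter_attributes('A', { 'HREF' => '#anchor', 'onclick' => 'function_name()' })` keeps only the `HREF` pair, which matches what happens to the first link in the transcript.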