@namutaka
Created April 23, 2011 17:12
List all tags in an HTML string.
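#!/usr/bin/perl
use strict;
use warnings;

# Return a flat list of every tag in $str: each self-closing tag, and each
# paired element (opening tag through closing tag) followed by its nested tags.
# Limitation: an opening tag is paired with the first closing tag of the same
# name, so same-name nesting such as <div><div>..</div></div> is not handled.
sub all_tags {
    my ($str) = @_;
    my @ret;
    while ($str =~ /(<[^\s>\/]+[^>]*\/>|<([^\s>\/]+)[^>]*>(.*?)<\/\2>)/gs) {
        # Copy the captures before recursing, since the nested call matches again.
        my ($tag, $inner) = ($1, $3);
        push @ret, $tag;
        # Self-closing tags have no inner content to descend into.
        push @ret, all_tags($inner) if defined $inner;
    }
    return @ret;
}

# Sample run: executes only when this file is run directly.
if ($0 eq __FILE__) {
    my $html = <<'EOF';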
<html xmlns="http://www.w3.org/TR/xhtml1" xml:lang="ja" lang="ja">
<head>
<meta http-equiv="content-type" content="text/html; charset=euc-jp" />
<meta http-equiv="content-style-type" content="text/css" />
<link rel="index" href="http://example.com/" />
<link rel="stylesheet" href="style.css" type="text/css" />
<title>
Page Title
</title>
</head>
<body class="body">
<a href="index.html">Top</a>
<h1>
Head 1
</h1>
</body>
</html>
EOF
    print "*** All Tags \n";
    foreach my $val (all_tags($html)) {
        $val =~ s/\n//g;    # print each tag on a single line
        print "$val \n";
    }
}