@kaityo256
Created April 25, 2016 09:45
Script for finding isolated ("island") HTML files ref: http://qiita.com/kaityo256/items/320e31f1200f73706cea
$ cd public_html
$ ruby linkchecker.rb
/path/to/public_html/foo.html is isolated.
/path/to/public_html/bar.html is isolated.
# Copyright kaityo256 2016
# Distributed under the Boost Software License, Version 1.0.
# (See accompanying file LICENSE_1_0.txt or copy at
# http://www.boost.org/LICENSE_1_0.txt)
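# Record in h the absolute path of every local HTML file that filename links to.
def findlink(filename, h)
  dir = File.dirname(filename)
  # File.read returns "" for an empty file, so scan is always safe to call.
  File.read(filename).scan(/a href="(.*?)"/).each do |a|
    s = a[0]
    next if s =~ /^http/        # skip external links
    next if s =~ /^#/           # skip anchors within the same page
    s = $1 if s =~ /(.*?html)#/ # drop a trailing #fragment
    next if s !~ /html$/        # only HTML targets are tracked
    # Resolve the link relative to the directory of the linking file.
    s = File.expand_path(s, dir)
    h[s] = true
  end
end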
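h = {}     # absolute paths that some page links to
files = [] # absolute paths of all HTML files found
# Collect every HTML file under the current directory and scan its links.
Dir.glob('./**/*').each do |filename|
  next unless filename =~ /html$/ && File.file?(filename)
  full = File.expand_path(filename)
  files.push full
  findlink(full, h)
end
# Report each file that no scanned page links to.
files.each do |f|
  puts "#{f} is isolated." unless h.has_key?(f)
end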
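The filtering rules inside findlink are easier to check in isolation. The following is a minimal sketch, not part of the original script: `normalize_link` is a hypothetical helper that applies the same rules to a single href and returns the absolute path that would be recorded, or nil when the link is skipped. The directory `/path/to/public_html` is only a placeholder.

```ruby
# Hypothetical helper (not in the original script): applies the same filter
# rules as findlink to one href. Returns an absolute path, or nil if skipped.
def normalize_link(href, dir)
  return nil if href =~ /^http/      # external link
  return nil if href =~ /^#/         # anchor within the same page
  href = $1 if href =~ /(.*?html)#/  # drop a trailing #fragment
  return nil if href !~ /html$/      # only HTML targets are tracked
  File.expand_path(href, dir)        # resolve relative to the linking file
end

dir = "/path/to/public_html"         # placeholder directory
["foo.html", "sub/bar.html#top", "../index.html",
 "http://example.com/a.html", "#section", "style.css"].each do |href|
  puts "#{href.inspect} -> #{normalize_link(href, dir).inspect}"
end
```

Links that come back as nil never enter the hash, so they cannot rescue a file from being reported as isolated. Note that the `/^http/` rule also skips a relative link whose name merely begins with "http".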