@igaiga
Created February 24, 2011 04:24
Roughly search for Japanese text in files
# -*- coding: utf-8 -*-
# Oniguruma reference
# http://www.geocities.jp/kosako3/oniguruma/doc/RE.ja.txt
require 'kconv'
Dir.glob('**/*.*') { |f|
  next unless File.file?(f) # skip directories whose names contain a dot
  puts "■file: #{f}"
  File.open(f, "r") { |file_handle|
    contents = file_handle.read
    encoding = Kconv.guess(contents)
    contents.force_encoding(encoding)
    begin
      contents.encode!('utf-8')
    rescue
      puts "#{f}:encoding=#{encoding}: error in encode!"
      break # leave the File.open block and move on to the next file
    end
    contents.lines.each_with_index do |l, i|
      # Report every line that contains any non-ASCII character.
      puts "#{f}:#{i+1}: #{l}" unless l =~ /^[\p{ASCII}]*$/
      # puts "#{f}:#{i+1}: #{l}" if l =~ /\p{hiragana}|\p{katakana}|\p{han}/
      # ↑ misses full-width symbols
    end
  }
}