@komiya-atsushi
Created March 16, 2012 08:01
A thin wrapper around MeCab's Ruby binding. I wrote it solely so I could loop over the results with each, the Ruby way.
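# -*- coding: utf-8 -*-
require 'MeCab'

# Thin wrapper around the MeCab Ruby binding that lets callers iterate
# over the morphemes of a parse result with each.
class MeCabTagger
  # One morpheme. The field positions below assume an IPADIC-style
  # comma-separated feature string.
  class MeCabNode
    def initialize(node, charset)
      @node = node
      @charset = charset
      # Tag the raw feature string with the dictionary charset before splitting.
      @feature = node.feature.force_encoding(charset).split(',')
    end

    # Part of speech.
    def pos
      @feature[0]
    end

    # Part-of-speech subcategories 1 to 3.
    def pos_detail1
      @feature[1]
    end

    def pos_detail2
      @feature[2]
    end

    def pos_detail3
      @feature[3]
    end

    # Base (dictionary) form as stored in the feature string; may be "*".
    def basic_string
      @feature[6]
    end

    # Surface form as it appears in the input text.
    def surface
      @node.surface.force_encoding(@charset)
    end

    # Base form, falling back to the surface form when the dictionary has
    # none ("*") or the feature string has fewer fields than expected.
    def word
      base = @feature[6]
      base.nil? || base == '*' ? surface : base
    end
  end

  # Parse result that walks MeCab's linked list of nodes.
  class MeCabResult
    include Enumerable

    def initialize(head, charset)
      @head = head
      @charset = charset
    end

    # Yields every morpheme between the BOS and EOS sentinel nodes.
    def each
      return enum_for(:each) unless block_given?

      node = @head.next
      while node && node.next
        yield MeCabNode.new(node, @charset)
        node = node.next
      end
    end
  end

  def initialize(charset)
    @tagger = MeCab::Tagger.new
    @charset = charset
  end

  # Returns an enumerable of MeCabNode objects (an empty array for empty input).
  def parse(text)
    return [] if text.nil? || text.empty?

    MeCabResult.new(@tagger.parseToNode(text), @charset)
  end
end

if __FILE__ == $0
  tagger = MeCabTagger.new('UTF-8')
  tagger.parse('吾輩は猫である。名前はまだない。').each do |node|
    puts "pos : #{node.pos}, word : #{node.word}"
  end
end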
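
The core idea of the wrapper, turning MeCab's BOS-to-EOS linked list into something that responds to each, can be tried without MeCab installed. The following is a minimal sketch under stated assumptions: FakeNode, build_chain, and NodeList are hypothetical names that are not part of the gist or of the MeCab binding, and the feature strings are hand-written in IPADIC style rather than captured from a real MeCab run. It also mixes in Enumerable, so map and select come along with each.

```ruby
# Hypothetical stand-in for a MeCab node: it only mimics the three
# accessors the wrapper relies on (surface, feature, next).
FakeNode = Struct.new(:surface, :feature, :next)

# Builds a BOS -> morphemes -> EOS chain from [surface, feature] pairs.
def build_chain(pairs)
  eos = FakeNode.new('', 'BOS/EOS,*,*,*,*,*,*,*,*', nil)
  first = pairs.reverse.inject(eos) do |following, (surface, feature)|
    FakeNode.new(surface, feature, following)
  end
  FakeNode.new('', 'BOS/EOS,*,*,*,*,*,*,*,*', first)
end

# Wraps the chain so callers get each plus everything Enumerable offers.
class NodeList
  include Enumerable

  def initialize(head)
    @head = head
  end

  def each
    return enum_for(:each) unless block_given?

    node = @head.next
    # Stop at the EOS sentinel, which is the only node whose next is nil.
    while node && node.next
      yield node
      node = node.next
    end
  end
end

# Illustrative, hand-written feature strings (not real MeCab output).
chain = build_chain([
  ['猫', '名詞,一般,*,*,*,*,猫,ネコ,ネコ'],
  ['で', '助動詞,*,*,*,特殊・ダ,連用形,だ,デ,デ'],
  ['ある', '助動詞,*,*,*,五段・ラ行アル,基本形,ある,アル,アル']
])

nodes = NodeList.new(chain)
surfaces = nodes.map(&:surface)
nouns = nodes.select { |n| n.feature.start_with?('名詞') }.map(&:surface)

puts surfaces.join(' / ')
puts nouns.inspect
```

Because each returns an Enumerator when called without a block, the same object can be chained lazily or passed to code that expects any enumerable, which is the main reason to prefer this shape over a hand-rolled while loop at every call site.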