@mike-neck
Created February 29, 2012 02:15
Trying out Kuromoji, a morphological analyzer written in Java.
// Fetch Kuromoji from the Atilika Maven repository via Grape.
@GrabResolver (
name='ATILIKA dependencies',
root='http://www.atilika.org/nexus/content/repositories/atilika')
@Grab ('org.atilika.kuromoji:kuromoji:0.7.6')
import org.atilika.kuromoji.Token
import org.atilika.kuromoji.Tokenizer
import org.atilika.kuromoji.Tokenizer.Mode

// Build a tokenizer with the default settings.
def tokenizer = Tokenizer.builder().build()
// Tokenize a well-known Japanese tongue twister.
def tokens = tokenizer.tokenize('すもももももももものうち。')
// Print every property of each token, one block per token.
tokens.each {
    println "======================${it}========================"
    println "allFeatures : ${it.allFeatures}"
    println "partOfSpeech : ${it.partOfSpeech}"
    println "position : ${it.position}"
    println "reading : ${it.reading}"
    println "surfaceFrom : ${it.surfaceForm}"
    println "allFeaturesArray : ${it.allFeaturesArray.join(', ')}"
    println "known : ${it.known}"
    println "unknown : ${it.unknown}"
    println "user defined : ${it.user}"
}
======================org.atilika.kuromoji.Token@b317ad9========================
allFeatures : 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
partOfSpeech : 名詞,一般,*,*
position : 0
reading : スモモ
surfaceFrom : すもも
allFeaturesArray : 名詞, 一般, *, *, *, *, すもも, スモモ, スモモ
known : true
unknown : false
user defined : false
======================org.atilika.kuromoji.Token@5d78424c========================
allFeatures : 助詞,係助詞,*,*,*,*,も,モ,モ
partOfSpeech : 助詞,係助詞,*,*
position : 3
reading : モ
surfaceFrom : も
allFeaturesArray : 助詞, 係助詞, *, *, *, *, も, モ, モ
known : true
unknown : false
user defined : false
======================org.atilika.kuromoji.Token@248bb85========================
allFeatures : 名詞,一般,*,*,*,*,もも,モモ,モモ
partOfSpeech : 名詞,一般,*,*
position : 4
reading : モモ
surfaceFrom : もも
allFeaturesArray : 名詞, 一般, *, *, *, *, もも, モモ, モモ
known : true
unknown : false
user defined : false
======================org.atilika.kuromoji.Token@750f19ee========================
allFeatures : 助詞,係助詞,*,*,*,*,も,モ,モ
partOfSpeech : 助詞,係助詞,*,*
position : 6
reading : モ
surfaceFrom : も
allFeaturesArray : 助詞, 係助詞, *, *, *, *, も, モ, モ
known : true
unknown : false
user defined : false
======================org.atilika.kuromoji.Token@326f944c========================
allFeatures : 名詞,一般,*,*,*,*,もも,モモ,モモ
partOfSpeech : 名詞,一般,*,*
position : 7
reading : モモ
surfaceFrom : もも
allFeaturesArray : 名詞, 一般, *, *, *, *, もも, モモ, モモ
known : true
unknown : false
user defined : false
======================org.atilika.kuromoji.Token@3b712372========================
allFeatures : 助詞,連体化,*,*,*,*,の,ノ,ノ
partOfSpeech : 助詞,連体化,*,*
position : 9
reading : ノ
surfaceFrom : の
allFeaturesArray : 助詞, 連体化, *, *, *, *, の, ノ, ノ
known : true
unknown : false
user defined : false
======================org.atilika.kuromoji.Token@ce2fdb========================
allFeatures : 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
partOfSpeech : 名詞,非自立,副詞可能,*
position : 10
reading : ウチ
surfaceFrom : うち
allFeaturesArray : 名詞, 非自立, 副詞可能, *, *, *, うち, ウチ, ウチ
known : true
unknown : false
user defined : false
======================org.atilika.kuromoji.Token@60a7d346========================
allFeatures : 記号,句点,*,*,*,*,。,。,。
partOfSpeech : 記号,句点,*,*
position : 12
reading : 。
surfaceFrom : 。
allFeaturesArray : 記号, 句点, *, *, *, *, 。, 。, 。
known : true
unknown : false
user defined : false
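
The `allFeatures` strings in the output above follow the nine-column IPADIC layout: part of speech, three levels of part-of-speech subcategory, conjugation type, conjugation form, base form, reading, and pronunciation, with `*` standing for an empty column. As a minimal sketch of how that string could be turned into named fields, the closure below splits it into a map. `parseFeatures` and the field names are hypothetical labels made up for this example, not part of the Kuromoji API, and the sketch assumes no column value contains a comma.

```groovy
// Hypothetical labels for the nine comma-separated columns of an
// IPADIC-style feature string (these are NOT Kuromoji API names).
def FIELD_NAMES = ['pos', 'posDetail1', 'posDetail2', 'posDetail3',
                   'conjugationType', 'conjugationForm',
                   'baseForm', 'reading', 'pronunciation']

// Hypothetical helper: map one allFeatures string to name -> value.
// A '*' column, or a column missing from a shorter string, becomes null.
def parseFeatures = { String allFeatures ->
    def values = allFeatures.split(',')
    def result = [:]
    FIELD_NAMES.eachWithIndex { name, i ->
        def value = i < values.length ? values[i] : null
        result[name] = (value == '*') ? null : value
    }
    result
}

// Usage with the feature string of the "うち" token from the output above.
def features = parseFeatures('名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ')
println "pos        : ${features.pos}"
println "posDetail2 : ${features.posDetail2}"
println "baseForm   : ${features.baseForm}"
println "reading    : ${features.reading}"
```

A map keyed by name keeps later lookups such as `features.reading` readable, at the cost of hard-coding the column order; the script's own `allFeaturesArray` property gives the same values by index.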