@buzztaiki
Created September 12, 2013 13:52
Demo that uses a Lucene analyzer only to tokenize text (for Lucene 3.x)
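package com.arielnetworks.agn;

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.cjk.CJKAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class LuceneTokenizeDemo {
    public static void main(String[] args) throws Exception {
        // Analyzer is Closeable, so try-with-resources (Java 7+) releases it.
        try (Analyzer a = new CJKAnalyzer(Version.LUCENE_36)) {
            // "body" is only a field name here; no index is involved.
            TokenStream ts = a.tokenStream("body", new StringReader("システム第1事業部"));
            // Register the term attribute once, before consuming the stream.
            CharTermAttribute cta = ts.addAttribute(CharTermAttribute.class);
            try {
                ts.reset();                  // prepare the stream for consumption
                while (ts.incrementToken()) {
                    System.out.println(cta); // print the current term text
                }
                ts.end();                    // run end-of-stream operations
            } finally {
                ts.close();                  // release the underlying reader
            }
        }
    }
}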
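
For readers without Lucene on the classpath: CJKAnalyzer is documented as forming bigrams of CJK characters, and the plain-Java sketch below illustrates only that idea of overlapping two-character windows. It is not Lucene's implementation. The class name `BigramSketch` and the helper `bigrams` are hypothetical, and the real analyzer's handling of digits, Latin text, and stop words differs, so its output for the demo string should not be inferred from this. It assumes Java 8+ for `String.codePoints()`.

```java
import java.util.ArrayList;
import java.util.List;

public class BigramSketch {
    // Hypothetical helper: returns every overlapping window of two code points.
    // A single-character input is returned as one token; empty input yields no tokens.
    static List<String> bigrams(String text) {
        int[] cps = text.codePoints().toArray();
        List<String> out = new ArrayList<>();
        if (cps.length == 1) {
            out.add(text);
            return out;
        }
        for (int i = 0; i + 1 < cps.length; i++) {
            out.add(new String(cps, i, 2)); // code points i and i+1
        }
        return out;
    }

    public static void main(String[] args) {
        for (String token : bigrams("事業部")) {
            System.out.println(token);
        }
    }
}
```

Working on code points rather than `char` values keeps a supplementary-plane character from being split across two windows.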