@tomonari-masada
Last active April 11, 2017 03:21
How to use MeCab in Python3
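import sys
import io
import MeCab

# Force UTF-8 output so Japanese text prints correctly whatever the console's default encoding is.
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')

m = MeCab.Tagger()
wseq = list()
# a1.txt is the UTF-8 encoded input text; the with block closes the file when done.
with open('a1.txt', encoding='utf8') as fp:
    for line in fp:
        # Each output row is "surface<TAB>features"; the final "EOS" row has no features.
        for ol in m.parse(line.strip()).split('\n'):
            if len(ol.split()) > 1:
                # Feature field 6 is the base form (assuming an IPADIC-style dictionary).
                wseq.append(ol.split()[1].split(',')[6])

# Count how often each base form occurs.
wdic = dict()
for word in wseq:
    wdic[word] = wdic.get(word, 0) + 1
print(wdic)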
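The parsing and counting steps can also be tried without MeCab installed. The sketch below is not part of the original gist: `SAMPLE_OUTPUT` is a hand-written illustration of IPADIC-style output, and `base_forms` is a hypothetical helper. Unlike the script above, it falls back to the surface form when the base-form field is missing or `*`, which is an assumption about how unknown words might be handled.

```python
from collections import Counter

# Hand-written illustration of IPADIC-style MeCab output (an assumption,
# not captured from a real run): "surface<TAB>comma-separated features".
SAMPLE_OUTPUT = (
    "すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\n"
    "も\t助詞,係助詞,*,*,*,*,も,モ,モ\n"
    "もも\t名詞,一般,*,*,*,*,もも,モモ,モモ\n"
    "も\t助詞,係助詞,*,*,*,*,も,モ,モ\n"
    "EOS\n"
)


def base_forms(parsed, index=6):
    """Hypothetical helper: extract base forms from MeCab-style output."""
    forms = []
    for row in parsed.split("\n"):
        surface, sep, features = row.partition("\t")
        if not sep:
            continue  # skip "EOS" and empty rows, which contain no tab
        fields = features.split(",")
        if len(fields) > index and fields[index] != "*":
            forms.append(fields[index])
        else:
            # Assumed fallback: use the surface form when no base form is given.
            forms.append(surface)
    return forms


counts = Counter(base_forms(SAMPLE_OUTPUT))
print(counts)
```

`Counter` replaces the manual `wdic.get(word, 0) + 1` loop, and `counts.most_common(10)` would list the ten most frequent base forms.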