nemupm / make_csv_for_mecab.pl
Last active August 29, 2015 14:06
Make Wikipedia dictionary for MeCab - make_csv_for_mecab
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
# The 'encoding' pragma has been deprecated since Perl 5.18; set UTF-8 I/O layers instead.
use open qw(:std :utf8);
use MeCab;
my $model = MeCab::Model->new(join " ", @ARGV);
my $c = $model->createTagger();
nemupm / txt_regular_formatter.py
Last active August 29, 2015 14:06
Make Wikipedia dictionary for MeCab - txt_regular_formatter
#!/usr/bin/python
# -*- coding:utf-8 -*-
from unicodedata import normalize
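def convert_str_to_regular_format(string):
    # Decode the UTF-8 byte string (Python 2 str), apply NFKC normalization,
    # lowercase it, and return it re-encoded as UTF-8 bytes.
    uni = normalize('NFKC', string.decode('utf-8')).lower()
    return uni.encode('utf-8')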
f = open("jawiki-latest-all-titles-in-ns0","r")
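The normalization step in txt_regular_formatter.py can be tried on its own. The following is a minimal sketch, assuming Python 3, where `str` is already Unicode and the explicit `decode`/`encode` calls are not needed. The function name `to_regular_format` and the sample titles are hypothetical and not part of the gist.

```python
from unicodedata import normalize


def to_regular_format(text: str) -> str:
    """Apply NFKC normalization, then lowercase (hypothetical Python 3 helper)."""
    # NFKC folds compatibility forms such as full-width Latin letters and
    # half-width katakana into their canonical equivalents.
    return normalize("NFKC", text).lower()


if __name__ == "__main__":
    # Sample titles are illustrative only, not taken from the Wikipedia dump.
    for title in ["ＭｅＣａｂ", "ｳｨｷﾍﾟﾃﾞｨｱ"]:
        print(to_regular_format(title))
```

Normalizing before building the dictionary means that full-width and half-width variants of the same title collapse into a single entry.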