IKEGAMI Yukino ikegami-yukino
@ikegami-yukino
ikegami-yukino / unidic_yomi.py
Last active January 24, 2017 05:11
Extract pairs of alphabetic words and their readings from UniDic
import re
import os
import glob
re_pair = re.compile(r'^([ァ-ンー]+)\-([a-zA-Z \'\-\(\)]+)')
UNIDIC_PATH = 'path to UniDic directory'
with open('result.tsv', 'w') as out_fd:
    for csvfile in glob.glob(os.path.join(UNIDIC_PATH, '*.csv')):
        with open(csvfile) as dic_fd:
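The preview above stops before the pattern is applied. As a minimal sketch of how `re_pair` behaves, the following runs it against a single string. The sample value `コンピューター-computer` and the helper name `extract_pair` are assumptions made for illustration; which UniDic CSV column the original script reads is not shown in the preview.

```python
import re

# Same pattern as in the snippet, written as a raw string with double quotes.
re_pair = re.compile(r"^([ァ-ンー]+)-([a-zA-Z '\-()]+)")


def extract_pair(field):
    """Return (word, reading) if the field looks like 'KATAKANA-alphabet', else None.

    `field` is a hypothetical single CSV field; the column index used by the
    original script is not visible in the truncated preview.
    """
    match = re_pair.match(field)
    if match is None:
        return None
    reading, word = match.group(1), match.group(2)
    return word, reading


print(extract_pair('コンピューター-computer'))  # ('computer', 'コンピューター')
print(extract_pair('猫'))  # None
```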
ikegami-yukino / steps.sh
Last active April 30, 2020 04:24 — forked from albertstartup/steps.sh
aws gpu, ubuntu 16.04, nvidia driver 367, cuda 8,
# Required download
# cudnn-8.0-linux-x64-v5.1.tgz
curl -L -o cuda_8.0.44_linux.run https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda_8.0.44_linux-run
curl -L -O http://us.download.nvidia.com/XFree86/Linux-x86_64/367.27/NVIDIA-Linux-x86_64-367.27.run
sudo apt-get install build-essential
sudo apt-get install linux-image-extra-`uname -r`
sudo sh cuda_8.0.44_linux.run
echo -e "export CUDA_HOME=/usr/local/cuda\nexport PATH=\$PATH:\$CUDA_HOME/bin\nexport LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:\$CUDA_HOME/lib64" >> ~/.bashrc
ikegami-yukino / mecab-skkserv.sh
Last active March 21, 2017 17:02
Installing mecab-skkserv on macOS Sierra
wget http://www.chasen.org/~taku/software/mecab-skkserv/mecab-skkserv-0.03.tar.gz
tar xzf mecab-skkserv-0.03.tar.gz
cd mecab-skkserv-0.03
ls *|xargs nkf -w --overwrite
./configure --with-charset=utf8
echo 'cost-factor = 700' >>dicrc
perl -i -ne '$i++; print if ($i != 36 && $i != 37 && $i != 38 && $i != 44 && $i != 45 && $i != 46 && $i != 47 && $i != 48)' mecab-skkserv.cpp
make
make install
ikegami-yukino / install_sentencepiece_on_mac.sh
Last active July 5, 2018 04:21
Install SentencePiece on macOS
brew install autoconf automake libtool protobuf
pushd .
git clone --depth=1 https://github.com/google/sentencepiece.git /tmp/sentencepiece
cd /tmp/sentencepiece
perl -i -pe 's/libtoolize/glibtoolize/' autogen.sh
./autogen.sh
./configure
make
make check
ikegami-yukino / julius.py
Created November 18, 2017 07:37
Python script of Julius on Windows
# coding:utf-8
import re
import socket
import subprocess
import time
HOST = "127.0.0.1"
PORT = 10500
JULIUS_DIR = 'C:\\Program Files (x86)\\julius-4.4.2-win32bin\\'
ikegami-yukino / pixiv_novel.py
Created December 17, 2017 01:48
Crawl Pixiv novels
# -*- coding: utf-8 -*-
import re
from robobrowser import RoboBrowser
PIXIV_BASE_URL = 'https://www.pixiv.net'
TAG = '巴マミ'
MAX_PAGE = 190
browser = RoboBrowser(parser='lxml', history=True)
browser.open('https://accounts.pixiv.net/login')
def quicksort(x):
    if not x:
        return []
    pivot = x[0]
    smaller = quicksort([a for a in x[1:] if a <= pivot])
    bigger = quicksort([a for a in x[1:] if a > pivot])
    return smaller + [pivot] + bigger
def mergesort(l):
    if len(l) > 1:
        mid = len(l) // 2
        left = l[:mid]
        right = l[mid:]
        left = mergesort(left)
        right = mergesort(right)
        i = 0
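The `mergesort` preview is cut off just as the merge step begins. For reference, here is a self-contained sketch of a top-down merge sort. It is not the author's continuation: the function name `merge_sort`, and the choice to return a new list rather than modify the input, are assumptions made for this example.

```python
def merge_sort(items):
    """Return a new sorted list; the input list is left unmodified."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves, always taking the smaller head element.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= keeps equal elements in their original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```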
ikegami-yukino / parse_csj.py
Last active November 22, 2018 04:58
Convert CSJ's xml to plain text
import glob
import html
import re
import sys
import jaconv
re_ogt = re.compile(' OrthographicTranscription="([^"]+)"')
re_a = re.compile(r'\;([^\)]+)\)?')
re_semicolon = re.compile(r';([^\)]+)\)?')
ikegami-yukino / jabstract.py
Created December 14, 2018 17:20 — forked from nakagami/jabstract.py
Japanese summarization module using LexRank algorithm.
#!/usr/bin/env python
# The MIT License (MIT)
# Copyright © 2015 Recruit Technologies Co.,Ltd.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions: