Yusuke Oda odashi

@odashi
odashi / gameoflife.cc
Last active June 6, 2017 21:27
Game of life on X11
#include <chrono>
#include <cstdlib>
#include <iostream>
#include <random>
#include <thread>
#include <X11/Xlib.h>
#include <X11/Xutil.h>
using namespace std;
@odashi
odashi / gameoflife.py
Last active June 1, 2017 15:53
The console game of life
#!/usr/bin/env python3
import curses
from random import random
from time import sleep
def main(scr):
    curses.init_pair(1, curses.COLOR_BLACK, curses.COLOR_BLACK)
    curses.init_pair(2, curses.COLOR_BLACK, curses.COLOR_BLACK)
    curses.init_pair(3, curses.COLOR_WHITE, curses.COLOR_WHITE)
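The preview stops at the curses colour setup, so the update rule itself is not visible here. As a rough illustration of what a Game of Life generation step looks like, here is a minimal sketch. It assumes an unbounded grid stored as a set of live `(row, col)` cells; the name `step` and this representation are hypothetical and are not taken from the gist.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (row, col) live cells.

    Hypothetical helper for illustration only, not the gist's code.
    """
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is alive next turn with exactly 3 neighbours,
    # or with 2 neighbours if it is already alive.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


# A horizontal "blinker" flips to vertical and back again.
blinker = {(1, 0), (1, 1), (1, 2)}
flipped = step(blinker)
```

A set-based grid keeps the sketch short; a terminal version such as the one previewed would more likely use a fixed-size array matching the screen dimensions.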
@odashi
odashi / apply_byte_pair_encoding.py
Last active August 15, 2017 12:23
Byte-pair encoding tools
#!/usr/bin/env python3
import sys
from argparse import ArgumentParser
from collections import defaultdict
def parse_args():
    p = ArgumentParser('Converts word to integer using byte-pair encoding.')
    p.add_argument(
        '--input',
@odashi
odashi / generators.py
Created March 23, 2016 13:08
Frequently-used batch generators for my NLP study.
import builtins
import random
def word_list(filename):
    with open(filename) as fp:
        for l in fp:
            yield l.split()
def batch(generator, size):
    batch = []
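The preview cuts off right after `batch = []`, so the body of the gist's `batch` function is not shown. One common way to group a generator's items into fixed-size lists is sketched below. The name `make_batches` is hypothetical, and keeping the final short batch is an assumption, not necessarily what the gist does.

```python
def make_batches(items, size):
    """Yield lists of up to `size` consecutive items from any iterable.

    Hypothetical helper for illustration only, not the gist's code.
    """
    buf = []
    for item in items:
        buf.append(item)
        if len(buf) == size:
            yield buf
            buf = []
    # Assumption: the last, possibly shorter, batch is kept rather than dropped.
    if buf:
        yield buf


batches = list(make_batches(range(5), 2))
```

Because it is itself a generator, this composes with a line-by-line reader such as `word_list` without loading the whole file into memory.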
@odashi
odashi / mert.py
Last active May 1, 2016 14:17
Minimum error-rate training for statistical machine translation
#!/usr/bin/python3
import math
import random
import sys
from argparse import ArgumentParser
from collections import defaultdict
from util.functions import trace
def parse_args():
@odashi
odashi / bleu.py
Last active September 20, 2019 06:46
BLEU calculator
# usage (single sentence):
#   ref = ['This', 'is', 'a', 'pen', '.']
#   hyp = ['There', 'is', 'a', 'pen', '.']
#   stats = get_bleu_stats(ref, hyp)
#   bleu = calculate_bleu(stats) # => 0.668740
#
# usage (multiple sentences):
#   stats = defaultdict(int)
#   for ref, hyp in zip(refs, hyps):
#     for k, v in get_bleu_stats(ref, hyp).items():
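The bodies of `get_bleu_stats` and `calculate_bleu` are not visible in the preview. The sketch below is a standard BLEU computation (n-gram precisions up to 4-grams, geometric mean, brevity penalty) that happens to reproduce the 0.668740 figure from the usage comment. The function names, the `max_n` parameter, and the key layout of the stats are all hypothetical and may differ from the gist.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu_stats_sketch(ref, hyp, max_n=4):
    """Collect additive statistics for one sentence pair (hypothetical key layout)."""
    stats = Counter()
    stats['ref_len'] = len(ref)
    stats['hyp_len'] = len(hyp)
    for n in range(1, max_n + 1):
        ref_counts = ngram_counts(ref, n)
        hyp_counts = ngram_counts(hyp, n)
        # Clipped matches: each hypothesis n-gram counts at most as often as in the reference.
        stats[('match', n)] = sum((ref_counts & hyp_counts).values())
        stats[('total', n)] = sum(hyp_counts.values())
    return stats


def bleu_from_stats_sketch(stats, max_n=4):
    """Geometric mean of n-gram precisions times the brevity penalty."""
    log_precision = 0.0
    for n in range(1, max_n + 1):
        match, total = stats[('match', n)], stats[('total', n)]
        if match == 0 or total == 0:
            return 0.0
        log_precision += math.log(match / total)
    log_precision /= max_n
    # Log of the brevity penalty: zero unless the hypothesis is shorter than the reference.
    log_bp = min(0.0, 1.0 - stats['ref_len'] / stats['hyp_len'])
    return math.exp(log_precision + log_bp)


ref = ['This', 'is', 'a', 'pen', '.']
hyp = ['There', 'is', 'a', 'pen', '.']
bleu = bleu_from_stats_sketch(bleu_stats_sketch(ref, hyp))
```

For this pair the precisions are 4/5, 3/4, 2/3 and 1/2, whose product is 0.2, and the lengths are equal, so the score is 0.2 to the power 1/4. Because the statistics are plain counts, summing them over many sentence pairs before the final call gives a corpus-level score, which matches the multi-sentence usage pattern in the comment.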
@odashi
odashi / StanfordTokenizerRunner.java
Created January 26, 2016 20:22
A wrapper that forces the Stanford Tokenizer to process input one line at a time. Usable for communication over pipes.
import java.io.*;
import java.util.*;
import edu.stanford.nlp.ling.Word;
import edu.stanford.nlp.process.WordTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;
public class StanfordTokenizerRunner {
    private static List<String> tokenize(String text) {
        PTBTokenizer<Word> tokenizer = new PTBTokenizer<Word>(
@odashi
odashi / 1-to-many.txt
Last active September 11, 2015 09:57
Mappings from a single Japanese character to a single English word, extracted by analyzing EIJIRO and WWWJDIC.
27 垓 100,000,000,000,000,000,000
22 京 10,000,000,000,000,000
18 根 stick-to-itiveness
18 闇 black-marketeering
17 兆 1,000,000,000,000
17 悦 self-satisfaction
16 訛 mispronunciation
16 識 acquaintanceship
16 矩 perpendicularity
16 鎌 sickle-and-chain
@odashi
odashi / chainer_encoder_decoder.py
Last active January 2, 2025 19:25
Training and generation processes for neural encoder-decoder machine translation.
#!/usr/bin/python3
import datetime
import sys
import math
import numpy as np
from argparse import ArgumentParser
from collections import defaultdict
from chainer import FunctionSet, Variable, functions, optimizers
@odashi
odashi / chainer_rnnlm.py
Created August 24, 2015 18:03
An RNN language model trainer using Chainer
#!/usr/bin/python3
# RNNLM trainer
# date: 2015-8-25
# author: @odashi_t
import datetime
import sys
import math
import numpy as np