OKA Naoya pn11

How to use GIZA++

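GIZA++ is an alignment tool used in statistical machine translation; it implements IBM Models 1-5 and an HMM alignment model. Here we use it to estimate alignment likelihoods for the English-German parallel corpus distributed as part of the Europarl Parallel Corpus.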
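To make "estimating alignment likelihoods" concrete, here is a minimal sketch of the EM procedure behind IBM Model 1, the simplest of the models GIZA++ implements. It is an illustration only, not GIZA++'s code: the function name `ibm_model1`, the `NULL` token spelling, and the toy sentence pairs are all assumptions made for this example.

```python
from collections import defaultdict


def ibm_model1(pairs, iterations=10):
    """Estimate lexical translation probabilities t(tgt | src) with EM.

    pairs: list of (src_tokens, tgt_tokens) sentence pairs.
    Returns a dict mapping (tgt_word, src_word) -> probability.
    """
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    # Start from a uniform distribution over the target vocabulary.
    t = defaultdict(lambda: 1.0 / len(tgt_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (tgt, src) links
        total = defaultdict(float)  # expected count of src being linked

        # E-step: distribute each target word over all source words.
        for src, tgt in pairs:
            src = ["NULL"] + src  # allow target words to align to nothing
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    c = t[(f, e)] / norm
                    count[(f, e)] += c
                    total[e] += c

        # M-step: renormalize expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]

    return dict(t)


if __name__ == "__main__":
    toy = [
        (["the", "house"], ["das", "Haus"]),
        (["the", "book"], ["das", "Buch"]),
        (["a", "book"], ["ein", "Buch"]),
    ]
    table = ibm_model1(toy)
    for (f, e), p in sorted(table.items(), key=lambda kv: -kv[1])[:5]:
        print(f"t({f} | {e}) = {p:.3f}")
```

On real data such as Europarl you would let GIZA++ do this; the higher-numbered models add distortion and fertility parameters on top of the same lexical table.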

# Preparing GIZA++
$ git clone https://github.com/moses-smt/giza-pp.git
$ cd giza-pp

$ make

	# Beforehand, tabs.json has to be extracted via ADB as follows. (See https://android.stackexchange.com/a/199496/340082 for details.)
	# adb forward tcp:9222 localabstract:chrome_devtools_remote
	# wget -O tabs.json http://localhost:9222/json/list
	import json

	with open('tabs.json') as f:
	    tabs = json.load(f)

	with open('tabs.md', 'w') as f:
	    f.write(f"# {len(tabs)} tabs in your Android Chrome\n\n")
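The preview above stops after the heading is written. As a hedged sketch of how the remaining tabs could be rendered, the helper below turns the parsed list into Markdown bullet links. The name `tabs_to_markdown` is hypothetical, and the sketch assumes each entry carries the `title` and `url` fields that the DevTools `/json/list` endpoint reports; it is not the gist's actual continuation.

```python
def tabs_to_markdown(tabs):
    """Render a list of tab dicts as a Markdown document (hypothetical helper)."""
    lines = [f"# {len(tabs)} tabs in your Android Chrome", ""]
    for tab in tabs:
        url = tab.get("url", "")
        # Fall back to the URL when a tab has no (or an empty) title.
        title = tab.get("title") or url
        lines.append(f"- [{title}]({url})")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [
        {"title": "Example", "url": "https://example.com/"},
        {"title": "", "url": "https://example.org/"},
    ]
    print(tabs_to_markdown(sample))
```

Titles containing `]` would need escaping before this output is fed to a strict Markdown renderer.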

	#!/bin/bash

	# For Mac (Homebrew), use aliases below
	#alias find=gfind
	#alias sed=gsed

	function count_files () {
	  num_files=$(gfind "$1" -maxdepth 1 -type f | wc -l)
	  echo "$1 ${num_files}"

	#!/usr/bin/env python
	# -*- coding: utf-8 -*-
	"""
	A sample program that uses Flask (http://flask.pocoo.org/) as the framework and oauth2 (http://pypi.python.org/pypi/oauth2/) as the OAuth library.
	Save the code below (as oauth_consumer.py, say) and replace YOUR_CONSUMER_KEY and YOUR_CONSUMER_SECRET with your own consumer_key and consumer_secret. (Store them in settings.py.)
	$ python oauth_consumer.py
	... starts the app; then open http://localhost:5000 in a web browser.

	+ 2015/10/25 Rewritten for Python 3.
	"""

	import requests
	import time
	from tqdm import tqdm

	base_url = 'http://xxxxx.xxx/{image_id}.jpg'

	def get_image(image_id):
	    r = requests.get(base_url.format(image_id=image_id))
	    with open(f"{image_id}.jpg", 'wb') as f:
	        f.write(r.content)