Motoki Wu (tokestermw)

tokestermw / tf_ed_vi_tutorial.py
Last active July 19, 2019 01:18
Variational inference and Bayesian deep learning tutorial (w/ uncertainty intervals) using TensorFlow and Edward.
""" Some description.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sys
import json
import tqdm
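
For context, a minimal sketch of the kind of model the tutorial builds (assuming Edward 1.x on TF 1.x; the data here is synthetic toy data, not from the gist): Bayesian linear regression fit with ed.KLqp, where the fitted posterior scales give the uncertainty intervals.

import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Normal

# synthetic toy data (illustrative only)
x_train = np.random.randn(50, 1).astype(np.float32)
y_train = (2.0 * x_train[:, 0] + 0.1 * np.random.randn(50)).astype(np.float32)

X = tf.placeholder(tf.float32, [50, 1])
w = Normal(loc=tf.zeros(1), scale=tf.ones(1))   # prior over weights
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))   # prior over bias
y = Normal(loc=ed.dot(X, w) + b, scale=0.1 * tf.ones(50))

# variational posterior, parameterized by free variables
qw = Normal(loc=tf.Variable(tf.zeros(1)), scale=tf.nn.softplus(tf.Variable(tf.zeros(1))))
qb = Normal(loc=tf.Variable(tf.zeros(1)), scale=tf.nn.softplus(tf.Variable(tf.zeros(1))))

inference = ed.KLqp({w: qw, b: qb}, data={X: x_train, y: y_train})
inference.run(n_iter=500)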
tokestermw / pdftotext_w_japanese.sh
Last active May 7, 2018 09:26
Make pdftotext compatible with Japanese text on macOS.
# -- install XQuartz (a dependency of xpdf)
brew install Caskroom/cask/xquartz
# -- install xpdf
brew install xpdf
# -- download the Japanese language support package
wget ftp://ftp.foolabs.com/pub/xpdf/xpdf-japanese.tar.gz
# -- unpack it; the package's add-to-xpdfrc file lists the lines to append to your xpdfrc
tar -xzf xpdf-japanese.tar.gz
tokestermw / tf_dataset_api_text.py
Last active December 18, 2018 06:32
Using the Dataset API introduced in TensorFlow 1.2.0, return padded and batched tensors from text data where each line is a sentence.
import numpy as np
import tensorflow as tf
_major_version, _minor_version, _ = map(int, tf.__version__.split('-')[0].split('.'))
# tuple comparison, so e.g. 2.0 also passes (a bare "major >= 1 and minor >= 2" would not)
assert (_major_version, _minor_version) >= (1, 2), "requires TensorFlow 1.2.0 and above"
text_data_path = "./z_sentences.txt"
MAX_SEQUENCE_LENGTH = 10
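
A compressed sketch of the pattern (assuming TF >= 1.4, where the API lives under tf.data; in 1.2 the same classes live under tf.contrib.data): read lines, tokenize, truncate, and pad into batches.

import tensorflow as tf

text_data_path = "./z_sentences.txt"   # same toy file as above
MAX_SEQUENCE_LENGTH = 10

dataset = tf.data.TextLineDataset(text_data_path)
dataset = dataset.map(lambda line: tf.string_split([line]).values)   # whitespace tokenize
dataset = dataset.map(lambda tokens: tokens[:MAX_SEQUENCE_LENGTH])   # truncate
dataset = dataset.padded_batch(4, padded_shapes=[None])              # pad to longest in batch
iterator = dataset.make_one_shot_iterator()
next_batch = iterator.get_next()   # [batch, max_len_in_batch] string tensor

with tf.Session() as sess:
    print(sess.run(next_batch))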
tokestermw / self_attention.py
Last active March 3, 2025 11:36
TensorFlow implementation of the self-attention mechanism from the paper "Attention Is All You Need".
"""Example TensorFlow code for Self-Attention mechanism.
Refs:
Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
https://arxiv.org/abs/1706.03762
Transformer: A Novel Neural Network Architecture for Language Understanding
https://research.googleblog.com/2017/08/transformer-novel-neural-network.html
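
The core of the gist is scaled dot-product attention; a minimal self-contained sketch (TF 1.x, single head, no masking or learned projections):

import tensorflow as tf

def scaled_dot_product_attention(q, k, v):
    # q, k, v: [batch, time, depth]; scores: [batch, time, time]
    d_k = tf.cast(tf.shape(k)[-1], tf.float32)
    scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(d_k)
    weights = tf.nn.softmax(scores)   # softmax over the last (key) axis
    return tf.matmul(weights, v)      # weighted sum of values

# self-attention: queries, keys, and values all come from the same input
x = tf.random_normal([2, 7, 16])
attended = scaled_dot_product_attention(x, x, x)   # [2, 7, 16]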
tokestermw / beam_search.py
Created November 13, 2017 23:31
Simple attempt at beam search.
import numpy as np
import heapq
VOCAB_SIZE = 1000
HIDDEN_DIM = 128
vocab = {
    'the': 5,
    'fox': 35,
    'jumped': 144,
}
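
A minimal sketch of the algorithm itself (step_fn is a hypothetical callable returning a next-token distribution; a real decoder network would go there):

import heapq
import numpy as np

VOCAB_SIZE = 1000

def beam_search(step_fn, start_token, eos_token, beam_width=3, max_len=10):
    # each hypothesis is (cumulative negative log-prob, token sequence)
    beams = [(0.0, [start_token])]
    for _ in range(max_len):
        candidates = []
        for neg_logp, seq in beams:
            if seq[-1] == eos_token:
                candidates.append((neg_logp, seq))  # finished hypothesis
                continue
            probs = step_fn(seq)  # next-token distribution over VOCAB_SIZE
            for tok in np.argpartition(-probs, beam_width)[:beam_width]:
                candidates.append((neg_logp - np.log(probs[tok]), seq + [int(tok)]))
        beams = heapq.nsmallest(beam_width, candidates)  # keep the best beams
    return beams[0][1]

# toy usage with a uniform "model"
random_step = lambda seq: np.full(VOCAB_SIZE, 1.0 / VOCAB_SIZE)
print(beam_search(random_step, start_token=0, eos_token=1))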
tokestermw / cantor_set.py
Created November 28, 2017 00:32
Implementation of the Cantor set, as explained here: http://natureofcode.com/book/chapter-8-fractals/
from copy import deepcopy


class Line:
    def __init__(self, length: int, x: int):
        self.length = length
        self.x = x

    def __len__(self):
        return self.length
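
One possible recursion step, building on the Line class above (this is a sketch, not necessarily the gist's exact logic): each line keeps only its left and right thirds.

def cantor_step(lines):
    # replace each line with its left and right thirds; the middle third is removed
    out = []
    for line in lines:
        third = len(line) // 3
        out.append(Line(third, line.x))
        out.append(Line(third, line.x + 2 * third))
    return out

lines = [Line(27, 0)]
for _ in range(3):
    lines = cantor_step(lines)
print([(line.x, len(line)) for line in lines])   # 8 segments of length 1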
tokestermw / machine_learned_index.py
Last active April 9, 2019 08:41
Using deep learning to approximate a B-Tree index from this paper: https://arxiv.org/abs/1712.01208 (The Case for Learned Index Structures)
import os
import math
import random

import click
import torch
import torch.autograd
import torch.nn.functional as F
from torch.autograd import Variable


def process_line(line):
    # parse a tab-separated record; expects at least 6 columns
    columns = line.split('\t')
    if len(columns) < 6:
        return None
    n_corrections = columns[0]
    serial_number = columns[1]
    url = columns[2]
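
The core idea, as a minimal sketch in current PyTorch (the gist itself uses the older Variable API): train a small MLP to map a key to its normalized position in a sorted array, i.e. learn the data's CDF the way a B-Tree maps a key to a page.

import torch
import torch.nn as nn
import torch.nn.functional as F

keys = torch.sort(torch.rand(1000))[0].unsqueeze(1)                      # sorted keys, [1000, 1]
positions = torch.arange(1000, dtype=torch.float32).unsqueeze(1) / 1000.0

model = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    optimizer.zero_grad()
    loss = F.mse_loss(model(keys), positions)   # predicted vs. true position
    loss.backward()
    optimizer.step()

# lookup: predict a position, then search a small window around it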
tokestermw / birnnlm_pytorch.py
Last active May 30, 2020 08:29
Simple example of Bidirectional RNN Language Model in PyTorch. (blog post: https://medium.com/@plusepsilon/the-bidirectional-language-model-1f3961d1fb27)
import torch
import torch.nn as nn
from torch.autograd import Variable
text = ['BOS', 'How', 'are', 'you', 'EOS']
seq_len = len(text)
batch_size = 1
embedding_size = 1
hidden_size = 1
output_size = 1
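
A minimal sketch of the forward pass with the toy sizes above (note: a full bidirectional LM must offset the forward and backward states so position t never conditions on token t; this sketch omits that shift):

import torch
import torch.nn as nn

text = ['BOS', 'How', 'are', 'you', 'EOS']
vocab = {word: i for i, word in enumerate(text)}
inputs = torch.tensor([[vocab[word] for word in text]])      # [batch=1, seq_len=5]

embedding = nn.Embedding(len(vocab), 1)                      # embedding_size = 1
birnn = nn.LSTM(1, 1, bidirectional=True, batch_first=True)  # hidden_size = 1
projection = nn.Linear(2 * 1, len(vocab))                    # forward + backward states

hidden_states, _ = birnn(embedding(inputs))                  # [1, 5, 2]
logits = projection(hidden_states)                           # [1, 5, vocab]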