Gabriel Huang gabrielhuang

gabrielhuang / toy_transformer.py
Created June 19, 2025 21:02
Toy character-level transformer implementing MQA/MHA attention with RoPE embeddings
import torch
from torch import nn
import torch.nn.functional as F
from torch.nn.modules import transformer
embedding_size = 16
vocab = "abcdefghijklmnopqrstuvwxyz .!?'\"\n$" # $ is also EOS
vocab_size = len(vocab)
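
The preview stops before showing how characters become token ids. A minimal sketch of one way to do it, assuming a plain index-into-`vocab` mapping: the names `char_to_id`, `id_to_char`, `encode`, and `decode` are hypothetical and do not come from the gist, and dropping out-of-vocabulary characters is an assumption made here for simplicity.

```python
vocab = "abcdefghijklmnopqrstuvwxyz .!?'\"\n$"  # $ doubles as EOS, as in the gist
EOS = "$"

# Hypothetical lookup tables; the gist preview does not show its own mapping.
char_to_id = {ch: i for i, ch in enumerate(vocab)}
id_to_char = {i: ch for ch, i in char_to_id.items()}


def encode(text, add_eos=True):
    """Map text to token ids, lowercasing and skipping out-of-vocabulary characters."""
    ids = [char_to_id[ch] for ch in text.lower() if ch in char_to_id]
    if add_eos:
        ids.append(char_to_id[EOS])
    return ids


def decode(ids):
    """Map token ids back to text, stopping at the first EOS token."""
    chars = []
    for i in ids:
        ch = id_to_char[i]
        if ch == EOS:
            break
        chars.append(ch)
    return "".join(chars)
```

Round-tripping `decode(encode("Hello, world!"))` loses the capital letter and the comma, since neither is in the vocabulary.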
gabrielhuang / grobid-dockerfile
Created March 8, 2023 18:27
GROBID dockerfile
FROM grobid/grobid:0.7.2
RUN apt-get update && \
    apt-get install openjdk-8-jdk -y --no-install-recommends && \
    apt-get autoremove -y --purge && \
    apt-get clean -y && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
    rm -f /etc/legal /etc/motd
WORKDIR /
gabrielhuang / count_haiku_syllables.py
Last active October 28, 2022 02:58
Count haiku syllables using cmudict, falling back to the syllables library
import cmudict # for syllables
import syllables
import re
whitespace = re.compile(r'[\s,.?!/=();]+')
cmudict_cached = cmudict.dict()
def lookup_word(word_s):
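
The preview cuts off at `lookup_word`, so its body is unknown. The dictionary-then-fallback idea in the description might be sketched as follows, under stated assumptions: `TOY_PRONUNCIATIONS` is a hypothetical stand-in for `cmudict.dict()` with illustrative entries that have not been checked against the real dictionary, and `heuristic_syllables` is a crude vowel-run counter standing in for the syllables library. The sketch relies on ARPAbet vowel phonemes carrying a stress digit, so counting them gives the syllable count.

```python
import re

# Hypothetical stand-in for cmudict.dict(): word -> list of ARPAbet pronunciations.
# Entries are illustrative only and not verified against the real dictionary.
TOY_PRONUNCIATIONS = {
    "haiku": [["HH", "AY1", "K", "UW0"]],
    "silent": [["S", "AY1", "L", "AH0", "N", "T"]],
    "pond": [["P", "AA1", "N", "D"]],
}

whitespace = re.compile(r'[\s,.?!/=();]+')  # same splitter as in the gist


def heuristic_syllables(word):
    """Crude fallback: count runs of vowels (stand-in for the syllables library)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def count_word(word, pronunciations=TOY_PRONUNCIATIONS):
    """Count syllables from the first pronunciation, else use the heuristic."""
    prons = pronunciations.get(word.lower())
    if prons:
        # Vowel phonemes end in a stress digit (0, 1, or 2); one vowel per syllable.
        return sum(ph[-1].isdigit() for ph in prons[0])
    return heuristic_syllables(word)


def count_line(line, pronunciations=TOY_PRONUNCIATIONS):
    """Sum syllable counts over the words of one line."""
    words = [w for w in whitespace.split(line) if w]
    return sum(count_word(w, pronunciations) for w in words)
```

Checking a haiku then amounts to comparing `count_line` on each of its three lines against 5, 7, and 5.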
import os, sys
sys.path.append(os.getcwd())
import time
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np
import sklearn.datasets