Kyle Gorman kylebgorman

@ctw
ctw / wagnerfischerpp.py
Last active February 8, 2016 17:44 — forked from kylebgorman/wagnerfischerpp.py
Wagner-Fischer Levenshtein distance, now with a means to generate all possible optimal alignments.
#!/usr/bin/env python
#
# Copyright (c) 2013-2014 Kyle Gorman
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
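The preview above stops inside the license header, so the gist's own code is not shown here. As a rough illustration of the Wagner-Fischer dynamic program named in the description, here is a minimal sketch that computes only the distance (it does not enumerate optimal alignments, which the gist says it can do). The function name `levenshtein` and the unit costs are assumptions made for this example, not taken from the gist.

```python
def levenshtein(source, target):
    """Return the Levenshtein distance between two sequences (unit costs assumed)."""
    # previous[j] is the cost of editing the source prefix processed so far
    # into the first j symbols of target; row 0 is all insertions.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        # Column 0: deleting the first i source symbols.
        current = [i]
        for j, t in enumerate(target, start=1):
            current.append(
                min(
                    previous[j] + 1,             # deletion
                    current[j - 1] + 1,          # insertion
                    previous[j - 1] + (s != t),  # substitution (free on a match)
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Keeping only two rows uses memory proportional to the target length; recovering alignments would instead require the full table or backpointers.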
@kylebgorman
kylebgorman / WALS131A.R
Last active January 22, 2023 16:27
Tests the hypothesis that vigesimal (base-20) number systems are more common at tropical latitudes
#!/usr/bin/env Rscript
# WALS131A.R
# Kyle Gorman
#
# Tests the hypothesis that vigesimal (base-20) number systems are more common
# at tropical latitudes. Thanks to Richard Sproat for suggesting this
# hypothesis.
#
# The data is read directly from WALS (#131A):
#
@kylebgorman
kylebgorman / rer.c
Last active July 13, 2021 12:29
Relative error reduction calculation
// Computes relative error reduction given two percentages.
//
// This computes relative error reduction (RER) given two percentages, the
// "before" and "after" accuracy.
//
// This is given by:
//
// RER = 1 - (1 - new_accuracy) / (1 - old_accuracy)
//
// To compile: gcc -O3 -std=c99 -o rer rer.c
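The RER formula in the comment above can be sketched in a few lines. This is a hypothetical Python rendering, not the gist's C code; the function name `relative_error_reduction` is invented here, and it assumes accuracies are given as proportions in [0, 1] rather than as percentages out of 100.

```python
def relative_error_reduction(old_accuracy, new_accuracy):
    """Compute RER = 1 - (1 - new_accuracy) / (1 - old_accuracy).

    Assumes both accuracies are proportions in [0, 1]; divide percentages
    by 100 before calling.
    """
    old_error = 1.0 - old_accuracy
    if old_error <= 0.0:
        # With no "before" error there is nothing to reduce, and the
        # formula would divide by zero.
        raise ValueError("old_accuracy must be less than 1")
    return 1.0 - (1.0 - new_accuracy) / old_error


if __name__ == "__main__":
    # Going from 90% to 95% accuracy halves the error rate (10% to 5%).
    print(round(relative_error_reduction(0.90, 0.95), 6))
```

A negative result means the error rate went up rather than down.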
@kylebgorman
kylebgorman / z408.py
Last active May 25, 2021 14:46
Zodiac cipher 408: freestanding Python 3 script for converting the plaintext and ciphertext to OpenFst assets
#!/usr/bin/env python
#
# Constructs resources for Zodiac cipher 408:
#
# * Plaintext and ciphertext FARs
# * Unweighted "key" FSTs and "channel" (hypothesis space) FSTs
# * A textual symbol table for plaintext and ciphertext
#
# Requires: Pynini and OpenFst with the FAR extension.
@kylebgorman
kylebgorman / function_words.py
Created June 22, 2018 18:57
Function words
"""English function words.
Sets of English function words, based on
E.O. Selkirk. 1984. Phonology and syntax: The relationship between
sound and structure. Cambridge: MIT Press. (p. 352f.)
The categories are of my own creation.
"""
@kylebgorman
kylebgorman / log_odds.pyx
Last active February 6, 2024 19:49
Log-odds calculations
"""Log-odds computations."""
from libc.math cimport log, sqrt
from libc.stdint cimport int64_t
ctypedef int64_t int64
@kylebgorman
kylebgorman / torch_cuda.py
Last active October 8, 2019 15:58
Checks that PyTorch can reach CUDA
#!/usr/bin/env python
"""Checks that PyTorch can reach CUDA."""
import sys
import torch
if __name__ == "__main__":
    if not torch.cuda.is_available():
@kylebgorman
kylebgorman / covfefe.py
Created June 8, 2019 19:11
Which English word is most similar to "covfefe"?
#!/usr/bin/env python
# What's the nearest word (in Levenshtein distance) to "covfefe"?
import string
# Available from: https://github.com/kylebgorman/EditTransducer
import edit_transducer
# You probably have this file if you're on Linux or Mac OS X.
with open("/usr/share/dict/words") as source:
@kylebgorman
kylebgorman / word_tokenize.py
Last active June 18, 2023 05:35
Applies NLTK PTB tokenizer to input text
#!/usr/bin/env python
import fileinput
import nltk
if __name__ == "__main__":
    for line in fileinput.input():
        print(" ".join(nltk.word_tokenize(line)))
@kylebgorman
kylebgorman / casefold.py
Created July 10, 2019 12:15
Applies Unicode case folding to input data
#!/usr/bin/env python
import fileinput
if __name__ == "__main__":
    for line in fileinput.input():
        # Strip trailing whitespace (including the newline), then case-fold.
        print(line.rstrip().casefold())
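To show why case folding differs from plain lowercasing, here is a small sketch using the same `str.casefold` call as the script. The helper name `fold` is introduced only for this example.

```python
def fold(line):
    """Strip trailing whitespace and apply Unicode case folding."""
    return line.rstrip().casefold()


if __name__ == "__main__":
    # German sharp s folds to "ss", so both spellings compare equal after
    # folding; str.lower() alone leaves the sharp s in place.
    print(fold("Straße\n") == fold("STRASSE\n"))
    print("Straße".lower() == "STRASSE".lower())
```

Folding is intended for caseless comparison, so the folded text is not always a natural lowercase spelling.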