#!/usr/bin/env python
"""LNRE calculator.
This script computes a number of statistics characterizing LNRE data:
* N: corpus size
* V: vocabulary size
* V(1): the number of _hapax legomena_ (symbols occurring once)
* V(2): the number of _dis legomena_ (symbols occurring twice)
* V/N: vocabulary growth rate
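
The statistics listed above can be sketched in a few lines of Python. This is a minimal illustration, not the script's actual implementation: the function name `lnre_stats`, the whitespace tokenization, and the sample sentence are all assumptions made for the example. The labels follow the docstring.

```python
from collections import Counter


def lnre_stats(tokens):
    """Returns N, V, V(1), V(2), and V/N for an iterable of tokens.

    Hypothetical helper for illustration; not taken from the original script.
    """
    freqs = Counter(tokens)
    # Frequency spectrum: maps a frequency m to the number of types
    # that occur exactly m times.
    spectrum = Counter(freqs.values())
    n = sum(freqs.values())  # N: corpus size (token count)
    v = len(freqs)  # V: vocabulary size (type count)
    return {
        "N": n,
        "V": v,
        "V(1)": spectrum[1],  # hapax legomena
        "V(2)": spectrum[2],  # dis legomena
        "V/N": v / n if n else 0.0,
    }


# Assumed sample input, split on whitespace.
stats = lnre_stats("a rose is a rose is a rose or so they say".split())
print(stats)
```

Counting types first and then counting the counts gives the whole frequency spectrum in one pass, so V(m) for any other m is available from `spectrum` at no extra cost.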
<epsilon> 0
<SOH> 1
<STX> 2
<ETX> 3
<EOT> 4
<ENQ> 5
<ACK> 6
<BEL> 7
<BS> 8
<HT> 9
#!/usr/bin/env python
import fileinput
import nltk
if __name__ == "__main__":
    for line in fileinput.input():
        print(line.rstrip().casefold())
#!/usr/bin/env python
import fileinput
import nltk
if __name__ == "__main__":
    for line in fileinput.input():
        print(" ".join(nltk.word_tokenize(line)))
#!/usr/bin/env python
# What's the nearest word (in Levenshtein distance) to "covfefe"?
import string
# Available from: https://github.com/kylebgorman/EditTransducer
import edit_transducer
# You probably have this file if you're on Linux or Mac OS X.
with open("/usr/share/dict/words") as source: |
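
The preview above is cut off before the search itself. As a rough illustration of the question it asks, the sketch below finds the nearest word with a plain dynamic-programming Levenshtein distance. It deliberately does not use `edit_transducer`, whose API is not shown here; the function names `levenshtein` and `nearest` and the four-word candidate list are assumptions for the example, standing in for the contents of a dictionary file.

```python
def levenshtein(a, b):
    """Computes unit-cost edit distance between strings a and b."""
    # prev[j] holds the distance between the current prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        curr = [i]
        for j, char_b in enumerate(b, 1):
            curr.append(
                min(
                    prev[j] + 1,  # deletion
                    curr[j - 1] + 1,  # insertion
                    prev[j - 1] + (char_a != char_b),  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def nearest(query, lexicon):
    """Returns the first word in lexicon with minimal distance to query."""
    return min(lexicon, key=lambda word: levenshtein(query, word))


# Assumed toy lexicon; a real run would read words from a dictionary file.
candidates = ["coffee", "covered", "coverage", "kerfuffle"]
best = nearest("covfefe", candidates)
print(best, levenshtein("covfefe", best))
```

This scans the lexicon linearly, which is fine for a word list but scales poorly. A transducer-based approach can instead search the whole lexicon in a single composition.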
#!/usr/bin/env python
"""Checks that PyTorch can reach CUDA."""
import sys
import torch
if __name__ == "__main__":
    if not torch.cuda.is_available():
"""Log-odds computations."""
from libc.math cimport log, sqrt
from libc.stdint cimport int64_t
ctypedef int64_t int64 | |
"""English function words.
Sets of English function words, based on
E.O. Selkirk. 1984. Phonology and syntax: The relationship between
sound and structure. Cambridge: MIT Press. (p. 352f.)
The categories are of my own creation.
"""
#!/usr/bin/env python
#
# Constructs resources for Zodiac cipher 408:
#
# * Plaintext and ciphertext FARs
# * Unweighted "key" FSTs and "channel" (hypothesis space) FSTs
# * A textual symbol table for plaintext and ciphertext
#
# Requires: Pynini and OpenFst with the FAR extension. |
// Computes relative error reduction given two percentages.
//
// This computes relative error reduction (RER) given two percentages, the
// "before" and "after" accuracy.
//
// This is given by:
//
//     RER = 1 - (1 - new_accuracy) / (1 - old_accuracy)
//
// To compile: gcc -O3 -std=c99 -o rer rer.c |
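
The formula above can be checked with a short Python sketch. This is an illustration rather than a port of the C program: the function name `relative_error_reduction` is hypothetical, and it assumes accuracies are given as proportions in [0, 1] rather than as percentages.

```python
def relative_error_reduction(old_accuracy, new_accuracy):
    """Computes RER = 1 - (1 - new_accuracy) / (1 - old_accuracy).

    Assumes both accuracies are proportions in [0, 1]; divide percentages
    by 100 before calling.
    """
    if not 0.0 <= old_accuracy < 1.0:
        # An old accuracy of exactly 1 leaves no error to reduce.
        raise ValueError("old_accuracy must be in [0, 1)")
    if not 0.0 <= new_accuracy <= 1.0:
        raise ValueError("new_accuracy must be in [0, 1]")
    return 1.0 - (1.0 - new_accuracy) / (1.0 - old_accuracy)


# Going from 90% to 95% accuracy halves the error rate (10% to 5%),
# so the result is about 0.5.
rer = relative_error_reduction(0.90, 0.95)
print(f"{rer:.4f}")
```

A negative result means the new system makes more errors than the old one.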