# Z_r (or "averaging") transform functions, based on:
#
# Kenneth W. Church and William A. Gale. 1991. A comparison of the enhanced
# Good-Turing and deleted estimation methods for estimating probabilities of
# English bigrams. Computer Speech and Language 5(1):19--54
#
# Kyle Gorman <[email protected]>
#
# Church and Gale do not say what is to be done about points at the edges. I
# have chosen to average them with respect to only the inward facing frequency,
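As a rough illustration of the averaging transform described in the header above, the sketch below computes Z_r = N_r / (0.5 * (t - q)), where q and t are the nearest lower and higher frequencies with non-zero counts. The function name `z_transform`, the dict-based input, and in particular the edge handling are assumptions: the preview is cut off before the edge rule is fully stated, so the sketch simply divides by the gap to the single inward neighbour, which may differ from the author's code.

```python
def z_transform(freq_of_freqs):
    """Return {r: Z_r} for a dict mapping each frequency r to N_r.

    Interior points use Z_r = N_r / (0.5 * (t - q)), with q and t the
    neighbouring observed frequencies below and above r.
    """
    rs = sorted(freq_of_freqs)
    if len(rs) < 2:
        raise ValueError("need at least two distinct frequencies")
    z = {}
    for i, r in enumerate(rs):
        if i == 0:
            # ASSUMPTION: lowest frequency uses only the gap to its upper neighbour.
            width = float(rs[1] - r)
        elif i == len(rs) - 1:
            # ASSUMPTION: highest frequency uses only the gap to its lower neighbour.
            width = float(r - rs[-2])
        else:
            # Interior point: half the distance between the two neighbours.
            width = 0.5 * (rs[i + 1] - rs[i - 1])
        z[r] = freq_of_freqs[r] / width
    return z


if __name__ == '__main__':
    print(z_transform({1: 10, 2: 4, 5: 3, 10: 1}))
```

Dividing by the local spacing keeps sparse high-frequency counts from looking artificially large, which is the point of the transform; a different edge convention only changes the first and last values.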
#!/usr/bin/env python
# difflib_demo.py
# Kyle Gorman <[email protected]>
from difflib import SequenceMatcher
if __name__ == '__main__':
    from sys import argv
    for file in argv[1:]:
#!/usr/bin/env python
# wagnerfischer.py: Dynamic programming Levenshtein distance function
# Kyle Gorman <[email protected]>
#
# Based on:
#
# Robert A. Wagner and Michael J. Fischer (1974). The string-to-string
# correction problem. Journal of the ACM 21(1):168-173.
#
# The thresholding function was inspired by BSD-licensed code from
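For readers unfamiliar with the Wagner-Fischer recurrence named in the header above, here is a minimal sketch of the dynamic program with unit costs. The function name `levenshtein` is hypothetical, and the thresholding variant mentioned in the header is not reproduced; this is only the basic two-row formulation, not the author's implementation.

```python
def levenshtein(source, target):
    """Edit distance between two sequences with unit-cost operations."""
    # previous[j] holds the distance from the current prefix of `source`
    # (initially empty) to the first j items of `target`.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, 1):
        current = [i]  # deleting all i items of the source prefix
        for j, t in enumerate(target, 1):
            cost = 0 if s == t else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


if __name__ == '__main__':
    print(levenshtein('kitten', 'sitting'))
```

Keeping only two rows reduces memory from O(mn) to O(n) while leaving the O(mn) running time unchanged.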
#!/usr/bin/env python
# ProbDist.py: Two classes for probability distributions and sampling.
# Kyle Gorman <[email protected]>
from math import fsum
from bisect import bisect
from random import random
from collections import defaultdict
class MLProbDist(object):
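The preview ends at the class statement, so the body of `MLProbDist` is not visible. The sketch below shows one way the four imports above could combine into a maximum-likelihood distribution with sampling. The class name `CountSampler` and its methods are hypothetical and are not taken from the author's file.

```python
from bisect import bisect
from collections import defaultdict
from math import fsum
from random import random


class CountSampler(object):
    """Relative-frequency (maximum-likelihood) estimates with sampling."""

    def __init__(self, observations):
        counts = defaultdict(int)
        for obs in observations:
            counts[obs] += 1
        total = fsum(counts.values())
        self.outcomes = list(counts)
        self.probs = dict((o, counts[o] / total) for o in self.outcomes)
        # Running totals of the probabilities, used for inverse-CDF sampling.
        self.cumulative = []
        running = 0.0
        for o in self.outcomes:
            running += self.probs[o]
            self.cumulative.append(running)

    def prob(self, outcome):
        """Estimated probability; unseen outcomes get 0.0."""
        return self.probs.get(outcome, 0.0)

    def sample(self):
        """Draw one outcome in proportion to its estimated probability."""
        i = bisect(self.cumulative, random())
        # Guard against rounding leaving the final total just under 1.0.
        return self.outcomes[min(i, len(self.outcomes) - 1)]
```

Sampling by bisecting the cumulative totals costs O(log n) per draw after an O(n) setup.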
#!/usr/bin/env python
# Knuth-Morris-Pratt demonstration
# Kyle Gorman <[email protected]>
#
# A naive Python implementation of a function that returns the (first) index of
# a sequence in a supersequence is the following:
def subsequence(needle, haystack):
    """
    Naive subsequence indexer; None if not found
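The preview stops inside the naive function, before any Knuth-Morris-Pratt code appears. As a point of comparison, a minimal sketch of a KMP-style indexer with the same contract (first index, or None if absent) might look like the following. The name `kmp_index` and the choice to return 0 for an empty needle are assumptions, not the author's code.

```python
def kmp_index(needle, haystack):
    """First index of `needle` in `haystack`, or None if not found."""
    if not needle:
        return 0  # ASSUMPTION: an empty needle matches at position 0
    # failure[i] is the length of the longest proper prefix of
    # needle[:i + 1] that is also a suffix of it.
    failure = [0] * len(needle)
    k = 0
    for i in range(1, len(needle)):
        while k and needle[i] != needle[k]:
            k = failure[k - 1]
        if needle[i] == needle[k]:
            k += 1
        failure[i] = k
    # Scan the haystack once, falling back through the table on a mismatch.
    k = 0
    for i, item in enumerate(haystack):
        while k and item != needle[k]:
            k = failure[k - 1]
        if item == needle[k]:
            k += 1
        if k == len(needle):
            return i - k + 1
    return None


if __name__ == '__main__':
    print(kmp_index('abc', 'ababcab'))
```

Because the haystack is never rescanned, the search runs in O(n + m) time, versus O(nm) in the worst case for a naive scan.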
File type = "ooTextFile"
Object class = "TextGrid"
xmin = 0
xmax = 3
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
/*
 * Tolerance Principle calculator, based on:
 *
 * C. Yang (2005). On productivity. Language Variation Yearbook 5:333-370.
 *
 * Definition:
 *
 * The number of data points consistent with a rule R is given by N, and the
 * number of exceptions to it by m. By Tolerance, R is productive iff:
 *
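The preview is cut off before the inequality itself. In Yang's work the threshold is commonly stated as N / ln N, so a small sketch of the calculation, written in Python rather than the language of the original file, might look like this. The function names are hypothetical, and the use of a strict inequality is an assumption; some statements of the principle use "at most" instead.

```python
from math import log


def tolerance_threshold(n):
    """Threshold N / ln N for a rule covering n items (n >= 2)."""
    if n < 2:
        raise ValueError("n must be at least 2 so that ln(n) is positive")
    return n / log(n)


def is_productive(n, m):
    """True if m exceptions fall below the threshold for n items.

    ASSUMPTION: strict inequality (m < N / ln N).
    """
    return m < tolerance_threshold(n)


if __name__ == '__main__':
    print(tolerance_threshold(100))
    print(is_productive(100, 21), is_productive(100, 22))
```

For N = 100 the threshold is about 21.7, so under this sketch a rule with 21 exceptions counts as productive and one with 22 does not.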
#!/usr/bin/env python
# point_bisect.py
# Kyle Gorman
#
# I continually use these two patterns in Python for iterables that contain
# continuous values, sorted. Here they are in their full glory.
from bisect import bisect_left
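The preview ends before either of the two patterns is shown, so what they are is not known from this listing. Purely as an illustration of how `bisect_left` is often used with sorted continuous values, the sketch below finds the stored value closest to a query. The function name `nearest` and the tie-breaking rule are assumptions, and this may not correspond to either of the author's patterns.

```python
from bisect import bisect_left


def nearest(sorted_values, x):
    """Return the element of a sorted, non-empty list closest to x."""
    if not sorted_values:
        raise ValueError("sorted_values must not be empty")
    i = bisect_left(sorted_values, x)
    if i == 0:
        return sorted_values[0]    # x is at or below the smallest value
    if i == len(sorted_values):
        return sorted_values[-1]   # x is above the largest value
    before, after = sorted_values[i - 1], sorted_values[i]
    # ASSUMPTION: ties go to the lower neighbour.
    return before if x - before <= after - x else after


if __name__ == '__main__':
    print(nearest([0.0, 1.0, 2.5], 1.2))
```

The lookup is O(log n), which is the main reason to keep such values sorted in the first place.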
/* Copyright (c) 2012 Kyle Gorman
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to
 * deal in the Software without restriction, including without limitation the
 * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
 * sell copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
/* Copyright (c) 2012 Kyle Gorman
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to
 * deal in the Software without restriction, including without limitation the
 * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
 * sell copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in