goodmami /
Created December 17, 2014 07:43
Ngrams, LMs, and Perplexity in AWK and sed
awk -v f="$ngram_count_file"\
'function log10(x){ return log(x)/log(10.0) }
while (getline < f) {
if(/[0-9]+\t[^ ]+$/) { type[1]++; token[1]=token[1]+$1 }
goodmami /
Last active April 8, 2024 16:14
Basic Arc Diagrams

Demonstration of an ArcDiagram layout function with both distance-based and compact level modes. The compact mode works better for orthogonal edges, and can use much less vertical space. The distance mode works better for curved arc edges.

The alice.json dataset is the first line from Lewis Carroll's Alice in Wonderland, parsed with the Stanford Parser.

goodmami /
Last active August 29, 2015 14:03
Using a Python generator's send() method to approximate lookahead, among other uses
def regenerator(gen):
    for x in gen:
        regen = (yield x)
        while regen is not None:
            yield None  # send() also yields something, so don't pull from gen again
            regen = (yield regen)

gen = (i for i in range(10))
regen = regenerator(gen)
v = next(regen)  # 0
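
The preview above stops before showing how a value is pushed back. The following is a minimal sketch of one way the wrapper could be used for single-item lookahead. The `peek` helper is a hypothetical name introduced here for illustration and is not part of the gist; `regenerator` is repeated so the block runs on its own.

```python
def regenerator(gen):
    for x in gen:
        regen = (yield x)
        while regen is not None:
            yield None  # this None is what the caller's send() returns
            regen = (yield regen)


def peek(regen):
    """Hypothetical helper: return the next item without consuming it."""
    value = next(regen)
    regen.send(value)  # push the value back; send() itself returns None
    return value


regen = regenerator(i for i in range(3))

first = next(regen)      # pulls the first item from the wrapped generator
ack = regen.send(first)  # pushes it back; the wrapper answers with None
again = next(regen)      # the pushed-back item is yielded a second time
peeked = peek(regen)     # looks at the following item and pushes it back
rest = list(regen)       # drains whatever is left, including the peeked item
```

Because `send()` always receives a yielded value in return, the wrapper yields a throwaway `None` at that point, and the pushed-back item only reappears on the following `next()` call.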
goodmami /
Last active January 4, 2016 21:18
A dictionary with a user-definable function for handling collisions.
class AccumulationDict(dict):
    def __init__(self, accumulator, *args, **kwargs):
        if not hasattr(accumulator, '__call__'):
            raise TypeError('Accumulator must be a binary function.')
        self.accumulator = accumulator
        self.accumulate(*args, **kwargs)

    def __additem__(self, key, value):
        if key in self:
            self[key] = self.accumulator(self[key], value)
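
The preview is cut off before the `accumulate` method and the rest of the class appear, so the following is only a sketch of the general idea under stated assumptions, not the gist's implementation. The class name `MergingDict` and the method `add` are hypothetical names chosen for this example; the sketch assumes a new key is stored unchanged and a colliding key is combined with the user-supplied binary function.

```python
import operator


class MergingDict(dict):
    """Hypothetical sketch: combine colliding values with a binary function."""

    def __init__(self, accumulator, *args, **kwargs):
        if not callable(accumulator):
            raise TypeError('Accumulator must be a binary function.')
        super().__init__()
        self.accumulator = accumulator
        self.accumulate(*args, **kwargs)

    def add(self, key, value):
        # Assumption: combine on collision, otherwise store the value as-is.
        if key in self:
            self[key] = self.accumulator(self[key], value)
        else:
            self[key] = value

    def accumulate(self, *args, **kwargs):
        # Each positional argument is a mapping or an iterable of pairs.
        for arg in args:
            items = arg.items() if hasattr(arg, 'items') else arg
            for key, value in items:
                self.add(key, value)
        for key, value in kwargs.items():
            self.add(key, value)


# Summing counts: the repeated key 'a' is merged with operator.add.
counts = MergingDict(operator.add, [('a', 1), ('b', 2), ('a', 3)])
counts.add('b', 10)

# Collecting the largest value seen per key instead.
peaks = MergingDict(max, {'x': 5}, x=2, y=7)
```

Passing a different binary function changes the collision policy without touching the class, which is the appeal of making the accumulator a parameter.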
goodmami / logging.bash
Last active May 6, 2024 18:24
Basic logging commands for Linux shell scripts
## Simple logging mechanism for Bash
## Author: Michael Wayne Goodman <[email protected]>
## Thanks: Jul for the idea to add a datestring. See:
## Thanks: @gffhcks for noting that inf() and debug() should be swapped,
## and that critical() used $2 instead of $1