@mjwillson
mjwillson / ann.rb
Last active December 29, 2015 16:09
ann -- ultra-basic console-based multiclass text annotation tool
#!/usr/bin/env ruby
require 'optparse'
OPTIONS = {}
PARSER = OptionParser.new do |opts|
  opts.banner = "Usage: #{$0} [OPTIONS] INPUT_FILE [HOTKEY OUTPUT_FILE]..."
  opts.separator(<<END
#{$0} -- ultra-basic console-based multiclass text annotation tool
mjwillson / ngrams_via_striding.py
Last active August 1, 2017 04:19
Matrix of sliding window ngrams without any copying via numpy striding tricks
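from numpy.lib.stride_tricks import as_strided

def ngrams_via_striding(array, order):
    itemsize = array.itemsize
    # Only 1-D, contiguous input: stepping one item at a time must be valid.
    assert array.strides == (itemsize,)
    # Row i is a view of array[i:i + order]; rows overlap, so nothing is copied.
    return as_strided(array, (max(array.size + 1 - order, 0), order), (itemsize, itemsize))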
In [71]: a = numpy.arange(10)
In [72]: ngrams_via_striding(a, 4)
Out[72]:
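The "without any copying" claim can be checked directly. The sketch below repeats the function from the gist and then uses `numpy.shares_memory` to confirm that the n-gram matrix is a view onto the input buffer. The variable names `a` and `grams` are illustrative and not part of the gist.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def ngrams_via_striding(array, order):
    itemsize = array.itemsize
    # Only 1-D, contiguous input is supported.
    assert array.strides == (itemsize,)
    n_rows = max(array.size + 1 - order, 0)
    # Both axes advance by one item, so row i starts at array[i].
    return as_strided(array, (n_rows, order), (itemsize, itemsize))


a = np.arange(10)
grams = ngrams_via_striding(a, 4)

# One row per window position: 10 + 1 - 4 = 7 rows of 4 items each.
print(grams.shape)

# The view reuses the memory of `a`; no n-gram data was copied.
print(np.shares_memory(a, grams))
```

Because the rows overlap in memory, writing to one element of `grams` would change every row that contains it, so the result is best treated as read-only.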
mjwillson / iterable.py
Last active August 29, 2015 14:27
Decorate a generator function (or other iterator-returning function) as a multi-shot iterable. A fix for many Python gotchas relating to use of one-shot iterators
class iterable(object):
    """Decorates a generator function (or any other iterator-returning
    function) as something which implements the iterable protocol and
    can be safely passed to other code which may iterate over it
    multiple times.
    Usage:
    @iterable
    def foo():
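The preview above is cut off before the implementation, so the following is only a minimal sketch of the idea, not the gist's actual code. It assumes that calling the decorated function should return an object whose `__iter__` re-invokes the original function on every pass. The names `multishot`, `_MultiShot`, and `countdown` are hypothetical.

```python
import functools


class _MultiShot(object):
    """Holds a function and its arguments; calls it anew for each iteration."""

    def __init__(self, func, args, kwargs):
        self._func = func
        self._args = args
        self._kwargs = kwargs

    def __iter__(self):
        # A fresh iterator per pass, so one pass cannot exhaust the next.
        return iter(self._func(*self._args, **self._kwargs))


def multishot(func):
    """Hypothetical decorator: makes an iterator-returning function multi-shot."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return _MultiShot(func, args, kwargs)

    return wrapper


@multishot
def countdown(n):
    while n > 0:
        yield n
        n -= 1


numbers = countdown(3)
first_pass = list(numbers)
second_pass = list(numbers)  # still full: the generator was re-created

# For contrast, a bare generator is empty on the second pass.
plain = (i for i in (3, 2, 1))
plain_first = list(plain)
plain_second = list(plain)
```

The trade-off in this sketch is that the wrapped function runs again on each pass, so any side effects or expensive work inside it are repeated.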
mjwillson / gist:675cc0259e4291d97104
Created September 7, 2015 17:55 — forked from benanne/gist:1759022
Theano AdvancedSubtensor memory leak
import theano.tensor as T
import theano
import numpy as np
import gc

def freemem():
    gc.collect()
    gc.collect()
    gc.collect()
    return theano.sandbox.cuda.cuda_ndarray.cuda_ndarray.mem_info()[0] / 1024**2