Bart van Merriënboer (bartvm)
# Reverse unpacking nightmares
def f(a, b, c=3, *d, **e):
    print(a, b, c, d, e)

f(*(1, 2), **{'c': 3, 'foo': 'bar'})  # 1 2 3 () {'foo': 'bar'}
f(1, 2, 3)                            # 1 2 3 () {}
f(1, 2, 3, 4)                         # 1 2 3 (4,) {}
f(*(1, 2, 3, 4))                      # 1 2 3 (4,) {}
f(1, 2, **{'foo': 'bar'})             # 1 2 3 () {'foo': 'bar'}
# Goal: Valid, clean (DNRY!) Python with as few flake8 errors as possible,
# while preferring explicit statements over naming conventions.
# 1. Function arguments are replaced with subtrees
#    1.1 If the primal returns a value, we annotate it with a name, which is
#        then used in the adjoint function signature
#    1.2 If a function has a return value, its adjoint receives exactly one
#        more argument, which is assumed to be the gradient with respect to
#        the output
#    1.3 The names of the primal and adjoint must match (else we don't know
#        which adjoint belongs to which primal); see the sketch below
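# A rough hand-written illustration of conventions 1.1-1.3 above (the names
# and the exact code are hypothetical, not what the transformer emits): the
# primal's return value is given a name, and the adjoint takes exactly one
# extra argument holding the gradient with respect to that output.
#
# def square(x):
#     y = x * x
#     return y
#
# def d_square(x, d_y):
#     # y = x * x, so dy/dx = 2 * x; the chain rule multiplies by d_y
#     d_x = 2 * x * d_y
#     return d_x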
from ast import parse, NodeTransformer
from inspect import getsource
from textwrap import dedent
from collections import namedtuple
# We fill in the templates in two ways:
# 1. We insert strings
# 2. We insert a placeholder variable name, which is parsed into the AST before
# being replaced by a subtree
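# A minimal sketch of approach 2 above (a placeholder name parsed into the
# template's AST and then swapped for a subtree); `fill_template` is an
# illustrative helper, not the actual template machinery used in this file.
from ast import copy_location, fix_missing_locations

def fill_template(template, **replacements):
    class ReplacePlaceholders(NodeTransformer):
        def visit_Name(self, node):
            # Replace any placeholder Name node with the supplied subtree
            if node.id in replacements:
                return copy_location(replacements[node.id], node)
            return node
    return fix_missing_locations(ReplacePlaceholders().visit(parse(template)))

# e.g. fill the placeholder `value` with the expression subtree for `a + b`:
# fill_template('result = value', value=parse('a + b', mode='eval').body)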
import re
import sys
from bitarray import bitarray
from daryq import daryq
from collections import Counter
with open(sys.argv[1]) as f:
    text = f.read()
#!/usr/bin/env bash
# Usage: msub-batch.sh -l walltime=73:00:00 [-w 43200] -- -l nodes=1:gpus=2
# Automatically submits a series of jobs that are dependent on each other.
# By default it submits jobs of 24 hours, but a different interval can be
# given using the -w flag (in seconds). Each job can look at the MOAB_JOBID
# and MOAB_DEPEND variables to see what its own ID is and whether it has
# dependencies.
MAX_RUNTIME=$((24 * 60 * 60))
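
The comments above say that each submitted job can inspect MOAB_JOBID and MOAB_DEPEND; a hypothetical Python job payload (the logic here is illustrative, not part of the script) might use them like this:

import os

# Set by the scheduler for jobs submitted through msub-batch.sh
job_id = os.environ.get('MOAB_JOBID')
depends_on = os.environ.get('MOAB_DEPEND')

if depends_on:
    # This job was chained after another one, e.g. resume from a checkpoint
    print('job', job_id, 'resuming after', depends_on)
else:
    print('job', job_id, 'is the first in the chain')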
import os
import weakref
import logging
import atexit
import threading
import queue
import itertools
_threads_queues = weakref.WeakKeyDictionary()
_shutdown = False

Myia (Theano 2.0)

Automatic differentiation vs. symbolic

In the literature the term automatic differentiation (AD) is reserved for a specific technique in which the gradient of a program is calculated by creating an adjoint program, which performs the gradient computation for the given program. Note that this adjoint program includes all control flow statements.

There are two approaches to implementing AD: source code transformation (SCT) and operator overloading (OO). With source code transformation we generate the adjoint program in the host language, e.g. given a Python function we manipulate the abstract syntax tree (AST) directly in order to create a new Python function which performs the gradient computation. Operator overloading, on the other hand, overloads each operator so that it appends an entry to a tape (a Wengert list). Once the function exits, the gradient is calculated by traversing the tape in reverse order and applying the gradient operators.
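To make the tape idea concrete, here is a minimal sketch of the operator-overloading approach in plain Python (the Var class and grad helper are illustrative, not Myia's or Theano's API): each overloaded operator records a closure on a shared tape, and the gradient is obtained by replaying the tape in reverse.

class Var:
    def __init__(self, value, tape=None):
        self.value = value
        self.grad = 0.0
        self.tape = [] if tape is None else tape

    def __add__(self, other):
        out = Var(self.value + other.value, self.tape)
        def backward():
            self.grad += out.grad
            other.grad += out.grad
        self.tape.append(backward)
        return out

    def __mul__(self, other):
        out = Var(self.value * other.value, self.tape)
        def backward():
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad
        self.tape.append(backward)
        return out

def grad(f, *args):
    tape = []
    inputs = [Var(a, tape) for a in args]
    out = f(*inputs)
    out.grad = 1.0
    for backward in reversed(tape):  # replay the Wengert list in reverse
        backward()
    return [v.grad for v in inputs]

print(grad(lambda x, y: x * y + x, 2.0, 3.0))  # [4.0, 2.0]

The same gradients could instead be produced by the SCT route described above, by transforming the function's AST into an explicit adjoint function.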

Theano does not employ AD but "[a highly optimized form of

#include <cstdio>
#include <thread>
#include "tbb/flow_graph.h"
#include "TH.h"
#include <array>
#include <tuple>
// This is an op that gets added to the TBB graph
class Cadd {
public:
local nn = require 'nn'

-- This container wraps a module and flattens the first N dimensions before
-- unflattening them again (in this case time and batches)
Flatten, parent = torch.class('nn.Flatten', 'nn.Container')

function Flatten:__init(flattenDims)
   self.flattenDims = flattenDims
   parent.__init(self)
end
#define LUA_LIB
#include "lua.h"
#include "lauxlib.h"
/* createtable(narr, nrec): return a new table with space pre-allocated for
 * narr array slots and nrec hash slots (both optional, defaulting to 0). */
static int createtable_create(lua_State *L) {
  lua_createtable(L, (int)luaL_optinteger(L, 1, 0), (int)luaL_optinteger(L, 2, 0));
  return 1;
}
static const luaL_Reg createtablelib[] = {