def viz_example(ex):
    """Pretty-print a masked-LM tf.train.Example: tokens in blocks of 10,
    with the masked-LM target words aligned beneath their positions.
    (`v` is the id -> token vocabulary list, defined elsewhere in this gist.)"""
    toks = [v[idx] for idx in ex.features.feature["input_ids"].int64_list.value if idx > 0]
    masked_words = [v[idx] for idx in ex.features.feature["masked_lm_ids"].int64_list.value if idx > 0]
    masked_dict = dict(zip(ex.features.feature["masked_lm_positions"].int64_list.value, masked_words))
    toks_out = ["%10s" % tok for tok in toks]
    mask_out = ["%10s" % masked_dict[i] if i in masked_dict else " " * 10 for i in range(len(toks))]
    for block_start in range(0, len(toks), 10):
        print(" ".join(toks_out[block_start:block_start + 10]))
        print(" ".join(mask_out[block_start:block_start + 10]))
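The same aligned-blocks formatting can be exercised without TensorFlow. The sketch below is a toy stand-in for `viz_example`: plain Python lists replace the `tf.train.Example` feature lists (an assumption for illustration), and the function returns the formatted string instead of printing it.

```python
def viz_rows(toks, masked_positions, masked_words, width=10, block=10):
    """Format tokens in fixed-width blocks, with masked-LM target words
    right-aligned beneath the positions they belong to.

    Toy stand-in for viz_example: `toks`, `masked_positions`, and
    `masked_words` are plain lists rather than tf.train.Example features.
    """
    masked = dict(zip(masked_positions, masked_words))
    toks_out = ["%*s" % (width, t) for t in toks]
    mask_out = ["%*s" % (width, masked[i]) if i in masked else " " * width
                for i in range(len(toks))]
    lines = []
    for start in range(0, len(toks), block):
        lines.append(" ".join(toks_out[start:start + block]))
        lines.append(" ".join(mask_out[start:start + block]))
    return "\n".join(lines)
```

Because each column is padded to the same fixed width on both rows, a target word always lines up under its masked position.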
{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {},
      "outputs": [],
      "source": [
        "%matplotlib inline\n",
        "import matplotlib.pyplot as plt\n",
// WebPPL code for a proof-of-concept model of
// question-guided visual search.
var images = [0, 1]

// deterministic p(description | image)
var descriptions = [
  {blocked_intent: true, missing_object: false},
  {blocked_intent: false, missing_object: true}
]
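The inference this model supports can be sketched in a few lines of Python (kept in Python for consistency with the other snippets in this gist): since p(description | image) is deterministic, inferring the image from an observed description feature is a simple Bayesian inversion under an assumed uniform prior over images.

```python
# Rough Python analogue of the WebPPL sketch above: invert the deterministic
# description model to infer which image produced an observed answer.
images = [0, 1]

# deterministic p(description | image), mirroring the WebPPL `descriptions`
descriptions = [
    {"blocked_intent": True,  "missing_object": False},
    {"blocked_intent": False, "missing_object": True},
]

def posterior(feature, value):
    """p(image | description[feature] == value), uniform prior over images.

    The uniform prior is an assumption for illustration; the WebPPL model
    may weight images differently.
    """
    likelihoods = [1.0 if descriptions[i][feature] == value else 0.0
                   for i in images]
    z = sum(likelihoods)
    return [l / z for l in likelihoods] if z else [0.0] * len(images)
```

With these two images the features are fully diagnostic, so each observation pins down a single image.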
from argparse import ArgumentParser
from collections import namedtuple
from pathlib import Path
import re

import pandas as pd

# CHAT main line: "*SPK:<utterance>", followed by a \x15-delimited
# start_end time-alignment bullet in milliseconds.
UTT_RE = re.compile(r"\*([A-Z]+):\s*(.+)\s*\x15(\d+)_(\d+)\x15$")
# %mor tier entry: part-of-speech|lemma, e.g. "pro|you" or "v|want".
TAG_RE = re.compile(r"([a-z:]+)\|(\w+)")
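These two patterns are easiest to understand on a concrete line. The sketch below applies them to a hypothetical CHAT-style example (the utterance and timestamps are made up for illustration):

```python
import re

UTT_RE = re.compile(r"\*([A-Z]+):\s*(.+)\s*\x15(\d+)_(\d+)\x15$")
TAG_RE = re.compile(r"([a-z:]+)\|(\w+)")

# Hypothetical CHAT main line: speaker code, utterance, then a
# \x15-delimited start_end time-alignment bullet (milliseconds).
line = "*MOT:\tyou want it ?\x157060_8080\x15"
m = UTT_RE.match(line)
speaker, utterance, start_ms, end_ms = m.groups()

# TAG_RE picks apart a %mor tier string like "pro|you v|want pro|it".
tags = TAG_RE.findall("pro|you v|want pro|it")
```

`UTT_RE` captures the speaker code, the utterance text, and the start/end times; `TAG_RE.findall` yields (part-of-speech, lemma) pairs from a morphology tier.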
"""
Calculate statistics on verb-argument pairings given a parsed corpus
of CoNLL-U-formatted files. Part-of-speech tags and dependency heads+labels
are required.
"""
from collections import Counter
from pathlib import Path
import re
import sys
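The core of such a statistic can be sketched in a few lines: walk each sentence, and for every token whose head is tagged `VERB`, count the (head lemma, dependency relation) pair. This is a minimal sketch of the kind of pairing the docstring describes, not the script's actual logic, and the exact pairing scheme is an assumption.

```python
from collections import Counter

def verb_arg_counts(conllu_text):
    """Count (verb lemma, dependency relation) pairs in CoNLL-U text:
    for each token whose head is a VERB, increment (head lemma, deprel).

    A minimal sketch of the statistic described above; multiword-token
    and empty-node lines are skipped via the isdigit() check.
    """
    counts = Counter()
    for sent in conllu_text.strip().split("\n\n"):
        rows = [line.split("\t") for line in sent.split("\n")
                if line and not line.startswith("#")]
        # CoNLL-U columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        by_id = {r[0]: r for r in rows if r[0].isdigit()}
        for r in by_id.values():
            head = by_id.get(r[6])
            if head is not None and head[3] == "VERB":
                counts[(head[2], r[7])] += 1
    return counts

# Toy one-sentence example (tab-separated CoNLL-U columns):
toy = "\n".join([
    "1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\teats\teat\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tapples\tapple\tNOUN\t_\t_\t2\tobj\t_\t_",
])
```

On the toy sentence this yields one `("eat", "nsubj")` and one `("eat", "obj")` pairing.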
\begin{tikzpicture}
\node (example) [draw,rectangle,align=center] at (0, 0) {Learning instance\\$w_1, \dots, w_N \to A$};
\node (dec-learn) [draw,rectangle,align=center] at (8, -3) {\textbf{Learn:}\\Infer meanings for $w_1, \dots, w_N$};
\node (lexicon) [draw,rectangle,align=center] at (0, -3) {Lexicon $\Lambda$\\$\text{raise} \to \texttt{S/N/PP} \lambda c\,x.\, c \land dir(e) = up \land move(e,x)$\\$\text{drop} \to \texttt{S/N/PP} \lambda c\,x.\, c \land dir(e) = down \land move(e,x)$\\$\text{onto} \to \texttt{PP} \lambda x.contact(x,e.patient)$\\$\ldots$};
\node (dec-compress) [draw,rectangle,align=center] at (0, -6) {\textbf{Compress:}\\Abstract away regularities in the lexicon};
\node (dec-bridge) [draw,rectangle,align=center] at (-8, -3) {\textbf{Bridge:}\\Derive new syntactic sub-categories};
\begin{scope}[shift={(-6,-7.2)}]
\Tree [.$\lambda$ [.$c$ ] [.$x$ ] [.$\land$ [.$c$ ] [.$=$ [.$dir(e)$ ] [.$up$ ] ] [.$move$ [.$e$ ] [.$x$ ] ] ] ]
\begin{scope}[xshift=130pt]
// Distributions over probability values: uniform vs. left-skewed support.
var ps = Categorical({vs: [0.2, 0.4, 0.6, 0.8]})
var ps_skew_left = Categorical({vs: [0.1, 0.2, 0.3, 0.8]})
// Distributions over payoff values: uniform vs. left-skewed support.
var vs = Categorical({vs: [2, 4, 6, 8]})
var vs_skew_left = Categorical({vs: [1, 2, 3, 10]})

// Attach a readable name to a model function for display purposes.
var Model = function(name, f) {
  Object.defineProperty(f, 'name', {value: name})
  return f;
};
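The effect of the skewed supports is easy to check numerically. The sketch below (in Python for consistency with the other snippets) computes the mean of a categorical distribution, defaulting to uniform weights, which is how I read the `Categorical({vs: ...})` calls above that omit `ps`:

```python
def categorical_mean(vs, ps=None):
    """Mean of a categorical distribution over values `vs` with weights `ps`;
    uniform weights when `ps` is omitted (an assumption mirroring the
    WebPPL calls above that pass only `vs`)."""
    if ps is None:
        ps = [1.0 / len(vs)] * len(vs)
    return sum(p * v for p, v in zip(ps, vs))

# The "skew_left" supports concentrate mass at the low end:
# uniform over [0.2, 0.4, 0.6, 0.8] has mean ~0.5, while the
# left-skewed [0.1, 0.2, 0.3, 0.8] has mean ~0.35.
```

Three of the four support points sit below the mean in each `skew_left` variant, which is what the naming suggests.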
First stab at decoding P01 subject data onto InferSent embeddings (dim 4096).
The performance is promising, given that I haven't yet touched either the imaging data or the encodings since they were produced. (The regression maps ~20,000 dimensions to 4,096; it can definitely improve with some preprocessing.)
Compare with the original results from the problem set later in this gist.
INFO:__main__:Loaded encodings of size (384, 4096). | |
INFO:__main__:Loaded subject data/P01 data. | |
INFO:__main__:Trained classifier for subject data/P01. | |
Fold 0: min 4.0 mean 96.5 med 41.0 max 350.0 | |
Fold 1: min 0.0 mean 131.9 med 68.0 max 323.0 | |
Fold 2: min 2.0 mean 144.3 med 129.0 max 366.0 |
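The per-fold lines look like rank statistics: for each test item, the rank of the true sentence encoding among the 384 candidates when sorted by distance to the regression's predicted embedding. If so, the metric can be sketched as below; the actual distance function and tie-breaking used here are assumptions.

```python
def rank_of_true(pred, candidates, true_idx):
    """Rank (0 = best) of the true candidate when all candidates are sorted
    by squared Euclidean distance to the predicted vector.

    A sketch of the rank metric the fold lines appear to report; squared
    Euclidean distance is an assumed choice (cosine is also common).
    """
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    dists = [d2(pred, c) for c in candidates]
    order = sorted(range(len(candidates)), key=dists.__getitem__)
    return order.index(true_idx)
```

Under this reading, a fold's "min 4.0 / med 41.0" would mean the best-decoded item's true encoding ranked 5th of 384 and the median item's ranked 42nd.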