Brendan O'Connor (brendano)
13150 samples ~ 0.4% se, though more in reality
File Function Line
50.3 % treetm.jl cgsIterPath 383 qnewLL,pnewLL = proposePath!(newpath, mm, V,di,word, first ? nothing : oldpath, :simulate)
44.1 % treetm.jl cgsIterPath 384 qoldLL,poldLL = proposePath!(oldpath, mm, V,di,word, first ? nothing : oldpath, :evaluate)
4.1 % treetm.jl getindex 79 getindex(c::ClassCountTable, k) = c.counts[k]
2.1 % treetm.jl incrementFullpath! 122 x = b ? x.right : x.left
22.0 % treetm.jl proposePath! 337 w0 = (n0.counts[wordID] + betaHere/V - on0) / (n0.counts.total + betaHere - int(on_cur))
18.9 % treetm.jl proposePath! 338 w1 = (n1.counts[wordID] + betaHere/V - on1) / (n1.counts.total + betaHere - int(on_cur))
5.2 % treetm.jl proposePath! 339 p0 = (cur_docnode.left.count + mm.gammaConc/2 - on0)
6.5 % treetm.jl proposePath! 340 p1 = (cur_do
File Function Line
48.0 % /Users/brendano/Desktop/hier_lda/code/treetm.jl cgsIterPath 389 qnewLL,pnewLL = proposePath!(newpath, mm, V,di,word, first ? nothing : oldpath, :simulate)
41.6 % /Users/brendano/Desktop/hier_lda/code/treetm.jl cgsIterPath 390 qoldLL,poldLL = proposePath!(oldpath, mm, V,di,word, first ? nothing : oldpath, :evaluate)
1.1 % /Users/brendano/Desktop/hier_lda/code/treetm.jl cgsIterPath 391 logA = pnewLL-poldLL + qoldLL-qnewLL # (pnew-qnew) - (pold-qold)
3.0 % /Users/brendano/Desktop/hier_lda/code/treetm.jl cgsIterPath 402 incrementFullpath!(mm.cTopicWord, newpath, word, +1)
3.1 % /Users/brendano/Desktop/hier_lda/code/treetm.jl cgsIterPath 408 incrementFullpath!(mm.cTopicWord, oldpath, word, -1)
2.7 % /Users/brendano/Desktop/hier_lda/code/treetm.jl getindex 82 getindex(c::ClassCountTable, k) = c.c
julia> include("gotree.jl")
Array{CountTrie,1}
accept rate = 850654/850654 = 1.000
elapsed time: 33.986743839 seconds (5921922372 bytes allocated)
.ITER 1
accept rate = 809118/850654 = 0.951
elapsed time: 36.449559574 seconds (6175698392 bytes allocated)
.ITER 2
accept rate = 796254/850654 = 0.936
elapsed time: 30.21721326 seconds (6166426280 bytes allocated)
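The accept rates above come from a Metropolis-Hastings step; the acceptance ratio in the profiled code is `logA = pnewLL-poldLL + qoldLL-qnewLL`, i.e. (pnew-qnew) - (pold-qold) in log space. A minimal sketch of that accept decision (function and argument names are mine, not from the gist):

```python
import math
import random

def mh_accept(p_new, p_old, q_new, q_old, rng=random.random):
    """Metropolis-Hastings accept test in log space.

    p_* are log target densities of the new/old states; q_* are log
    proposal densities.  log A = (p_new - q_new) - (p_old - q_old);
    accept with probability min(1, exp(log A)).
    """
    log_a = (p_new - q_new) - (p_old - q_old)
    return log_a >= 0 or rng() < math.exp(log_a)

# A proposal that strictly improves the target (symmetric proposal)
# is always accepted:
accepted = mh_accept(p_new=-10.0, p_old=-12.0, q_new=0.0, q_old=0.0)
```

Counting accepts over all proposals gives exactly the `accept rate = accepted/total` lines logged above.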
# http://mikelove.wordpress.com/2013/11/07/empirical-bayes/
# Stein's estimation rule and its competitors - an empirical Bayes approach
# B Efron, C Morris, Journal of the American Statistical, 1973
n <- 1000
sigma.means <- 5
means <- rnorm(n, 0, sigma.means)
# sigma.y <- 5
library(manipulate)
manipulate({
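The R snippet above sets up the simulation from the Efron-Morris paper: true means drawn from N(0, 5), observed with noise. A rough Python sketch of the James-Stein estimator on the same setup (variable names are mine; shrinkage is toward 0, matching the zero-mean prior above):

```python
import random

def james_stein(y, sigma2):
    """James-Stein estimator shrinking y toward 0, for y_i ~ N(theta_i, sigma2):
    theta_hat = (1 - (n - 2) * sigma2 / sum(y_i**2)) * y."""
    n = len(y)
    shrink = 1.0 - (n - 2) * sigma2 / sum(v * v for v in y)
    return [shrink * v for v in y]

# Same simulation as the R snippet: true means ~ N(0, 5), noise sd 5.
random.seed(1)
n, sigma_means, sigma_y = 1000, 5.0, 5.0
means = [random.gauss(0.0, sigma_means) for _ in range(n)]
y = [random.gauss(m, sigma_y) for m in means]

# Total squared error of the raw observations (MLE) vs. the shrunk estimates.
mle_err = sum((a - b) ** 2 for a, b in zip(y, means))
js_err = sum((a - b) ** 2 for a, b in zip(james_stein(y, sigma_y ** 2), means))
```

With equal prior and noise variance, the shrinkage factor lands near the oracle value of 0.5, and the shrunk estimates dominate the raw observations in total squared error, which is Stein's point.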
package nlp;
import java.io.IOException;
import java.io.StringReader;
import edu.stanford.nlp.io.IOUtils;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.trees.LabeledScoredTreeFactory;
import edu.stanford.nlp.trees.PennTreeReader;

In Thomas Bass's The Predictors, there is a scene where they are talking to a potential investor who kept wanting to talk about their earlier complexity-theory work:

"Marrin wanted chaos and fractals, and we were offering engineering and statistics."

I remember reading that and thinking: wait, but isn't that why I'm reading this book, and why the book is supposed to be interesting? I stopped reading it at some point after that.

This is an edu.stanford.nlp.trees.Tree:
tree.setSpans(); // these are 0-indexed, inclusive-inclusive
tree.indexSpans(); // yup, this saves to a different place; apparently 0-indexed, inclusive-exclusive
tree.indexLeaves(); // these are 1-indexed (!!); the Stanford coref code uses them heavily
===
[Update July 25... and after https://gist.github.com/leondz/6082658 ]
OK never mind the questions about cross-validation versus a smaller eval split and all that.
We evaluated our tagger (current release, version 0.3.2),
trained and evaluated on the same splits as the GATE tagger
(from http://gate.ac.uk/wiki/twitter-postagger.html and specifically twitie-tagger.zip)
and it gets 90.4% accuracy (significantly different from the GATE results).
brendano / morpha.py
Python wrapper for morpha (English lemmatizer)
"""
Wrapper around morpha from
http://www.informatics.sussex.ac.uk/research/groups/nlp/carroll/morph.html
Vaguely follows edu.stanford.nlp.Morphology except we implement with a pipe.
Hacky. Would be nice to use cython/swig/ctypes to directly embed morpha.yy.c
as a Python extension.
TODO: compare linguistic quality to the lemmatizer in Python's "pattern" package.
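The docstring above describes wrapping morpha through a pipe. A generic sketch of that pattern — a long-lived child process, one line in, one line out — with an upper-casing Python child standing in for morpha, since morpha is an external binary that may not be installed:

```python
import subprocess
import sys

# Stand-in child process: reads lines, upper-cases them, flushes each one.
# (morpha would take this role in the real wrapper.)
UPCASE_CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)

class PipeWrapper:
    """Hold a long-lived child process open and exchange one line per call."""

    def __init__(self, cmd):
        self.proc = subprocess.Popen(
            cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

    def process(self, line):
        self.proc.stdin.write(line.rstrip("\n") + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()

wrapper = PipeWrapper([sys.executable, "-c", UPCASE_CHILD])
result = wrapper.process("walked")
wrapper.close()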
MCMC convergence diagnostics
https://github.com/brendano/conplot
~/myutil % grep totalLL log|awk '{print $2}' | conplot
[conplot ASCII plot: totalLL trace climbing from -2.93e+06 to a plateau near -2.87e+06; alignment lost in extraction]
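The grep/awk pipeline above extracts the totalLL trace for plotting. The same extraction in Python, assuming a hypothetical `totalLL <value> ...` log-line format (the actual format in the log file isn't shown in the gist):

```python
def totalll_trace(log_lines):
    """Python analogue of `grep totalLL log | awk '{print $2}'`: pull the
    second whitespace-separated field from lines mentioning totalLL."""
    return [float(line.split()[1]) for line in log_lines if "totalLL" in line]

# Hypothetical log lines in a "totalLL <value> ..." format:
log = [
    "totalLL -2.93e+06 iter 1",
    "accept rate = 809118/850654 = 0.951",
    "totalLL -2.87e+06 iter 2",
]
trace = totalll_trace(log)
```

A rising, flattening trace like the one plotted above is the usual rough sign that a collapsed Gibbs sampler has burned in.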