David Marx dmarx

Engineer / Machine Learning Researcher interested in deep learning, probabilistic ML, generative models, multi-modal SSL, visual understanding, geometric

558 followers · 371 following

CoreWeave, EleutherAI
Seattle, WA
http://dmarx.github.io
@digthatdata.bsky.social
@DigThatData

dmarx / find_best_cutoff.r

Created August 9, 2017 12:11

Demonstration of how to construct a bespoke regression to determine the optimal cutoff value for constructing a categorical variable for a logistic regression

	# Finding best cut-off for constructing a categorical variable
	# logistic regression

	data(iris)

	x0 = iris[iris$Species != 'setosa',]
	plot(x0, col=x0$Species)

	# Keep things simple for this demo
	form = "is_virginica ~ Petal.Length + Petal.Width"
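The preview stops before the cutoff search itself. As a rough sketch only, and not necessarily the gist's approach, one simple way to choose a cutoff is a grid search: dichotomize a continuous predictor at each candidate value, refit the logistic regression, and keep the cutoff with the lowest AIC. The choice of `Sepal.Length` as the variable to dichotomize, the quantile grid, and the names `candidates`, `aic_for_cutoff`, and `best_cutoff` are assumptions made for illustration.

```r
data(iris)
x0 <- iris[iris$Species != "setosa", ]
x0$is_virginica <- as.integer(x0$Species == "virginica")

# Candidate cutoffs (assumption): interior quantiles, so both categories are populated
candidates <- unique(quantile(x0$Sepal.Length, probs = seq(0.1, 0.9, by = 0.05)))

# AIC of the logistic regression with the dichotomized predictor added
aic_for_cutoff <- function(cutoff) {
  x0$above <- as.integer(x0$Sepal.Length > cutoff)  # modifies a local copy only
  fit <- suppressWarnings(
    glm(is_virginica ~ Petal.Length + Petal.Width + above,
        family = binomial, data = x0)
  )
  AIC(fit)
}

aics <- sapply(candidates, aic_for_cutoff)
best_cutoff <- candidates[which.min(aics)]
```

A grid search like this is easy to read but treats the cutoff as fixed once chosen, so the reported standard errors do not account for the search.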

dmarx / mcglm.r

Last active July 3, 2017 19:27

Playing with `mcglm` for a multivariate Poisson model. Not sure how to extract the inter-DV covariance.

	# From the docs for mcglm::ahs

	require(mcglm)
	data(ahs, package="mcglm")
	form1 <- Ndoc ~ income + age
	form2 <- Nndoc ~ income + age

	Z0 <- mc_id(ahs)
	fit.ahs <- mcglm(linear_pred = c(form1, form2),
	                 matrix_pred = list(Z0, Z0), link = c("log","log"),

dmarx / demo.r

Last active June 28, 2017 23:37

Demonstration of a method for evaluating the performance of a Poisson regression by calculating the bootstrapped accuracy subject to a range of error thresholds

	##########################################################
	# Get data from the poisson demo at: #
	# https://stats.idre.ucla.edu/r/dae/poisson-regression/ #
	##########################################################

	p <- read.csv("https://stats.idre.ucla.edu/stat/data/poisson_sim.csv")
	p <- within(p, {
	  prog <- factor(prog, levels=1:3, labels=c("General", "Academic",
	                                            "Vocational"))
	  id <- factor(id)
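The preview ends before the evaluation step. The following is a minimal sketch of one way to read "bootstrapped accuracy subject to a range of error thresholds": refit the Poisson regression on bootstrap resamples and, on the out-of-bag rows, record the share of rounded predictions that fall within each threshold of the observed count. The simulated data, the column names `x` and `y`, the thresholds, and the number of resamples are all assumptions; the gist itself uses the UCLA data loaded above.

```r
set.seed(1)
n <- 200
dat <- data.frame(x = rnorm(n, 50, 10))           # simulated predictor (assumption)
dat$y <- rpois(n, exp(-5 + 0.09 * dat$x))         # simulated counts (assumption)

thresholds <- 0:3                                 # allowed absolute errors (assumption)

boot_accuracy <- replicate(200, {
  idx <- sample.int(n, replace = TRUE)            # bootstrap sample
  oob <- setdiff(seq_len(n), idx)                 # out-of-bag rows for evaluation
  fit <- glm(y ~ x, family = poisson, data = dat[idx, ])
  pred <- predict(fit, newdata = dat[oob, ], type = "response")
  err <- abs(round(pred) - dat$y[oob])
  sapply(thresholds, function(th) mean(err <= th))
})

# One row per threshold, one column per resample; average over resamples
acc <- rowMeans(boot_accuracy)
names(acc) <- paste0("within_", thresholds)
```

Reporting accuracy across several thresholds shows how quickly the model's predictions become "close enough" as the tolerance is relaxed.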

dmarx / Makefile

Created June 21, 2017 20:31

Minimal working example for a Stack Overflow question

	DIRS := $(filter dir%, $(shell ls))

	foo_sources := $(wildcard */source/foo.a)
	foo_targets_prt := $(patsubst %.a, %.b, $(foo_sources))
	foo_targets := $(subst source,target, $(foo_targets_prt))

	bar_sources := $(wildcard */source/bar.a)
	bar_x := $(patsubst %/bar.a, %/Y.a, $(bar_sources))
	bar_y := $(patsubst %/bar.a, %/Z.a, $(bar_sources))
	bar_targets := $(bar_x) $(bar_y)

dmarx / simple regression to measure effect of a regime change.r

Last active June 6, 2017 01:12

Simple regression with interaction terms to measure effect of a regime change on the predictors. Implementation of https://stats.stackexchange.com/a/99432/8451

	#' ---
	#' title: "Regression for quantifying a regime change"
	#' author: "David Marx"
	#' date: "June 5, 2017"
	#' output: html_document
	#' ---

	#' There are two time points of interest. We want to test the hypothesis that the regression
	#' coefficients changed after these time points, respectively. We will accomplish this by introducing
	#' dummy variables to denote whether we are before or after a particular change point. This approach
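The dummy-variable idea described above can be sketched on simulated data. This is an illustration under assumptions, not the gist's code: it uses a single change point rather than two, and the names `time_idx`, `after`, and the simulated coefficients are invented for the example.

```r
set.seed(123)
n <- 200
time_idx <- seq_len(n)
x <- rnorm(n)
after <- as.integer(time_idx > 100)  # dummy: 1 after the (assumed) change point

# Simulated truth: the intercept shifts by 0.5 and the slope by 1.5 after the change
y <- 1 + 2 * x + after * (0.5 + 1.5 * x) + rnorm(n, sd = 0.5)

# x * after expands to x + after + x:after; the x:after term is the slope change
fit <- lm(y ~ x * after)
coefs <- summary(fit)$coefficients
slope_shift <- coefs["x:after", "Estimate"]
slope_shift_p <- coefs["x:after", "Pr(>|t|)"]
```

A second change point would add a second dummy and its own interaction term, for example `y ~ x * (after1 + after2)`.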

dmarx / Arxiv Archive.md

Last active April 18, 2019 23:03

Machine learning articles I want to read or have read, mostly arxiv.org articles discussing recent advancements in deep learning.

To Read:

| Publication Date | Article | Notes |
|---|---|---|
| 2016 | End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures | Cited in multi-task sciERC (2018, below) |
| 2018-10-11 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | |
| | Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction | Probably a lot of useful citations in here, not sure we need the coreference stuff. SciERC datasets: http://nlp.cs.washington.edu/sciIE/; Code: https://bitbucket.org/luanyi/scierc/src/master/; Pretrained (best) models: NER, Coref, Relation |
| 2017-08-08 | [Structural | |

dmarx / chinese restuarant process.R

Created April 3, 2017 22:56

Demonstration of a Chinese Restaurant Process, with an optional parameter to push the tables towards a uniform distribution rather than a Dirichlet one (i.e. preferential attachment)

	# chinese restaurant process
	chinese_restaurant = function(n, uniform=FALSE){
	  tables = c(1) # running counts of people at tables. Start by seating first person at their own table
	  U = runif(n)
	  for (i in 2:n){
	    if(U[i]<1/i){
	      tables = c(tables, 1)
	    } else {
	      p = tables/(i) # sum(tables) = i-1
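The preview is cut off inside the loop. For reference, here is a self-contained sketch of the standard Chinese Restaurant Process, written independently of the gist. The function name `crp_sketch` and the concentration parameter `alpha` are assumptions; with `alpha = 1` the new-table probability is `1/i`, matching the condition in the preview. The gist's `uniform` option is not reproduced because its implementation is not visible here.

```r
crp_sketch <- function(n, alpha = 1) {
  tables <- c(1)  # first customer sits alone at the first table
  if (n < 2) return(tables)
  for (i in 2:n) {
    # Customer i opens a new table with probability alpha / (i - 1 + alpha)
    if (runif(1) < alpha / (i - 1 + alpha)) {
      tables <- c(tables, 1)
    } else {
      # Otherwise join an existing table with probability proportional to its size
      j <- sample.int(length(tables), 1, prob = tables)
      tables[j] <- tables[j] + 1
    }
  }
  tables
}

set.seed(1)
seating <- crp_sketch(100)
```

Because larger tables are more likely to attract the next customer, the table sizes typically end up highly unequal, which is the preferential-attachment behaviour mentioned in the description.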

dmarx / edge_weight_null_distribution.r

Created January 30, 2017 01:37

Simulate null hypothesis distribution for Serrano's disparity filter

	generate_distances = function(k){
	  u_k = c(0,sort(runif(k-1)),1)
	  u_k[-1] - u_k[-(k+1)]
	}

	iters=1e4

	d = c(replicate(iters, generate_distances(2)))
	plot(density(d), ylim=c(0,5))
	#abline(v=mean(d), lty=2)
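As a sanity check on the simulation, the gaps produced by `k - 1` uniform cut points on [0, 1] have a known marginal tail probability, P(gap >= x) = (1 - x)^(k - 1), which is the same expression the disparity filter uses for an edge's significance. The sketch below compares the simulated tail with that closed form; the choice of `k = 5`, the test point `0.5`, and the helper name `tail_prob` are assumptions for illustration.

```r
set.seed(42)
generate_distances <- function(k) {
  u_k <- c(0, sort(runif(k - 1)), 1)
  diff(u_k)  # the k gaps between consecutive cut points
}

k <- 5
sims <- replicate(1e4, generate_distances(k))  # k x 1e4 matrix, one column per draw
first_gap <- sims[1, ]                         # one gap per draw, so draws are independent

# Closed-form tail probability of a single gap
tail_prob <- function(x, k) (1 - x)^(k - 1)

empirical <- mean(first_gap >= 0.5)
theoretical <- tail_prob(0.5, k)
```

Only the first gap of each draw is used because the `k` gaps within a draw sum to one and are therefore dependent.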

dmarx / disparity_filter_dt.r

Last active December 19, 2017 12:00

A modification of Alessandro Bessi's R implementation of Serrano's Disparity Filter that uses the data.table package, giving orders-of-magnitude faster calculation (1.3 seconds for 543k nodes). Needs to be turned into a pull request or package fork. Original code: https://github.com/alessandrobessi/disparityfilter

	#' Extract the backbone of a weighted network using the disparity filter
	#'
	#' Given a weighted graph, \code{backbone} identifies the 'backbone structure'
	#' of the graph, using the disparity filter algorithm by Serrano et al. (2009).
	#' @param graph The input graph.
	#' @param weights A numeric vector of edge weights, which defaults to
	#' \code{E(graph)$weight}.
	#' @param directed The directedness of the graph, which defaults to the result
	#' of \code{\link[igraph]{is_directed}}.
	#' @param alpha The significance level under which to preserve the edges, which

dmarx / venn_intersection_text.R

Last active January 12, 2017 22:15

Rough method for drawing labels in intersections of a Venn diagram drawn using R's `venneuler` package

	#install.packages('venneuler')
	library(venneuler)

	venn_intersection_text = function(venn, classes, label, adjustment=0.5, xadj=0, yadj=0 ){
	  # fits a line between the centers of two classes and draws label text at the midpoint of that line + adjustment
	  xv = adjustment*venn$centers[classes[1],1] + (1-adjustment)*venn$centers[classes[2],1] + xadj
	  yv = adjustment*venn$centers[classes[1],2] + (1-adjustment)*venn$centers[classes[2],2] + yadj
	  text(x=xv, y=yv, labels=label)
	}
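The label position is a weighted average of the two class centers, shifted by the optional offsets. The sketch below isolates that arithmetic in base R so it can be checked without drawing anything. The `centers` matrix is a hypothetical stand-in for `venn$centers`, and the helper name `label_position` is an assumption.

```r
# Hypothetical stand-in for venn$centers: one row per class, columns x and y
centers <- rbind(A = c(0, 0), B = c(2, 4))

label_position <- function(centers, classes, adjustment = 0.5, xadj = 0, yadj = 0) {
  a <- centers[classes[1], ]
  b <- centers[classes[2], ]
  # adjustment = 0.5 gives the midpoint; 1 gives the first center, 0 the second
  pos <- adjustment * a + (1 - adjustment) * b + c(xadj, yadj)
  unname(pos)
}

midpoint <- label_position(centers, c("A", "B"))                   # 1 2
at_first <- label_position(centers, c("A", "B"), adjustment = 1)   # 0 0
```

Values of `adjustment` between 0 and 1 slide the label along the line between the two centers, which helps when the circles differ a lot in size.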
