David Marx dmarx

Engineer / Machine Learning Researcher interested in deep learning, probabilistic ML, generative models, multi-modal SSL, visual understanding, geometric

558 followers · 371 following

CoreWeave, EleutherAI
Seattle, WA
http://dmarx.github.io
@digthatdata.bsky.social
@DigThatData


dmarx / undirected_to_directed_bipartite_projection.R

Last active January 3, 2017 20:54

Novel (?) technique for inferring a directed bipartite projection from an undirected bipartite graph. Code is for pedagogical demonstration to accompany the article here: http://dmarx.github.io/map-of-reddit-by-active-users/

	library(igraph)

	# Experiment parameters
	n=10 # Primary class (i.e. subreddits)
	m=100 # Secondary class (i.e. users)
	threshold = .5 # edge threshold

	######################################

	set.seed(123)  # fix the RNG seed so the simulated graph is reproducible

dmarx / election_rage.py

Last active November 8, 2016 22:10

Get a pulse on the election-relevant conversation on reddit by streaming sentences containing some relevant terms

	import praw
	import string
	import re
	import nltk

	r = praw.Reddit('anger fuel comment monitor, by /u/shaggorama')
	targets = ['hillary', 'trump', 'hilary', 'election']
	punc_pat = re.compile('['+string.punctuation+']')

	blacklist = ['AutoModerator', '2016VoteBot']
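
The preview above stops before the filtering logic, so the following is only a minimal sketch of how sentences mentioning a target term might be picked out of a comment body. It reuses `targets` and `punc_pat` from the gist; the function name `matching_sentences` and the regex-based sentence split (standing in for whatever `nltk` tokenizer the full gist uses) are assumptions for illustration.

```python
import re
import string

targets = ['hillary', 'trump', 'hilary', 'election']
# re.escape guards against punctuation characters that are special inside [...]
punc_pat = re.compile('[' + re.escape(string.punctuation) + ']')


def matching_sentences(text, targets=targets):
    """Return the sentences in `text` that mention any target term.

    Hypothetical helper: sentence splitting here is a naive regex,
    not the nltk tokenizer the original gist imports.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    hits = []
    for sentence in sentences:
        # Strip punctuation and lowercase before comparing whole tokens.
        tokens = punc_pat.sub('', sentence).lower().split()
        if any(term in tokens for term in targets):
            hits.append(sentence)
    return hits
```

Matching on whole tokens rather than substrings avoids false hits such as "trumpet", at the cost of missing possessives like "Trump's" once the apostrophe is stripped.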

dmarx / dynamic_edgelist_demo.r

Created September 2, 2016 20:57

Given an edgelist of a dynamic graph in the form of (timestamp, source, target) triplets, construct a compressed edgelist in the form (onset, terminus, source, target)

	#' Try to construct a dynamic graph object from an edgelist with sequential timestamps, to use render.d3movie per:
	#' https://rpubs.com/kateto/netviz
	#'

	#install.packages('statnet')
	#install.packages("ndtv")

	library(igraph)
	library(statnet)
	library(ndtv)
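
The compression described above can be sketched in Python as follows. This is not the gist's R implementation. It assumes integer timestamps, treats an edge seen at consecutive timestamps as one continuous spell, and uses a half-open terminus (last observed timestamp plus one). The function name `compress_edgelist` is hypothetical.

```python
def compress_edgelist(edges):
    """Collapse (timestamp, source, target) triplets into
    (onset, terminus, source, target) spells.

    Assumptions (not taken from the original gist): timestamps are
    integers, and terminus = last consecutive timestamp + 1.
    """
    # Group the observed timestamps by edge.
    times_by_edge = {}
    for t, src, dst in edges:
        times_by_edge.setdefault((src, dst), set()).add(t)

    spells = []
    for (src, dst), times in times_by_edge.items():
        ordered = sorted(times)
        onset = prev = ordered[0]
        for t in ordered[1:]:
            if t != prev + 1:
                # Gap found: close the current spell and start a new one.
                spells.append((onset, prev + 1, src, dst))
                onset = t
            prev = t
        spells.append((onset, prev + 1, src, dst))
    return sorted(spells)
```

For example, an edge observed at timestamps 1, 2, 3 and again at 5 becomes two spells, `(1, 4, ...)` and `(5, 6, ...)`.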

dmarx / binomial_algorithm_benchmarking.py

Created February 29, 2016 19:37

Experiments in homebrewing a binomial CDF (or an approximation to it) to enable a Poisson hypothesis test for "burst" scoring in an Oracle environment. Unavailable functions: factorial, nCr, dbinom, pbinom, dnorm, pnorm.

	'''
	Binomial coefficient algorithm tests

	Doing this in python just for basic prototyping, but we'll need to ultimately
	port this to oracle.
	'''

	from __future__ import division
	import timeit
	import math
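
One way to get a binomial CDF without `factorial` or `nCr` is to build each probability mass term from the previous one. The sketch below illustrates that recurrence and is not necessarily the approach benchmarked in the full gist. The function name `binom_cdf` is hypothetical.

```python
def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), using only arithmetic and pow.

    Uses the recurrence
        P(X = i + 1) = P(X = i) * (n - i) / (i + 1) * p / (1 - p),
    starting from P(X = 0) = (1 - p) ** n.
    """
    if k < 0:
        return 0.0
    if k >= n or p <= 0:
        return 1.0
    if p >= 1:
        return 0.0  # all mass sits at X = n, and k < n here

    q = 1.0 - p
    pmf = q ** n  # P(X = 0); may underflow to 0 for very large n
    cdf = pmf
    for i in range(k):
        pmf *= (n - i) / (i + 1) * (p / q)
        cdf += pmf
    return min(cdf, 1.0)  # clamp accumulated rounding error
```

Because each step needs only multiplication and division, the same loop should port to a SQL or PL/SQL setting, though the underflow of `(1 - p) ** n` for large `n` would need handling there as well (for instance by working in logs).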

dmarx / TPOT.export.py

Last active December 18, 2015 17:07

Anticipated form of TPOT.export after refactoring is completed


	def export(self, output_file_name):
	    """Exports the current optimized pipeline as Python code.

	    Parameters
	    ----------
	    output_file_name: string
	        String containing the path and file name of the desired output file

	    Returns

dmarx / KNNc.py

Last active December 17, 2015 21:14

Untested (and definitely non-working) demo for proposed class template for TPOT operator modularization (https://github.com/rhiever/tpot)

	# tpot/operators/KNNc.py

	from base import LearnerOperator
	from sklearn.neighbors import KNeighborsClassifier
	import pandas as pd

	class KNNc(LearnerOperator):
	    def __init__(self):
	        super(KNNc, self).__init__(
	            func = KNeighborsClassifier,

dmarx / naive_bayes_demo.R

Last active November 5, 2015 23:43

	#' # Constructing a naive Bayes classifier from scratch
	#'
	#' ## Background: Bayes' rule
	#'
	#' Recall Bayes' rule:
	#'
	#' $$P(\theta|X) = \frac{P(X|\theta)P(\theta)}{P(X)}$$
	#'
	#' Each component of this formula has a name:
	#'

dmarx / package installations.R

Last active October 29, 2018 21:17

	install.packages('caret')
	install.packages('ccd')
	install.packages('d3Network')
	install.packages('data.table')
	install.packages('dplyr')
	install.packages('DMwR')
	install.packages('e1071')
	install.packages('ergm')
	install.packages('ff')
	install.packages('foreach')

dmarx / denseMatrix_to_sparseMatrix.R

Last active March 30, 2017 19:25

Code snippet demonstrating a vectorized method for transforming a dense matrix to a sparse matrix in R


	#' This is the right way
	dense_to_sparse = function(m, binary=FALSE){
	  library(Matrix)
	  xy = which(abs(m)>0, arr.ind=TRUE)
	  if(binary){
	    dense = sparseMatrix(i=xy[,1], j=xy[,2], x=1, dims=dim(m) )
	  } else {
	    dense = sparseMatrix(i=xy[,1], j=xy[,2], x=m[xy], dims=dim(m) )
	  }

dmarx / Extreme Value Arrivals.Rmd

Created October 8, 2015 14:08

	#' Arrival rate of new max values given some generating distribution
	#'
	#' To do: Wrapper function to perform repeated simulations from the same generating distribution.
	#' After each iteration, convert output into pairs of (time last max observed, wait time to next max).
	#' Spaghetti plot to try to infer a conditional distribution of wait time to next max, given when
	#' last max observed. Some kind of extreme value distribution, probably Gumbel or Frechet or something...
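
The to-do item above can be sketched in Python (rather than the gist's R) as follows. The helper names `record_arrivals` and `wait_pairs` are hypothetical, and the choice of generating distribution is left to the caller.

```python
import random


def record_arrivals(n_steps, draw, seed=None):
    """Return the time indices at which a new running maximum appears.

    `draw` is a caller-supplied function taking a random.Random
    instance and returning one sample from the generating distribution.
    """
    rng = random.Random(seed)
    current_max = float('-inf')
    arrivals = []
    for t in range(n_steps):
        x = draw(rng)
        if x > current_max:
            current_max = x
            arrivals.append(t)
    return arrivals


def wait_pairs(arrivals):
    """Convert arrival times into (time last max observed, wait to next max)."""
    return [(a, b - a) for a, b in zip(arrivals, arrivals[1:])]
```

A call such as `wait_pairs(record_arrivals(1000, lambda rng: rng.gauss(0, 1), seed=123))` gives the pairs for one simulated run. Repeating it over many seeds would supply the data for the spaghetti plot described above.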

	```{r}
	set.seed(123)
