dantonnoriega’s gists

dantonnoriega / dark_widgetframe.R

Created January 9, 2019 04:27

my attempt to create a widgetframe with a dark background. hackish. seems to shift xaringan presentation oddly.

	# remove any old tmp files
	old_tmp_files <- list.files(pattern = '^tmp', include.dirs = TRUE, full.names = TRUE)
	invisible(unlink(old_tmp_files, recursive = TRUE))

	# create a dark widget frame via work around
	dark_widgetframe <-
	  function(widget, background = '#666666FF', width = '100%', height = 420) {
	    file = tempfile(pattern = "tmp", tmpdir = '.', fileext = ".html")
	    selfcontained = FALSE
	    libdir = NULL

dantonnoriega / dirichletreg-greta-imultilogit-ex.R

Last active January 16, 2019 23:07

translate the stan code from https://arxiv.org/pdf/1808.06399.pdf into greta code (https://greta-dev.github.io/greta/index.html) but use the new imultilogit() function instead of simplex_mat(); about a 15% speed increase.

	# recreate the code from https://arxiv.org/pdf/1808.06399.pdf using greta (https://greta-dev.github.io/greta/)
	library("DirichletReg")
	Bld <- BloodSamples
	Bld <- na.omit(Bld)
	Bld$Smp <- DR_data(Bld[, 1:4])

	# using greta
	# !! requires the development version to run!
	# devtools::install_github("greta-dev/greta@dev")
	# convert data to matrix then greta data

dantonnoriega / remote-ssh-plus-local-cluster-future-example.R

Last active January 26, 2021 17:54

Example setting up a mixed remote / local cluster using the `future` package. Includes simple examples of how to properly execute plan to maximize cores. Note that this does not show how important it is to have the same R version AND package versions across all nodes.

	# set up ---------------------------
	library(furrr)

	# use multiple clusters
	ssh_username <- 'drn'
	remote_ssh_configs <- c('a', 'b', 'e') # names for remote server (found in ~/.ssh/config e.g. Host a)
	local_comp <- Sys.info()[["nodename"]] # get local computer name

	# build cluster ----------------------------------------------------------------
	system(command = "ps -axc | grep ssh | awk '{print $1}' | sort -u | xargs kill")

dantonnoriega / dirichletreg-greta-ex.R

Last active October 17, 2018 22:43

translate the stan code from https://arxiv.org/pdf/1808.06399.pdf into greta code (https://greta-dev.github.io/greta/index.html)

	# recreate the code from https://arxiv.org/pdf/1808.06399.pdf using greta (https://greta-dev.github.io/greta/)
	library("DirichletReg")
	Bld <- BloodSamples
	Bld <- na.omit(Bld)
	Bld$Smp <- DR_data(Bld[, 1:4])

	# simplex function. applies simplex to each row of a matrix.
	# HUGE speed gains using matrix math vs a for loop!
	simplex_mat <- function(x){
	  exp_x <- exp(x)
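
The row-wise simplex idea described in the comments above can be sketched in plain base R, outside of greta. This is only an illustration, not the gist's actual function body (the preview is cut off); the name `softmax_rows` is hypothetical, and the max-subtraction step is an assumption added here for numerical stability.

```r
# Hypothetical helper: map each row of a numeric matrix onto the simplex
# (non-negative entries that sum to 1) without an explicit for loop.
softmax_rows <- function(x) {
  # subtract each row's max before exponentiating (assumption: added for stability)
  shifted <- x - apply(x, 1, max)
  exp_x <- exp(shifted)
  # rowSums() is recycled down the columns, so each row is divided by its own sum
  exp_x / rowSums(exp_x)
}

x <- matrix(c(0, 1, 2,
              3, 3, 3), nrow = 2, byrow = TRUE)
p <- softmax_rows(x)
rowSums(p)  # each row sums to 1
```

The vectorised division is what replaces the per-row loop: R recycles the length-`nrow` vector of row sums against the matrix in column-major order, so every element is divided by the sum of its own row.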

dantonnoriega / dirichletreg-stan-ex.R

Created September 9, 2018 22:50

recreate the stan code from https://arxiv.org/pdf/1808.06399.pdf

	# stan code from https://arxiv.org/pdf/1808.06399.pdf
	library("DirichletReg")
	Bld <- BloodSamples
	Bld <- na.omit(Bld)
	Bld$Smp <- DR_data(Bld[, 1:4])

	stan_code <- '
	data {
	  int<lower=1> N; // total number of observations
	  int<lower=2> ncolY; // number of categories

dantonnoriega / deep_loops_in_stan.R

Last active August 16, 2018 19:03 — forked from khakieconomics/deep_loops_in_stan.R

Write your deep loops in Stan, not R. added a quick intro on how the simulations fill the matrix

	library(tidyverse)
	library(rstan)

	## HOW THE SIMULATION LOOP BELOW WORKS -- ignore the shocks for now --------------
	# quick example
	I = 3 # individuals
	M = 5 # "months"
	S = 7 # sims
	mat <- matrix(NA, I*M, S) # M rows (months) will be filled in at a time for each individual i across all S columns (simulations); records for individual i will populate rows ((i - 1)*M + 1):(i*M)
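
The row-block indexing described in that comment can be checked with a tiny base R loop. This is a sketch only: the values written into the matrix are placeholders (each block is filled with the individual's index), not the simulated quantities from the gist.

```r
I <- 3  # individuals
M <- 5  # "months"
S <- 7  # sims
mat <- matrix(NA_real_, I * M, S)

for (i in seq_len(I)) {
  # block of M consecutive rows that belongs to individual i
  rows <- ((i - 1) * M + 1):(i * M)
  # placeholder values (assumption): fill the whole block with the id i
  mat[rows, ] <- i
}

mat[, 1]  # individual ids stacked in blocks of M rows
```

Individual 1 owns rows 1 to 5, individual 2 rows 6 to 10, and individual 3 rows 11 to 15, so no block overlaps and no row is left unfilled.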

dantonnoriega / tweedie-simulations-with-dispersion-estimate.R

Created April 16, 2018 07:25

I wanted to understand how to simulate counts from a tweedie distribution using fitted mu after using gam but didn't get how to estimate the dispersion parameter, phi. had to dig through code (stats::summary.glm) and through some papers to verify. looks good!

	# inspired by https://stats.stackexchange.com/questions/174121/can-a-model-for-non-negative-data-with-clumping-at-zeros-tweedie-glm-zero-infl
	# additions by Danton Noriega
	library(statmod)
	library(tweedie)
	library(mgcv)

	# generate fake mu (poisson count rates)
	set.seed(1789)
	x <- seq(1,100, by = .1)
	mutrue <- exp(-1+x/25)
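
As a rough illustration of the dispersion estimate mentioned in the description, the sketch below computes the Pearson-type estimate of phi by hand and compares it with what `summary.glm` reports. To stay within base R it uses a Gamma GLM, which is the Tweedie case with variance power p = 2, as a stand-in; the simulated data, the variable names, and the choice of p are assumptions for this example and are not taken from the gist.

```r
set.seed(1789)
n <- 500
x <- runif(n)
mu_true <- exp(1 + x)

# Gamma draws with mean mu_true and variance phi_true * mu_true^2
phi_true <- 0.5
y <- rgamma(n, shape = 1 / phi_true, scale = mu_true * phi_true)

fit <- glm(y ~ x, family = Gamma(link = "log"))
mu_hat <- fitted(fit)

# Pearson estimate: squared residuals scaled by the variance function
# V(mu) = mu^p, summed and divided by the residual degrees of freedom
p <- 2  # assumption: Gamma case; a Tweedie fit would use its own power
phi_pearson <- sum((y - mu_hat)^2 / mu_hat^p) / df.residual(fit)

all.equal(phi_pearson, summary(fit)$dispersion)
```

The same formula applies to other variance powers by changing `p` together with the fitted family, though how closely it matches a given package's reported scale depends on which estimator that package uses.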

dantonnoriega / data-table-favorites.R

Last active August 15, 2019 20:51

	# a running list of really useful data.table tricks

	# TOP 1,2 ... last row by some group id
	## source: https://stackoverflow.com/questions/16325641/is-it-possible-to-extract-the-first-2-rows-for-each-date#comment23381259_16325932
	## comment by @eddi
	id <- c('date', 'userid')
	dt[dt[, .I[1:2], by = id]$V1] # first 2 rows by id
	dt[dt[, .I[.N], by = id]$V1] # last row by id (.N = length of group, .I = row index)
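
A small made-up table shows what these group-wise selections return. The data below are invented for illustration. One swapped-in detail: `head(.I, 2)` is used instead of `.I[1:2]`, because `.I[1:2]` pads groups that have fewer than two rows with `NA`, which then yields all-`NA` rows in the result.

```r
library(data.table)

# toy table (assumption: invented data, not from the gist)
dt <- data.table(
  date   = rep(c("2019-01-01", "2019-01-02"), each = 3),
  userid = rep(c("u1", "u2"), times = 3),
  value  = 1:6
)

id <- c("date", "userid")

# inner call returns one row index (V1) per selected row in each group;
# the outer call subsets dt by those indices
first2 <- dt[dt[, head(.I, 2), by = id]$V1]  # up to 2 rows per group, no NA padding
last1  <- dt[dt[, .I[.N], by = id]$V1]       # last row of each group

first2
last1
```

Here two of the four groups contain a single row, so `first2` has six rows in total rather than eight.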

dantonnoriega / major_holidays_2000_2025.csv

Last active January 26, 2018 01:03

list of some major holidays from 2000 - 2025. not exhaustive, but the hypothesis is that these dates correlate with high technology use.

year	date	holiday
2000	2000-01-01	New Year's Day
2000	2000-02-05	Chinese New Year
2000	2000-02-14	Valentine's Day
2000	2000-04-23	Easter Sunday
2000	2000-05-14	Mother's Day
2000	2000-06-18	Father's Day
2000	2000-07-04	Independence Day
2000	2000-10-31	Halloween
2000	2000-11-23	Thanksgiving Day

dantonnoriega / sublime-section-expands.py

Last active December 21, 2017 21:31

sublime text script to create two new expand commands: expand to end of file and expand to section. a section is defined by a header using the RStudio syntax '# SECTION TITLE ----------' or anything with 4 hashes '####'. flexible with comment sections as well.

	import sublime, sublime_plugin

	class ExpandSelectionToEofCommand(sublime_plugin.TextCommand):
	    def run(self, edit):
	        v = self.view
	        s = v.sel()
	        eof = v.size()

	        first_point = v.line(s[0]).a
	        region_to_eof = sublime.Region(first_point, eof)

Danton Noriega-Goodwin dantonnoriega