Daniel J. Hocking djhocking

Statistician with NOAA. Comments and opinions my own and do not represent those of NOAA or any other government organization

NOAA National Marine Fisheries Service
https://hockinglab.weebly.com/dr-daniel-hocking.html
in/daniel-hocking-6bb17127

djhocking / DM_Multinomial

Last active August 29, 2015 13:56

Open Population Hierarchical Abundance Model with Removal Sampling

	# Model
	cat("
	model{
	# Abundance Priors
	a.N ~ dnorm(0, 0.01)
	b.elev ~ dnorm(0, 0.01)
	#b.elev2 ~ dnorm(0, 0.01)
	#b.slope ~ dnorm(0, 0.01)
	b.drainage ~ dnorm(0, 0.01)
	b.hardwood ~ dnorm(0, 0.01)

djhocking / ggplot_error_bars

Created March 12, 2014 20:47

Error bars using ggplot

	### ggplot ###
	# Refs: http://learnr.wordpress.com/2010/02/25/ggplot2-plotting-dates-hours-and-minutes/
	# http://had.co.nz/ggplot2/
	# http://wiki.stdout.org/rcookbook/Graphs/Plotting%20means%20and%20error%20bars%20%28ggplot2%29

	library(ggplot2)

	ggplot(data = gCount, aes(x = date, y = count, group = trt)) +
	#geom_point(aes(shape = factor(trt))) +
	geom_point(aes(colour = factor(trt), shape = factor(trt)), size = 3) +

djhocking / glmmBoot

Created May 21, 2014 14:12

Bootstrap mixed effects logistic regression predictions

	# Function for getting bootstrapped glmer predictions in parallel
	glmmBoot <- function(dat, form, R, nc){
	# dat = data for glmer (lme4) logistic regression
	# form = formula of glmer equation for fitting
	# R = total number of bootstrap draws - should be multiple of nc b/c divided among cores evenly
	# nc = number of cores to use in parallel

	library(parallel)
	cl <- makeCluster(nc) # Request # cores
	clusterExport(cl, c("dat", "form", "nc", "R"), envir = environment()) # Make these available to each core
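
The preview above stops after the cluster setup. As a rough sketch of how the remaining steps could look, the block below resamples rows, refits the model, and predicts on the original data for each bootstrap draw. It is an illustration under stated assumptions, not the gist's actual code: `boot_draw` and `boot_predict` are hypothetical names, and a plain `glm()` logistic regression stands in for `glmer()` so the example needs only packages that ship with R.

```r
library(parallel)

# Hypothetical worker: one bootstrap draw (resample rows, refit, predict).
# Assumption: a plain logistic glm() is used here in place of lme4::glmer().
boot_draw <- function(i, dat, form) {
  idx <- sample(nrow(dat), replace = TRUE)
  fit <- glm(form, data = dat[idx, , drop = FALSE], family = binomial)
  predict(fit, newdata = dat, type = "response")
}

# Hypothetical wrapper mirroring the dat/form/R/nc arguments described above.
boot_predict <- function(dat, form, R, nc) {
  cl <- makeCluster(nc)               # request nc worker processes
  on.exit(stopCluster(cl), add = TRUE) # always release the workers
  clusterSetRNGStream(cl, iseed = 123) # reproducible, independent RNG streams
  draws <- parLapply(cl, seq_len(R), boot_draw, dat = dat, form = form)
  do.call(cbind, draws)               # rows = observations, columns = draws
}

# Usage with simulated data (values are illustrative only)
set.seed(1)
sim <- data.frame(x = rnorm(60))
sim$y <- rbinom(60, 1, plogis(0.5 * sim$x))
preds <- boot_predict(sim, y ~ x, R = 4, nc = 2)
dim(preds)
```

Passing `dat` and `form` as arguments to `parLapply()` avoids relying on `clusterExport()`, and `parLapply()` splits the draws across workers itself, so `R` does not strictly need to be a multiple of `nc` in this sketch.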

djhocking / dplyr-select-names.R

Last active February 28, 2022 19:08

Select columns by vector of names using dplyr

	one <- 1:10
	two <- rnorm(10)
	three <- runif(10, 1, 2)
	four <- -10:-1

	df <- data.frame(one, two, three)
	df2 <- data.frame(one, two, three, four)

	str(df)
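
The preview ends before the selection step itself. A minimal sketch of selecting columns from a character vector of names follows, assuming dplyr 1.0.0 or later, where the `all_of()` and `any_of()` helpers are available. The `keep` vector and the `sel_*` names are hypothetical and used only for illustration; the original gist may use a different helper.

```r
library(dplyr)

df2 <- data.frame(one = 1:10, two = rnorm(10),
                  three = runif(10, 1, 2), four = -10:-1)

# Column names held in a character vector (hypothetical selection)
keep <- c("one", "three")

# all_of() raises an error if any requested name is missing
sel_strict <- select(df2, all_of(keep))

# any_of() silently skips names that are not present ("five" here)
sel_loose <- select(df2, any_of(c(keep, "five")))

names(sel_strict)
names(sel_loose)
```

`all_of()` is the stricter choice when a missing column should be treated as a mistake, while `any_of()` suits cases where the same name vector is applied to data frames with differing columns, such as `df` and `df2` above.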

djhocking / purl_wrapper_demo

Last active August 29, 2015 14:10 — forked from anarosner/purl_wrapper_demo

	#purling is in the knitr package
	library(knitr)


	setwd("C:/ALR/Models/boo") #example using local windows directory, can easily switch

	#it can be really simple
	purl( "script1.Rmd", "script1.R" )
	#just specify the rmd filename, then r filename, with extensions

djhocking / temperature_equations

Created January 7, 2015 04:20

First example of using Markdown with LaTeX equations

	We assumed stream temperature measurements were normally distributed following,

	\\[ t_{s,h,d,y} \sim \mathcal{N}(\mu_{s,h,d,y}, \sigma) \\]

	where $t_{s,h,d,y}$ is the observed stream water temperature at the site ($s$) within the sub-basin identified by the 8-digit Hydrologic Unit Code (HUC8; $h$) for each day ($d$) in each year ($y$). We describe the normal distribution with the standard deviation ($\sigma$). The expected temperature follows a linear trend

	\\[ \omega_{s,h,d,y} = X^0 B^0 + X_{h}^{huc} B_{h}^{huc} + X_{s,h}^{site} B_{s,h}^{site} + X_{y}^{year} B_{y}^{year} \\]

	but the expected temperature ($\mu_{s,h,d,y}$) is adjusted based on the residual error from the previous day
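
	One common way to write such an adjustment, offered here only as an assumed first-order autoregressive sketch and not necessarily the exact form used in the original analysis, is

	\\[ \mu_{s,h,d,y} = \omega_{s,h,d,y} + \delta (t_{s,h,d-1,y} - \omega_{s,h,d-1,y}) \\]

	where $\delta$ is a hypothetical autoregressive coefficient introduced for illustration. On the first day of a series, when no previous residual exists, this sketch simply sets $\mu_{s,h,d,y} = \omega_{s,h,d,y}$.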

djhocking / gist:eff1072b54b6d8049270

Last active August 29, 2015 14:13

Big Queries and collect with dplyr

	# fetch temperature data
	tbl_values <- left_join(tbl_series,
	select(tbl_variables, variable_id, variable_name),
	by=c('variable_id'='variable_id')) %>%
	select(-file_id) %>%
	filter(location_id %in% df_locations$location_id,
	variable_name=="TEMP") %>%
	left_join(tbl_values,
	by=c('series_id'='series_id')) %>%
	left_join(select(tbl_locations, location_id, location_name, latitude, longitude, featureid=catchment_id),

djhocking / derive_metrics

Created January 15, 2015 21:29

R code to derive metrics for all catchments

	# Get list of unique catchments with daymet data in our database
	drv <- dbDriver("PostgreSQL")
	con <- dbConnect(drv, dbname=...)
	# con <- dbConnect(drv, dbname=...)
	qry <- "SELECT DISTINCT featureid FROM daymet;"
	result <- dbSendQuery(con, qry)
	catchments <- fetch(result, n=-1)
	catchments <- as.character(catchments$featureid)

	# get daymet data for a subset of catchments

djhocking / temperature_data

Created February 11, 2015 19:23

Pull temperature data from the database


	# table references
	tbl_locations <- tbl(db, 'locations') %>%
	rename(location_id=id, location_name=name, location_description=description) %>%
	select(-created_at, -updated_at)
	tbl_series <- tbl(db, 'series') %>%
	rename(series_id=id) %>%
	select(-created_at, -updated_at)
	tbl_variables <- tbl(db, 'variables') %>%
	rename(variable_id=id, variable_name=name, variable_description=description) %>%

djhocking / ggplot boxplot continuous x with groups

Last active February 12, 2016 20:58

Trying to make a boxplot with ggplot2 in R where the x-axis is continuous but there are paired boxplots for each value on the x-axis, based on another factor

	# ggplot boxplot groups of continuous x

	### Daniel J. Hocking

	I am trying to make a boxplot with ggplot2 in R where the x-axis is continuous but there are paired boxplots for each value on the x-axis, based on another factor with two possible values. I want the boxplots arranged by number of survey years (n_years) on the x-axis and paired by spatialTF (two boxplots for every value of n_years), even though the n_years values are not evenly spaced.

	This plot gets the paired boxplots correct by year but the years on the x-axis are evenly spaced and don't reflect the actual (continuous) time between years.

	```
	ggplot(df_converged, aes(factor(n_years), mean_N_est)) +
