For this investigation we are going to use the `sleepstudy`
data set from the lme4 package. Here is the head of the data frame:
# make fake dataset
df <- data.frame(x = runif(100, 0, 1), y = rnorm(100, 10, 3), z = rpois(100, 10))
# subset dataframe
df_sub <- df[which(df$x >= 0.75), ]
# subset using dplyr
library(dplyr)
df_sub2 <- df %>%
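The dplyr preview above is cut off after the pipe. As a minimal sketch of how such a subset could be written, assuming the intended condition is the same `x >= 0.75` used in the base R subset (the seed and the names `df_demo`, `df_demo_base`, and `df_demo_dplyr` are hypothetical and not from the original snippet):

```r
library(dplyr)

set.seed(1)  # hypothetical seed, only so the sketch is reproducible

# same fake dataset layout as the snippet above
df_demo <- data.frame(x = runif(100, 0, 1), y = rnorm(100, 10, 3), z = rpois(100, 10))

# base R subset, as in the snippet above
df_demo_base <- df_demo[which(df_demo$x >= 0.75), ]

# assumed dplyr equivalent: keep rows where x is at least 0.75
df_demo_dplyr <- df_demo %>%
  filter(x >= 0.75)
```

Both approaches should select the same rows in the same order; comparing the columns with `all.equal()` is one way to check that.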
# ggplot boxplot groups of continuous x
### Daniel J. Hocking
I am trying to make a boxplot with ggplot2 in R where the x-axis is continuous but there are paired boxplots for each value on the x-axis, based on another factor with two possible values. I want the boxplots arranged by number of survey years (n_years) on the x-axis and paired by spatialTF (2 boxplots for every value of n_years), but the n_years values are not evenly spaced.
This plot gets the paired boxplots correct by year, but the years on the x-axis are evenly spaced and don't reflect the actual (continuous) time between years.
```
ggplot(df_converged, aes(factor(n_years), mean_N_est)) +
```
# table references
tbl_locations <- tbl(db, 'locations') %>%
  rename(location_id=id, location_name=name, location_description=description) %>%
  select(-created_at, -updated_at)
tbl_series <- tbl(db, 'series') %>%
  rename(series_id=id) %>%
  select(-created_at, -updated_at)
tbl_variables <- tbl(db, 'variables') %>%
  rename(variable_id=id, variable_name=name, variable_description=description) %>%
# Get list of unique catchments with daymet data in our database
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname=...)
# con <- dbConnect(drv, dbname=...)
qry <- "SELECT DISTINCT featureid FROM daymet;"
result <- dbSendQuery(con, qry)
catchments <- fetch(result, n=-1)
catchments <- as.character(catchments$featureid)
# get daymet data for a subset of catchments
# fetch temperature data
tbl_values <- left_join(tbl_series,
                        select(tbl_variables, variable_id, variable_name),
                        by=c('variable_id'='variable_id')) %>%
  select(-file_id) %>%
  filter(location_id %in% df_locations$location_id,
         variable_name=="TEMP") %>%
  left_join(tbl_values,
            by=c('series_id'='series_id')) %>%
  left_join(select(tbl_locations, location_id, location_name, latitude, longitude, featureid=catchment_id),
We assumed stream temperature measurements were normally distributed following,
\\[ t_{s,h,d,y} \sim \mathcal{N}(\mu_{s,h,d,y}, \sigma) \\]
where $t_{s,h,d,y}$ is the observed stream water temperature at the site ($s$) within the sub-basin identified by the 8-digit Hydrologic Unit Code (HUC8; $h$) for each day ($d$) in each year ($y$). We describe the normal distribution with the standard deviation ($\sigma$). The expected temperature follows a linear trend
\\[ \omega_{s,h,d,y} = X^0 B^0 + X_{h}^{huc} B_{h}^{huc} + X_{s,h}^{site} B_{s,h}^{site} + X_{y}^{year} B_{y}^{year} \\]
but the expected temperature ($\mu_{s,h,d,y}$) is adjusted based on the residual error from the previous day
# purling is in the knitr package
library(knitr)
setwd("C:/ALR/Models/boo") # example using a local Windows directory; can easily switch
# it can be really simple
purl("script1.Rmd", "script1.R")
# just specify the Rmd filename, then the R filename, with extensions
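To try purling without an existing `script1.Rmd`, here is a small self-contained sketch. It assumes knitr is installed; the temporary file paths (`rmd_file`, `r_file`) and the toy chunk content are hypothetical stand-ins, not part of the original example.

```r
library(knitr)

# hypothetical temp files, so nothing in the working directory is touched
rmd_file <- tempfile(fileext = ".Rmd")
r_file <- tempfile(fileext = ".R")

# build the chunk fence programmatically to avoid literal triple backticks
fence <- strrep("`", 3)

# write a tiny R Markdown document with one code chunk
writeLines(c(
  "Some narrative text.",
  "",
  paste0(fence, "{r}"),
  "x <- 1 + 1",
  fence
), rmd_file)

# extract the R code: input file first, output file second
purl(rmd_file, r_file, quiet = TRUE)

# inspect the extracted script
purled_lines <- readLines(r_file)
print(purled_lines)
```

The narrative text is left out of the extracted script and only the chunk code (plus a chunk header comment, under the default `documentation` setting) is written to the `.R` file.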
one <- 1:10
two <- rnorm(10)
three <- runif(10, 1, 2)
four <- -10:-1
df <- data.frame(one, two, three)
df2 <- data.frame(one, two, three, four)
str(df)
# Function for getting bootstrapped glmer predictions in parallel
glmmBoot <- function(dat, form, R, nc){
  # dat = data for glmer (lme4) logistic regression
  # form = formula of glmer equation for fitting
  # R = total number of bootstrap draws - should be a multiple of nc because draws are divided evenly among cores
  # nc = number of cores to use in parallel
  library(parallel)
  cl <- makeCluster(nc) # Request nc cores
  clusterExport(cl, c("dat", "form", "nc", "R"), envir = environment()) # Make these available to each core