Christopher Gandrud christophergandrud

christophergandrud / quarter_sum.R
Last active December 17, 2015 19:28
A function for summing a variable across quarters
#' A function for summing a variable across quarters.
#'
#' @param data a data frame where the variables can be found.
#' @param Var a character string naming the variable to sum.
#' @param TimeVar a character string naming the time variable. Note: it must contain the month and the year.
#' @param NewVar a character string with the summed variable's name
#'
#' @return a data frame with Quarter and summed variable columns.
#' @importFrom plyr ddply
#' @export
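
The preview above documents the arguments but does not show the function body. As a minimal sketch of the idea only (not the gist's actual implementation, which imports `plyr::ddply`), summing within quarters can be done in base R. The name `quarter_sum_sketch`, the default for `NewVar`, and the assumption that `TimeVar` can be coerced with `as.Date` are all hypothetical.

```r
# Hypothetical base R sketch: sum a variable within each calendar quarter.
# Assumes the time variable can be coerced to Date (an assumption, not
# something the original documentation guarantees).
quarter_sum_sketch <- function(data, Var, TimeVar, NewVar = "QuarterSum") {
  dates <- as.Date(data[[TimeVar]])
  # Label each observation with its year and quarter, e.g. "2013 Q1"
  quarter <- paste(format(dates, "%Y"), quarters(dates))
  # Sum the variable within each year-quarter label
  out <- aggregate(data[[Var]], by = list(Quarter = quarter), FUN = sum)
  names(out)[2] <- NewVar
  out
}

example <- data.frame(
  Date = as.Date(c("2013-01-15", "2013-02-15", "2013-04-15", "2013-05-15")),
  Value = c(1, 2, 3, 4)
)
result <- quarter_sum_sketch(example, Var = "Value", TimeVar = "Date",
                             NewVar = "ValueSum")
```

Here `result` has one row per quarter, with the quarter label in `Quarter` and the total in `ValueSum`.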
christophergandrud / e.divGG.R
Last active March 31, 2020 04:58
Function for creating nonparametric multiple change point plots with estimates from the ecp package.
#' Function for creating nonparametric multiple change point plots with
#' estimates from the ecp package.
#'
#' @param data A data frame with the covariates and time variable.
#' @param Vars A character vector listing the names of the variates from
#' \code{data} to include in the nonparametric multiple change point analysis.
#' @param TimeVar A character string naming the time variable in \code{data}.
#' Must be in a format handled by POSIX, e.g. YYYY-MM-DD.
#' @param sig.lvl The level at which to sequentially test if a proposed change
#' point is statistically significant.
christophergandrud / d3SimpleNetwork.R
Last active March 7, 2017 07:28
An R function for creating simple D3 javascript directed network graphs.
#' An R function for creating simple D3 javascript directed network graphs.
#'
#' d3SimpleNetwork creates simple D3 javascript network graphs.
#'
#' @param data a data frame object with three columns. The first two are the names of the linked units. The third records an edge value. (Currently the third column doesn't affect the graph.)
#' @param Source character string naming the network source variable in the data frame. If \code{Source = NULL} then the first column of the data frame is treated as the source.
#' @param Target character string naming the network target variable in the data frame. If \code{Target = NULL} then the second column of the data frame is treated as the target.
#' @param height numeric height for the network graph's frame area.
#' @param width numeric width for the network graph's frame area.
#' @param file a character string of the file name to save the resulting graph. If a file name is given a standalone webpage is created, i.e. with a header and footer. If \code{file = NULL} then
christophergandrud / MarginalGames.R
Last active December 21, 2015 18:39
Creates a marginal effect ribbon plot from games (http://cran.r-project.org/web/packages/games/games.pdf) model objects
#' Creates a marginal effect or predicted probability ribbon plot for interaction effects estimated from games model objects
#'
#' \code{MarginalGames} creates a marginal effect or predicted probability ribbon plot for interaction effects estimated from \code{\link{games}} model objects
#'
#' @param obj fitted model object created by \code{\link{egame12}}.
#' @param b1 character string of the first constitutive variable's name.
#' @param b2 character string of the second constitutive variable's name.
#' @param b12 character string of the interaction term's name
#' @param X1 numeric fitted value for \code{b1} to simulate for. Note: can only be set if \code{type = "prob"} and can only have one value.
#' @param X2 numeric vector of fitted values of \code{b2} to simulate for.
christophergandrud / grepl.sub.R
Last active December 22, 2015 05:08
Subsets a data frame if a specified pattern is found in a character string.
#' Subset a data frame if a specified pattern is found in a character string
#'
#' @param data data frame.
#' @param patterns character vector containing regular expressions to be matched in the given character vector.
#' @param var character vector of the variables that the pattern should be found in.
#' @param keep.found logical. whether or not to keep observations where the pattern is found (\code{TRUE}) or not found (\code{FALSE}).
#' @param useBytes logical. If TRUE the matching is done byte-by-byte rather than character-by-character. See \code{\link{grep}}.
grepl.sub <- function(data, patterns, var, keep.found = TRUE, useBytes = TRUE){
data$y <- grepl(pattern = paste0(patterns, collapse="|"), x = data[, var],
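
The preview cuts off partway through the function body. A minimal sketch of the approach the visible lines suggest is to collapse the patterns into one alternation and keep or drop the matching rows. Everything past the visible `grepl` call is an assumption: the name `grepl_sub_sketch` is hypothetical, and this version handles a single column named by `var`.

```r
# Hypothetical sketch of pattern-based subsetting; not the gist's full body.
# Assumes `var` names exactly one column of `data`.
grepl_sub_sketch <- function(data, patterns, var, keep.found = TRUE,
                             useBytes = TRUE) {
  # Combine the patterns into one alternation, as the visible excerpt does
  combined <- paste0(patterns, collapse = "|")
  found <- grepl(pattern = combined, x = data[[var]], useBytes = useBytes)
  # Keep either the matching or the non-matching rows
  if (keep.found) data[found, , drop = FALSE] else data[!found, , drop = FALSE]
}

animals <- data.frame(name = c("cat", "dog", "catfish", "bird"),
                      legs = c(4, 4, 0, 2), stringsAsFactors = FALSE)
kept <- grepl_sub_sketch(animals, patterns = c("cat", "bird"), var = "name")
dropped <- grepl_sub_sketch(animals, patterns = c("cat", "bird"), var = "name",
                            keep.found = FALSE)
```

Because the patterns are joined with `|`, a row is kept when any one of them matches, so `"catfish"` is retained along with `"cat"` and `"bird"`.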
christophergandrud / coxLIML.R
Created January 9, 2014 17:26
Predict in-sample hazard rates that could be used for limited information maximum likelihood two-stage selection in Cox proportional hazard models.
#' Predict in-sample hazard rates that could be used for limited information maximum likelihood two-stage selection in Cox proportional hazard models
#'
#' @param obj a fitted model object created with \code{coxph}.
#' @param data the name of the data frame used to create \code{obj}.
#' @param idvar character string of the variable in \code{data} that identifies individuals.
#' @param timeVar character string of the variable in \code{data} that identifies the \code{time} variable used in \code{obj}.
#'
#' @return Returns the original \code{data} data frame with an additional variable added at the end called \code{PredictedHRate}. This contains the hazard rate predicted from the model \code{obj} for each observation.
#'
#' @importFrom survival basehaz
christophergandrud / WinsetCreator.R
Last active January 4, 2016 20:00
Creates the winset (W) and a modified version of the selectorate (S) variable from Bueno de Mesquita et al. (2003) using the most recent data available from Polity IV and the Database of Political Institutions.
###############
# Create winset from Bueno de Mesquita et al. (2003)
# Christopher Gandrud
# 29 January 2014
###############
#' Creates the winset (W) and a modified version of the selectorate (S) variable from Bueno de Mesquita et al. (2003) using the most recent data available from Polity IV and the Database of Political Institutions.
#'
#' @param PolityUrl character string. The URL for the Polity IV data set you would like to download. Note: it must be for the SPSS version of the file.
#' @param DpiUrl character string. The URL for the Database of Political Institutions data set you would like to download.
christophergandrud / ggs_caterpillar_label.R
Last active August 29, 2015 13:57
Modified version of ggs_caterpillar function from Xavier Fernández i Marín's ggmcmc R package
#' Caterpillar plot with thick and thin CI
#'
#' Caterpillar plots are plotted combining all chains for each parameter.
#'
#' @param D Data frame with the simulations or list of data frames with simulations. If a list of data frames with simulations is passed, the names of the models are the names of the objects in the list.
#' @param X data frame with two columns, Parameter and the value for the x location. Parameter must be a character vector with the same names as the parameters in the D object.
#' @param family Name of the family of parameters to plot, as given by a character vector or a regular expression. A family of parameters is considered to be any group of parameters with the same name but different numerical value between square brackets (as beta[1], beta[2], etc).
#' @param thick_ci Vector of length 2 with the quantiles of the thick band for the credible interval
#' @param thin_ci Vector of length 2 with the quantiles of the thin band for the credible interval
#' @param line Numerical value in
christophergandrud / ggs_summary.R
Created March 24, 2014 10:40
Extract parameter estimates from ggs data frames.
#' Extract parameter point estimates
#'
#' @param D Data frame with the simulations or list of data frames with simulations. If a list of data frames with simulations is passed, the names of the models are the names of the objects in the list.
#' @param family Name of the family of parameters to plot, as given by a character vector or a regular expression. A family of parameters is considered to be any group of parameters with the same name but different numerical value between square brackets (as beta[1], beta[2], etc).
#' @param thick_ci Vector of length 2 with the quantiles of the thick band for the credible interval.
#' @param thin_ci Vector of length 2 with the quantiles of the thin band for the credible interval.
#' @param param_label_from Vector of strings (can be regular expressions) for the original parameter labels that you would like to replace in the plot labels using \code{param_label_to}.
#' @param param_label_to Vector of strings for parameter labels you would like to use. Must be both the sam
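
The documentation above describes thick and thin credible-interval bands without showing how they are computed. As a rough sketch only, per-parameter quantiles can be taken from a long-format simulation data frame. The function name `ggs_summary_sketch`, the `Parameter` and `value` column names, the default quantiles, and the use of the median as the point estimate are all assumptions for illustration, not the gist's confirmed behavior. The `family` and label arguments are left out.

```r
# Hypothetical sketch: median and thick/thin interval bounds per parameter.
# Assumes D has a `Parameter` column and a `value` column of simulated draws.
ggs_summary_sketch <- function(D, thick_ci = c(0.05, 0.95),
                               thin_ci = c(0.025, 0.975)) {
  # Split the simulated draws by parameter name
  draws <- split(D$value, D$Parameter)
  # One row per parameter: median plus the four interval bounds
  rows <- lapply(names(draws), function(p) {
    q <- quantile(draws[[p]],
                  probs = c(thin_ci[1], thick_ci[1], 0.5,
                            thick_ci[2], thin_ci[2]),
                  names = FALSE)
    data.frame(Parameter = p, thin_low = q[1], thick_low = q[2],
               median = q[3], thick_high = q[4], thin_high = q[5],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Simulated draws for two parameters, purely for illustration
set.seed(1)
D <- data.frame(
  Iteration = rep(1:500, times = 2), Chain = 1,
  Parameter = rep(c("beta[1]", "beta[2]"), each = 500),
  value = c(rnorm(500, mean = 0), rnorm(500, mean = 2))
)
est <- ggs_summary_sketch(D)
```

The result has one row per parameter, which is the shape a caterpillar-style plot would need: a point estimate with a narrower and a wider band around it.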
christophergandrud / R_Jags_AmazonEc2_Setup.sh
Last active October 28, 2017 20:26
Demonstration script for setting up R, Git, and JAGS on an Ubuntu instance of an Amazon EC2 server
# Set up RStudio and JAGS on an Amazon EC2 instance
# Using Ubuntu 64-bit
# Partially from http://blog.yhathq.com/posts/r-in-the-cloud-part-1.html
# See yhat for EC2 instance set up
# Navigate to key pair
# ssh -i YOUR_KEYPAIR.pem ubuntu@PUBLIC_DNS
# Add a user/password.
# This will become your RStudio username and password.