import pandas as pd
df = pd.DataFrame([['01-02-2015', 'a', 17],
                   ['01-09-2015', 'a', 42],
                   ['01-30-2015', 'a', 19],
                   ['01-02-2015', 'b', 23],
                   ['01-23-2015', 'b', 1],
                   ['01-30-2015', 'b', 13]])
df.columns = ['date', 'group', 'response']
df.set_index(['date', 'group'], inplace=True)
import numpy as np
from statsmodels.robust.scale import huber
# Mean and standard deviation to generate normal random variates
mean, std_dev = 0, 2
sample_size = 25
np.random.seed(42)
x = np.random.normal(mean, std_dev, sample_size)
# Appends a couple of outliers
#' Cuts a vector into factors with pretty levels
#'
#' @param x numeric vector
#' @param breaks numeric vector of two or more unique cut points
#' @param collapse character string to collapse factor labels
#' @param ... arguments passed to \code{\link[base]{cut}}
#' @return A \code{\link{factor}} is returned
#'
#' @examples
#' set.seed(42)
library(randomForest)
library(dplyr)
library(ggplot2)
set.seed(42)
rf_out <- randomForest(Species ~ ., data=iris)
# Extracts variable importance (Mean Decrease in Gini Index)
# Sorts by variable importance and relevels factors to match ordering
library(dplyr)
objects <- ls()
object_sizes <- sapply(objects, function(x) object.size(get(x)))
object_sizes <- data.frame(objects, object_sizes, row.names=NULL)
object_sizes$units_MB <- utils:::format.object_size(object_sizes$object_sizes, units="Mb")
dplyr::arrange(object_sizes, object_sizes)
# For the standard conjugate beta prior for a binomial likelihood, a typical
# approach is to weight each prior observation equally. There are times,
# however, when the prior Bernoulli trials should be weighted over time, so
# that the most recent trials are weighted near 1 and the oldest trials are
# weighted near 0.
# Gompertz Function
# http://en.wikipedia.org/wiki/Gompertz_function
gompertz <- function(x, a=1, b=1, c=1) {
  a * exp(-b * exp(-c * x))
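
The time-weighting idea described in the comments can be sketched in Python. The remainder of the R snippet is not shown in this listing, so the following is only an assumption about how such weights might be applied, not the author's implementation. The helper `weighted_beta_prior`, its defaults (`b=5.0`, `c=1.0`, a Beta(1, 1) starting point), and the use of the trial index as the time variable are all hypothetical choices made for illustration.

```python
import math


def gompertz(x, a=1.0, b=1.0, c=1.0):
    """Gompertz curve a * exp(-b * exp(-c * x)), mirroring the R function."""
    return a * math.exp(-b * math.exp(-c * x))


def weighted_beta_prior(trials, b=5.0, c=1.0, alpha0=1.0, beta0=1.0):
    """Hypothetical helper: accumulate time-weighted Beta pseudo-counts.

    `trials` is ordered from oldest to newest, each entry 0 (failure) or
    1 (success). Trial i receives weight gompertz(i, b=b, c=c), so older
    trials contribute less than recent ones (assumed weighting scheme).
    """
    alpha, beta = alpha0, beta0
    for i, outcome in enumerate(trials):
        weight = gompertz(i, b=b, c=c)
        alpha += weight * outcome        # weighted success count
        beta += weight * (1 - outcome)   # weighted failure count
    return alpha, beta


# Example: four trials, oldest first; the two oldest barely move the prior.
alpha, beta = weighted_beta_prior([1, 0, 1, 1])
```

With `b=5.0` the weight at index 0 is `exp(-5)`, which is close to 0, and the weights approach 1 as the index grows, matching the behaviour the comments describe.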
import itertools
def jaccard(labels1, labels2):
    """
    Computes the Jaccard similarity between two sets of clustering labels.
    The value returned is between 0 and 1, inclusive. A value of 1 indicates
    perfect agreement between two clustering algorithms, whereas a value of 0
    indicates no agreement. For details on the Jaccard index, see:
    http://en.wikipedia.org/wiki/Jaccard_index
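
The body of `jaccard` is cut off in this listing. As a rough illustration of what the docstring describes, here is one common way to compute a Jaccard index over clustering labels: count the pairs of observations that are placed together by both labelings, and divide by the pairs placed together by at least one. This is a sketch under that assumption, not necessarily the original code; the name `jaccard_comembership` and the convention of returning 1.0 when neither labeling groups any pair are hypothetical.

```python
import itertools


def jaccard_comembership(labels1, labels2):
    """Pair-counting Jaccard similarity between two label sequences."""
    if len(labels1) != len(labels2):
        raise ValueError("label sequences must have the same length")

    n11 = n10 = n01 = 0
    for i, j in itertools.combinations(range(len(labels1)), 2):
        same1 = labels1[i] == labels1[j]  # pair co-clustered in labeling 1
        same2 = labels2[i] == labels2[j]  # pair co-clustered in labeling 2
        if same1 and same2:
            n11 += 1
        elif same1:
            n10 += 1
        elif same2:
            n01 += 1

    denominator = n11 + n10 + n01
    if denominator == 0:
        # Assumed convention: no pair is grouped by either labeling,
        # so the two labelings are treated as agreeing.
        return 1.0
    return n11 / denominator


# Label names do not matter, only which observations are grouped together.
score = jaccard_comembership([0, 0, 1, 1], [1, 1, 0, 0])
```

Because only co-membership is compared, relabeling the clusters (as in the example) does not change the score.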
library(clusteval)
adjacency_matrix <- function(cluster_labels, names=NULL, force_symmetric=FALSE) {
  adj_matrix <- diag(length(cluster_labels))
  adj_matrix[lower.tri(adj_matrix)] <- clusteval::comembership(cluster_labels)
  if (force_symmetric) {
    adj_matrix <- adj_matrix + t(adj_matrix)
  }
  diag(adj_matrix) <- 0
  if (!is.null(names))
    rownames(adj_matrix) <- colnames(adj_matrix) <- names
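
For readers without `clusteval`, the same construction can be sketched in plain Python. This version assumes that `comembership` yields 1 for a pair of observations sharing a cluster label and 0 otherwise; it fills the lower triangle, optionally mirrors it, and leaves the diagonal at 0. It omits the `names` argument and returns nested lists rather than a matrix, so treat it as an illustration only.

```python
def adjacency_matrix(cluster_labels, force_symmetric=False):
    """Build a 0/1 adjacency matrix from cluster labels (nested lists)."""
    n = len(cluster_labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):  # j < i, i.e. the strict lower triangle
            if cluster_labels[i] == cluster_labels[j]:
                adj[i][j] = 1
                if force_symmetric:
                    adj[j][i] = 1  # mirror into the upper triangle
    return adj


# Observations 0 and 1 share cluster 'a'; observation 2 is alone in 'b'.
lower_only = adjacency_matrix(['a', 'a', 'b'])
symmetric = adjacency_matrix(['a', 'a', 'b'], force_symmetric=True)
```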
# Comment thread beginning here:
# http://ramhiser.com/blog/2013/07/02/a-brief-look-at-mixture-discriminant-analysis/#comment-1374749931
#
# I'm using version 0.4-4 of the `mda` package
library(mda)
test_data <- read.csv("ts2.csv")
colnames(test_data)[16] <- "filter"
mda_out <- mda(formula=fol_up_u ~ . - filter,
               data=test_data,
               CV=TRUE,
#' Try/catch with exponential backoff
#'
#' Attempts the expression in \code{expr} up to the number of tries specified in
#' \code{max_attempts}. Each time a failure results, the function sleeps for a
#' random amount of time before re-attempting the expression. The upper bound of
#' the backoff increases exponentially after each failure.
#'
#' For details on exponential backoff, see:
#' \url{http://en.wikipedia.org/wiki/Exponential_backoff}
#'
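
The R function body is not included in this listing. The behaviour documented above (retry up to `max_attempts`, sleep for a random time whose upper bound grows exponentially) can be sketched in Python as follows. The name `try_backoff`, the `base` and `sleep` parameters, and the specific bound `base * (2**attempt - 1)` are assumptions for illustration, not the original implementation.

```python
import random
import time


def try_backoff(func, max_attempts=10, base=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts calls have failed.

    After the k-th failure, sleeps for a random duration drawn uniformly
    from [0, base * (2**k - 1)], so the upper bound grows exponentially.
    `sleep` is injectable so the waiting can be stubbed out in tests.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: propagate the last error
            upper_bound = base * (2 ** attempt - 1)
            sleep(random.uniform(0, upper_bound))
```

A caller might wrap a flaky operation as `try_backoff(lambda: fetch(url), max_attempts=5)`, where `fetch` stands for any function that may raise. Passing a custom `sleep` makes it possible to test the retry logic without actually waiting.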