Below are links to manuscripts or packages related to my talk at JSM on Supervised Dimensionality Reduction. Session info here.
My slides will be here.
Contact:
roll_die <- function(status, strategy) {
  roll = sample(c(names(status), "basket"), 1)
  if (roll == "basket") {
    trees = status[names(status) != "raven" & status > 0]
    if (strategy == "optimal") {
      biggest_trees = trees[trees == max(trees)]
      roll = sample(names(biggest_trees), 1)
    } else if (strategy == "random") {
      roll = sample(names(trees), 1)
    } else if (strategy == "worst") {
I will not share anyone's information. I will only use aggregated data with no personal information.
Title | URL
---|---
Data Clustering C++ | http://www.amazon.com/Data-Clustering-Object-Oriented-Knowledge-Discovery/dp/1439862230
Transportation Statistics and Microsimulation | http://www.amazon.com/Transportation-Statistics-Microsimulation-Clifford-Spiegelman/dp/1439800235
Fundamentals of Transportation and Traffic Operations | http://www.amazon.com/Fundamentals-Transportation-Traffic-Operations-Daganzo/dp/0080427855
A First Course in Stochastic Processes | http://www.amazon.com/First-Course-Stochastic-Processes-Second/dp/0123985528
A Probability Path | http://www.amazon.com/A-Probability-Path-Sidney-Resnick/dp/081764055X
A Primer on Linear Models | http://www.amazon.com/Primer-Linear-Chapman-Statistical-Science/dp/1420062018
Statistical Approach to Genetic Epidemiology | http://www.amazon.com/Statistical-Approach-Genetic-Epidemiology-Applications/dp/3527323899
Intro Trans Engineering | http://www.amazon.com/Introduction-Transportation-Engineering-Banks-James/dp/0072431881
See this link for an introduction to time stacking and time slicing.
time_slice.R requires the width or height of the image in pixels to be a multiple of the number of images in your timelapse.
time_slice_v2.R attempts to get around this. Some images will contribute more pixels per slice than others. This is done by making the first x% of the images cover the first x% of the pixels (with appropriate rounding). It does not handle the case where the number of images is greater than the height or width of the images in pixels. Version 2 will probably work better for you.
For example, if the images are 150 pixels wide and your timelapse has 100 images, time_slice.R will give the first image a slice that is 51 pixels wide. The remaining 99 images will get slices that are 1 pixel wide. time_slice_v2.R
will alternate between 1 pixel per i
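The proportional allocation described above can be sketched in a few lines of R. This is only an illustration, not the code in time_slice_v2.R: the helper name `slice_widths` is hypothetical, and rounding down is an assumption, since the description only says "appropriate rounding".

```r
# Hypothetical helper, not taken from time_slice_v2.R.
# Returns how many pixels wide (or tall) each image's slice is when the
# first i images must cover the first i / n_images of the pixels.
slice_widths <- function(n_images, n_pixels) {
  # The case of more images than pixels is not handled, as noted above.
  stopifnot(n_images >= 1, n_images <= n_pixels)
  # Cumulative right edge of each slice, rounded down (assumed rounding rule).
  edges <- (0:n_images * n_pixels) %/% n_images
  # The width of each slice is the gap between consecutive edges.
  diff(edges)
}

widths <- slice_widths(100, 150)
sum(widths)   # 150: every pixel column is assigned exactly once
widths[1:4]   # 1 2 1 2
```

With 100 images and 150 pixels, the slices alternate between 1 and 2 pixels, so no single image dominates the result the way the 51-pixel first slice does in time_slice.R.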
library(XML)
library(lubridate)
library(sqldf)
library(reshape2)
library(ggplot2)
library(mgcv)
cat("loading old data...\n")
playlist=read.csv("CD101Playlist.csv",stringsAsFactors=FALSE)
colnames(playlist)[3]="Last Played"
library(shiny)
library(ggplot2)
exp.age.df=read.csv("https://dl.dropboxusercontent.com/u/17648661/ExpAgeByNameYear.csv")
age.range=range(exp.age.df$Age)
unique.names=sort(unique(exp.age.df$Name))
unique.names=c("<NONE>",as.character(unique.names))
start.names=c("Andrew","Dylan","Fred","Grace","Lillian","John")
# rm(list=ls())
setwd("Kaggle/Amazon Employee")
train = read.csv("train.csv")
test = read.csv("test.csv")
train$ROLE_TITLE <- NULL # Because the same as ROLE_CODE
test$ROLE_TITLE <- NULL # Because the same as ROLE_CODE
jaccard <- function(vec, matrix) {
  rowSums(as.matrix(sweep(matrix, 2, as.numeric(vec), "==")))
# inspired by http://schamberlain.github.io/2012/01/logistic-regression-barplot-fig/
logithistplot <- function(data,breaks="Sturges",se=TRUE) {
  require(ggplot2);
  col_names=names(data)
  # get min and max axis values
  min_x <- min(data[,1])
  max_x <- max(data[,1])
  # get bin numbers