# get linear model residuals
#' @import dplyr
#' @export
dave_lm_adjust <- function(data, formula, test_vars, adjust = TRUE, progress = TRUE) {
  # the loop below runs over test_vars, so the progress bar maximum is length(test_vars)
  if (isTRUE(progress)) { pb <- txtProgressBar(min = 0, max = length(test_vars), style = 3) } else { pb <- NULL }
  out <- lapply(seq_along(test_vars), function(i) {
    if (isTRUE(progress)) {
      setTxtProgressBar(pb, i)
DATA SCIENCE EXERCISE
The following challenge requires the beer reviews data set, beer_reviews.csv, which can be downloaded from https://data.world/socialmediadata/beeradvocate . Note that you can create a free temporary account to download this .csv.
Questions to answer using this data:
Which brewery produces the strongest beers by ABV%?
If you had to pick 3 beers to recommend using only this data, which would you pick?
Which of the factors (aroma, taste, appearance, palate) are most important in determining the overall quality of a beer?
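As a starting point for the first question, here is a minimal sketch in R using dplyr. It is one possible approach, not a reference answer. The column names `brewery_name`, `beer_name` and `beer_abv` are assumptions about beer_reviews.csv and should be checked against `names()` of the loaded file; the helper `strongest_breweries` and its parameters `min_beers` and `n_top` are hypothetical names introduced for illustration.

```r
# Sketch only: `brewery_name`, `beer_name` and `beer_abv` are ASSUMED column
# names; verify them with names(reviews) after reading beer_reviews.csv.
library(dplyr)

# Hypothetical helper: rank breweries by the mean ABV of their distinct beers.
strongest_breweries <- function(reviews, min_beers = 1, n_top = 5) {
  reviews %>%
    filter(!is.na(beer_abv)) %>%
    # keep one row per beer so heavily reviewed beers are not over-weighted
    distinct(brewery_name, beer_name, beer_abv) %>%
    group_by(brewery_name) %>%
    summarise(n_beers = n(), mean_abv = mean(beer_abv), .groups = "drop") %>%
    # optionally ignore breweries with very few beers
    filter(n_beers >= min_beers) %>%
    arrange(desc(mean_abv)) %>%
    head(n_top)
}

# Toy rows invented purely to show the call; they are not from the real data.
toy <- data.frame(
  brewery_name = c("A", "A", "A", "B", "B"),
  beer_name    = c("a1", "a1", "a2", "b1", "b2"),
  beer_abv     = c(5, 5, 7, 9, NA)
)
strongest_breweries(toy)
```

On the real data the call would look like `strongest_breweries(read.csv("beer_reviews.csv"), min_beers = 5)`. Raising `min_beers` is a design choice: it keeps a brewery with a single very strong beer from topping the ranking.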
Additional math/coding question unrelated to the data:
> in
Error: unexpected 'in' in "in"
# Basic PCA example
# use www.createdatasol.com for
# an advanced user interface
# required packages for plotting
library(ggplot2)
library(ggrepel)
# load data
data <- read.csv('~/Sampledata.csv',
# initialize
library(shiny)
library(ggplot2)
library(purrr)
library(dplyr)
# example data
data(iris)
#' @title fast_tanimoto
#' @param mat matrix or data frame of numeric values
#' @param output 'matrix' (default) or 'edge list' (non-redundant and undirected)
#' @param progress TRUE, show progress
#' @import reshape2
fast_tanimoto <- function(mat, output = 'matrix', progress = TRUE) {
  # treat missing values as absent
  mat[is.na(mat)] <- 0
  # scoring function
  score <- function(x) { sum(x == 2) / sum(x > 0) }
# plotly box or lasso select linked to
# DT data table
# using Wage data
# the out group: is sex:Male, region:Middle Atlantic +
library(ggplot2)
library(plotly)
library(dplyr)
library(ISLR)
# SOM example using wines data set
library(kohonen)
data(wines)
set.seed(7)
# create SOM grid
sommap <- som(scale(wines), grid = somgrid(2, 2, "hexagonal"))
## use hierarchical clustering to cluster the codebook vectors
groups <- 3
library(reshape2)
gen.mat.to.edge.list <- function(mat, symmetric = TRUE, diagonal = FALSE, text = FALSE) {
  # create edge list from matrix
  # if symmetric, duplicates are removed
  mat <- as.matrix(mat)
  id <- is.na(mat) # used to allow missing
  mat[id] <- "nna"
  if (symmetric) { mat[lower.tri(mat)] <- "na" } # use to allow missing values
  if (!diagonal) { diag(mat) <- "na" }
# R code, testing RECA with the iris data
library(RECA)
# test data
data(iris)
x <- iris[, -5]
y <- iris$Species
# similar groups (species) in each chunk (n=3)
chunksvec <- as.numeric(y)