Arthur Wu arthurwuhoo

  • Dataland
# =====================================================================================================================
# OUTLIERS
# =====================================================================================================================
library(dplyr)
library(corrgram)
# Focus our attention on a subset of the baseball data.
#
baseball = select(baseball, Name, Atbatc:Walksc)
# =====================================================================================================================
# TRANSFORMATIONS
# =====================================================================================================================
# Focus our attention on a subset of the baseball data.
#
baseball = select(baseball, Name, Atbatc:Walksc)
# Box plots.
#
##############################################################
# DAY 11: LINEAR REGRESSION EXERCISES
##############################################################
# 1) Height and Mass. Scrape the height and mass data from here.
# ----------------------------------------------------------------------------
library(rvest)
# ------------------------------------------------------------------
# EXERCISE 3
# Use the birthwt data in the MASS package to construct a model for low birth
# weight. Are there any features which should be excluded from the model?
# ------------------------------------------------------------------
library(MASS)
library(caret)
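# A minimal sketch of one way to approach this exercise, not necessarily the
# author's solution. Assumption: `low` is modelled with a logistic regression
# (glm with family = binomial). In MASS::birthwt, `low` is the indicator for
# a birth weight below 2.5 kg, so `bwt` would leak the response and is
# excluded. The names `bw` and `fit_low` are illustrative.
data(birthwt, package = "MASS")
bw <- birthwt
# race is coded 1 = white, 2 = black, 3 = other; treat it as categorical.
bw$race <- factor(bw$race, labels = c("white", "black", "other"))
fit_low <- glm(low ~ . - bwt, data = bw, family = binomial)
summary(fit_low)  # inspect which of the remaining predictors carry signal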
# ------------------------------------------------------------------
# DAY 12 EXERCISES - LOGISTIC REGRESSION
# ------------------------------------------------------------------
# ------------------------------------------------------------------
# EXERCISE 1
# Create a parsimonious model for the myopia data. Does its performance differ
# substantially from the full model?
# ------------------------------------------------------------------
# ------------------------------------------------------------------
# DAY 12 EXERCISES - DECISION TREES
# ------------------------------------------------------------------
# ------------------------------------------------------------------
# EXERCISE 1
# Complete the iris modelling exercise. This is a multiclass problem. Some models
# support multiclass problems, others don’t. Decision trees do. Divide the data
# into a 60% training and 40% testing split. Create a model based on the
# training data.
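# A minimal sketch of the split-and-fit step described above, not necessarily
# the author's solution. Assumptions: the built-in `iris` data frame is used,
# the tree is fitted with rpart, and the seed value is arbitrary. The names
# `iris_train`, `iris_test`, `iris_tree` and `iris_pred` are illustrative.
library(caret)
library(rpart)
set.seed(42)
# Stratified 60/40 split on Species; list = FALSE returns a matrix of row indices.
iris_idx <- createDataPartition(iris$Species, p = 0.6, list = FALSE)
iris_train <- iris[iris_idx, ]
iris_test <- iris[-iris_idx, ]
# Classification tree on all four measurements.
iris_tree <- rpart(Species ~ ., data = iris_train, method = "class")
# Predict classes on the held-out 40% and compare them with the true labels.
iris_pred <- predict(iris_tree, newdata = iris_test, type = "class")
confusionMatrix(iris_pred, iris_test$Species)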
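sample.data <- read.csv("svm_sample.csv")
sample.data <- sample.data[, -1]  # drop the first column (the id variable)
library(caret)
# Stratified 80/20 split on color. `p` must be passed by name: the second
# positional argument of createDataPartition() is `times`, not `p`.
train_index <- createDataPartition(sample.data$color, p = 0.8)[[1]]
sample.data.train <- sample.data[train_index, ]
sample.data.test <- sample.data[-train_index, ]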