################################################################
# Reading in 2015-2016 Weather Data for Cape Town
################################################################
library(rvest)
tables <- read_html("https://www.wunderground.com/history/airport/FACT/2015/6/21/CustomHistory.html?dayend=21&monthend=6&yearend=2016&req_city=&req_state=&req_statename=&reqdb.zip=&reqdb.magic=&reqdb.wmo=")
raw_weather <- tables %>% html_nodes(css="#obsTable") %>% .[[1]] %>% html_table(header = TRUE, fill = TRUE)
# ------------------------------------------------------------------
# DAY 14: CLUSTERING EXERCISES
# ------------------------------------------------------------------
# ------------------------------------------------------------------
# EXERCISE 1
# Use k-means clustering to group the observations in the mtcars data. Is it
# important to standardise these data first? Vary the number of clusters and choose
# an appropriate value for k. Interpret the clusters.
# ------------------------------------------------------------------
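# A minimal sketch of one way to approach Exercise 1, not a model answer.
# Assumptions: all 11 mtcars columns are used, the elbow of the within-cluster
# sum of squares guides the choice of k, and k = 3 below is only a placeholder.
# The names mtcars_scaled, ks, wss and fit are illustrative.

```r
# Standardise first: k-means uses Euclidean distance, so unscaled variables
# with large ranges (e.g. disp, hp) would otherwise dominate the clustering.
mtcars_scaled <- scale(mtcars)

set.seed(42)  # k-means starts from random centres, so fix the seed
ks <- 2:8

# Total within-cluster sum of squares for each candidate k.
wss <- sapply(ks, function(k) {
  kmeans(mtcars_scaled, centers = k, nstart = 25)$tot.withinss
})
plot(ks, wss, type = "b", xlab = "k", ylab = "Total within-cluster SS")

# Fit with a candidate k (placeholder value; read it off the elbow plot).
fit <- kmeans(mtcars_scaled, centers = 3, nstart = 25)

# Cluster means on the original scale make the groups easier to interpret.
aggregate(mtcars, by = list(cluster = fit$cluster), FUN = mean)
```

# Comparing the result with kmeans(mtcars, ...) on the raw data is a quick way
# to see how much standardisation changes the cluster assignments.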
# ------------------------------------------------------------------
# DAY 12 EXERCISES - CROSS VALIDATION
# ------------------------------------------------------------------
# ------------------------------------------------------------------
# EXERCISE 1
# Build a Logistic Regression model classifying the credit status of
# customers (good or bad) in this data. Without using any packages, apply
# 5-fold cross-validation on the model. Once you have five models (and five
# sets of predicted values), average them in order to create a new
sample.data <- read.csv("svm_sample.csv")
sample.data <- sample.data[,-1] # getting rid of id variables
library(caret)
# Stratified 80/20 split on the outcome; `p` must be named, because the second
# positional argument of createDataPartition() is `times`, not the proportion.
train_index <- createDataPartition(sample.data$color, p = 0.8)[[1]]
sample.data.train <- sample.data[train_index,]
sample.data.test <- sample.data[-train_index,]
# ------------------------------------------------------------------
# DAY 12 EXERCISES - DECISION TREES
# ------------------------------------------------------------------
# ------------------------------------------------------------------
# EXERCISE 1
# Complete the iris modelling exercise. This is a multiclass problem. Some models
# support multiclass problems, others don’t. Decision trees do. Divide the data
# in a 60% training and 40% testing split. Create a model based on the training
# data.
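# A hedged sketch of the split-and-fit step described above. Assumptions: the
# rpart package supplies the decision tree (the exercise does not name one),
# and the split is a simple random sample rather than a stratified one. The
# names train_idx, iris_train, iris_test, tree_fit and pred are illustrative.

```r
library(rpart)

set.seed(1)  # make the random split reproducible
n <- nrow(iris)

# 60% of the rows for training, the remaining 40% for testing.
train_idx  <- sample(n, size = round(0.6 * n))
iris_train <- iris[train_idx, ]
iris_test  <- iris[-train_idx, ]

# Classification tree for the three-level Species outcome.
tree_fit <- rpart(Species ~ ., data = iris_train, method = "class")

# Predicted classes on the held-out data, a confusion matrix and accuracy.
pred     <- predict(tree_fit, newdata = iris_test, type = "class")
conf_mat <- table(predicted = pred, actual = iris_test$Species)
accuracy <- mean(pred == iris_test$Species)
print(conf_mat)
print(accuracy)
```

# A stratified split (for example with caret::createDataPartition(iris$Species, p = 0.6))
# would keep the three species balanced across the training and testing sets.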
# ------------------------------------------------------------------
# DAY 12 EXERCISES - LOGISTIC REGRESSION
# ------------------------------------------------------------------
# ------------------------------------------------------------------
# EXERCISE 1
# Create a parsimonious model for the myopia data. Does its performance differ
# substantially from the full model?
# ------------------------------------------------------------------
# ------------------------------------------------------------------
# EXERCISE 3
# Use the birthwt data in the MASS package to construct a model for low birth
# weight. Are there any features which should be excluded from the model?
# ------------------------------------------------------------------
library(MASS)
library(caret)
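# One possible starting point for Exercise 3, offered as a sketch rather than
# the intended solution. Assumptions: a plain logistic regression via glm() is
# acceptable, and race is recoded as a factor using the labels given in the
# MASS documentation. The names bw and fit are illustrative.

```r
library(MASS)

bw <- birthwt

# race is stored as the codes 1, 2, 3; treat it as categorical, not numeric.
bw$race <- factor(bw$race, labels = c("white", "black", "other"))

# `low` is an indicator of birth weight below 2.5 kg, so it is derived directly
# from `bwt` (birth weight in grams). Including bwt would leak the response
# into the predictors, so it is excluded from the model.
fit <- glm(low ~ . - bwt, data = bw, family = binomial)
summary(fit)
```

# The coefficient table from summary(fit) is one place to look for further
# features that contribute little and could be dropped.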
##############################################################
# DAY 11: LINEAR REGRESSION EXERCISES
##############################################################
# 1) Height and Mass. Scrape the height and mass data from here.
# ----------------------------------------------------------------------------
library(rvest)
# =====================================================================================================================
# OUTLIERS
# =====================================================================================================================
library(dplyr)
library(corrgram)
# Focus our attention on a subset of the baseball data.
#
baseball = select(baseball, Name, Atbatc:Walksc)