bank= read.csv("bank-additional-full.csv", header=TRUE)
dim(bank)
head(bank)
# split the dataset into training and validation sets
train.prop= .75 # use 75% of the rows for training
train.cases= sample(nrow(bank), nrow(bank)*train.prop)
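The random 75/25 split above can also be sketched in Python, the other language used in these snippets. This is a minimal illustration rather than the author's code: the function name `train_validation_split`, the `seed` argument, and the row count of 1000 are all assumptions made for the example.

```python
import random


def train_validation_split(n_rows, train_prop=0.75, seed=None):
    """Return (train_idx, valid_idx) for a random split of row indices.

    Hypothetical helper: it mirrors the idea of sampling
    n_rows * train_prop row numbers without replacement.
    """
    rng = random.Random(seed)
    n_train = int(n_rows * train_prop)  # whole number of training rows
    train_idx = sorted(rng.sample(range(n_rows), n_train))
    train_set = set(train_idx)
    # Every row not drawn for training goes to the validation set.
    valid_idx = [i for i in range(n_rows) if i not in train_set]
    return train_idx, valid_idx


# Assumed row count, for illustration only.
train_idx, valid_idx = train_validation_split(1000, train_prop=0.75, seed=1)
print(len(train_idx), len(valid_idx))
```

Fixing the seed makes the split reproducible, which helps when comparing models on the same validation rows.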
crosstab<- read.csv("CrossTab_Example.csv", header= TRUE)
View(crosstab)
accountsize<- table(crosstab[,2]) # frequency table of account size
dimnames(accountsize)<- list(c("small", "medium","large"))
dim(crosstab) # check the data has the expected number of rows and columns
library(gmodels)
cbc.df<- read.csv("http://goo.gl/5xQObB")
View(cbc.df)
import glob
import codecs
import numpy
from pandas import DataFrame
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.pipeline import Pipeline
# sklearn.cross_validation was removed in newer scikit-learn releases;
# KFold now lives in sklearn.model_selection (its constructor arguments differ).
from sklearn.model_selection import KFold
from sklearn.metrics import confusion_matrix, f1_score
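The `KFold` import suggests the classifier is evaluated with k-fold cross-validation. As a library-free sketch of what such a splitter produces, the hypothetical helper `k_fold_indices` below yields contiguous, unshuffled train/test index lists. It illustrates the idea only and is not scikit-learn's implementation.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs for contiguous, unshuffled folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        # Everything outside the current test block is used for training.
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each sample lands in exactly one test fold, so scores averaged over the folds use every sample for testing once.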
import networkx
from operator import itemgetter
import matplotlib.pyplot
# read the data from amazon-books.txt;
# populate the amazonBooks nested dictionary;
# key = ASIN; value = metadata associated with that ASIN
fhr = open('./amazon-books.txt', 'r', encoding='utf-8', errors='ignore')
amazonBooks = {}
fhr.readline()
## What is dplyr?
# dplyr is a grammar that makes data manipulation quick and easy.
# Install the dplyr library
install.packages('dplyr')
# Load the dplyr library
library(dplyr)
### ggplot2 is the most commonly used graphics package in R. It is very flexible,
### supports a wide variety of graphs, and has an easy-to-understand grammar.
# Install the ggplot2 library and load it.
install.packages("ggplot2")
library(ggplot2)
# Assign the diamonds data
diamond=diamonds
# Define 2^2 test matrix
FullFact.2.2 <- function() {
  a <- c(-1, 1, -1, 1)
  b <- c(-1, -1, 1, 1)
  ab <- a * b
  df <- data.frame(a, b, ab)
  rownames(df) <- c("(1)", "a", "b", "ab")
  return(df)
}
(ff22 <- FullFact.2.2())
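For comparison, the same 2^2 design matrix can be sketched in Python. The function name `full_factorial_2_2` and the dictionary layout are choices made for this example only; the run labels and the -1/+1 coding follow the R function above.

```python
def full_factorial_2_2():
    """Build the 2^2 full factorial design in standard order."""
    a = [-1, 1, -1, 1]
    b = [-1, -1, 1, 1]
    # The interaction column is the element-wise product of a and b.
    ab = [x * y for x, y in zip(a, b)]
    labels = ["(1)", "a", "b", "ab"]
    return {
        label: {"a": ai, "b": bi, "ab": abi}
        for label, ai, bi, abi in zip(labels, a, b, ab)
    }


design = full_factorial_2_2()
for label, row in design.items():
    print(label, row)

# Orthogonality check: the a and b columns have a zero dot product.
dot_ab = sum(row["a"] * row["b"] for row in design.values())
print(dot_ab)
```

The zero dot product reflects that the two factor columns are orthogonal, which is what lets the main effects be estimated independently of each other.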
salarydata<- read.csv("Salary_Data.csv", header = TRUE)
Salary= salarydata[,2]
Age= salarydata[,3]
Genderfactor= as.factor(salarydata[,4]) # convert gender to a discrete (factor) variable
# main effects only
res1= lm(Salary~Age+ Genderfactor)
# main effects plus the Age-by-Gender interaction
res2= lm(Salary~Age+ Genderfactor+ Age:Genderfactor)
summary(res2)
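To make the interaction term concrete, the Python sketch below builds the design-matrix row that a model of the form Salary ~ Age + Gender + Age:Gender uses for one observation. It assumes a two-level gender variable coded with a 0/1 indicator; the labels "F" and "M", the choice of "F" as baseline, and the ages are hypothetical values, not taken from Salary_Data.csv.

```python
def design_row(age, gender, baseline="F"):
    """Return [intercept, age, gender indicator, age * indicator].

    Assumption: gender has two levels and `baseline` is the reference level.
    """
    indicator = 0 if gender == baseline else 1
    # The interaction column is the product of age and the indicator, so it
    # is zero for the baseline group and equal to age for the other group.
    return [1.0, float(age), float(indicator), float(age * indicator)]


# Hypothetical observations, for illustration only.
rows = [design_row(30, "F"), design_row(45, "M")]
for row in rows:
    print(row)
```

Under this coding, the coefficient on the last column measures how much the slope of salary on age differs between the two groups.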