@Awuor87
Awuor87 / Predictive Modeling in R
Last active May 14, 2018 07:17
Predictive Modeling Code in R
bank= read.csv("bank-additional-full.csv", header=TRUE)
dim(bank)
head(bank)
# split the dataset into training and validation sets
train.prop= .75 # set the training dataset at 75%
train.cases= sample(nrow(bank), nrow(bank)*train.prop)
@Awuor87
Awuor87 / Data Science in R
Created April 4, 2017 17:23
Predictive Modeling in R
bank= read.csv("bank-additional-full.csv", header=TRUE)
dim(bank)
head(bank)
# split the dataset into training and validation sets
train.prop= .75 # set the training dataset at 75%
train.cases= sample(nrow(bank), nrow(bank)*train.prop)
@Awuor87
Awuor87 / Crosstabs in R
Created April 4, 2017 16:05
Sample Code for Crosstabs in R
crosstab<- read.csv("CrossTab_Example.csv", header= TRUE)
View(crosstab)
accountsize<- table(crosstab[,2]) # accountsize
dimnames(accountsize)<- list(c("small", "medium","large"))
dim(crosstab) # check the data has expected number of rows and columns
library(gmodels)
cbc.df<- read.csv("http://goo.gl/5xQObB")
View(cbc.df)
@Awuor87
Awuor87 / Data Science in Python
Created April 4, 2017 15:35
Data Science in Python
import glob
import codecs
import numpy
from pandas import DataFrame
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.pipeline import Pipeline
from sklearn.model_selection import KFold  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn.metrics import confusion_matrix, f1_score
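The preview stops at the imports, so the following is only a minimal sketch of how these pieces are commonly combined: a bag-of-words pipeline feeding a Naive Bayes classifier, scored with k-fold cross-validation. The toy corpus, the "ham"/"spam" labels, the step names, and the fold count are all assumptions made for illustration, not the gist's actual data or settings. It also assumes a scikit-learn version in which KFold lives in sklearn.model_selection.

```python
import numpy
from pandas import DataFrame
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline
from sklearn.model_selection import KFold
from sklearn.metrics import confusion_matrix, f1_score

# Hypothetical toy corpus (assumption): the gist's real input files are not shown.
data = DataFrame({
    "text": [
        "meeting moved to monday morning",
        "win a free prize now",
        "please review the attached report",
        "free cash offer click now",
        "lunch with the project team",
        "claim your free prize today",
        "notes from the budget meeting",
        "cheap offer win cash now",
    ],
    "label": ["ham", "spam", "ham", "spam", "ham", "spam", "ham", "spam"],
})

# Token counts -> TF-IDF weights -> multinomial Naive Bayes.
pipeline = Pipeline([
    ("counts", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("classifier", MultinomialNB()),
])

# Two unshuffled folds; the alternating labels keep both classes in each fold.
folds = KFold(n_splits=2)
confusion = numpy.zeros((2, 2), dtype=int)
scores = []

for train_idx, test_idx in folds.split(data):
    train, test = data.iloc[train_idx], data.iloc[test_idx]
    pipeline.fit(train["text"].values, train["label"].values)
    predicted = pipeline.predict(test["text"].values)
    # Accumulate per-fold results so every row is scored exactly once.
    confusion += confusion_matrix(test["label"].values, predicted,
                                  labels=["ham", "spam"])
    scores.append(f1_score(test["label"].values, predicted, pos_label="spam"))

print("Rows classified:", confusion.sum())
print("Mean F1 (spam):", sum(scores) / len(scores))
print("Confusion matrix (rows = true, columns = predicted):")
print(confusion)
```

Wrapping the vectorizer and classifier in one Pipeline means the vocabulary is refit on each training fold, so no information from the held-out fold leaks into training.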
@Awuor87
Awuor87 / Recommendation Engines in Python
Created April 4, 2017 15:32
Building a Recommendation Engine in Python
import networkx
from operator import itemgetter
import matplotlib.pyplot
# read the data from amazon-books.txt;
# populate the amazonBooks nested dictionary;
# key = ASIN; value = metadata associated with that ASIN
fhr = open('./amazon-books.txt', 'r', encoding='utf-8', errors='ignore')
amazonBooks = {}
fhr.readline()
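The preview ends right after the header row is skipped. As a rough sketch of the step the comments describe (one nested dictionary keyed by ASIN), the loop below parses each remaining line into a metadata dictionary. The column layout of amazon-books.txt is not shown, so the tab-separated Id/ASIN/Title/SalesRank columns, the sample rows, and the in-memory stand-in for the file are all hypothetical.

```python
import io

# Hypothetical stand-in for ./amazon-books.txt (assumption): the real file's
# columns are not visible in the preview, so this layout is made up.
fhr = io.StringIO(
    "Id\tASIN\tTitle\tSalesRank\n"
    "1\tB000000001\tExample Book A\t1200\n"
    "2\tB000000002\tExample Book B\t87\n"
    "malformed line without enough columns\n"
)

amazonBooks = {}
fhr.readline()  # skip the header row, as the preview does

for line in fhr:
    cells = line.rstrip("\n").split("\t")
    if len(cells) != 4:
        continue  # ignore rows that do not match the assumed layout
    book_id, asin, title, sales_rank = cells
    # Outer key is the ASIN; the inner dictionary holds that book's metadata.
    amazonBooks[asin] = {
        "Id": book_id,
        "Title": title.strip(),
        "SalesRank": int(sales_rank),
    }

fhr.close()
print(len(amazonBooks), "books loaded")
print(amazonBooks["B000000002"])
```

Keying by ASIN gives constant-time lookup of a book's metadata, which is convenient when the same identifiers are later used as node names in a graph.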
@Awuor87
Awuor87 / dplyr package in R
Created April 4, 2017 15:22
Sample code for dplyr package
## What is dplyr?
# dplyr provides a grammar of data manipulation: a small set of verbs
# (filter, select, arrange, mutate, summarise) for working with data frames.
# Install the dplyr library
install.packages('dplyr')
# load the dplyr library
library(dplyr)
@Awuor87
Awuor87 / Visualiation in R
Last active April 4, 2017 17:30
Sample code for ggplot in R
### ggplot2 is one of the most commonly used graphics packages in R. It is flexible,
### supports a wide variety of graphs, and has an easy-to-understand grammar.
# Install the ggplot2 library and load it.
install.packages("ggplot2")
library(ggplot2)
# Assign the diamonds data
diamond=diamonds
@Awuor87
Awuor87 / Loops in R
Created April 4, 2017 15:13
Sample Loop Code in R
# Define 2^2 test matrix
FullFact.2.2 <- function() {
a <- c(-1,1,-1,1)
b <- c(-1,-1,1,1)
ab <- a *b
df <- data.frame(a,b, ab)
rownames(df)<- c("(1)", "a", "b", "ab")
return(df)
}
(ff22<- FullFact.2.2())
@Awuor87
Awuor87 / Linear Regression Cntd
Last active April 4, 2017 17:30
Sample Code for Linear Regression in R
salarydata<- read.csv ("Salary_Data.csv", header = TRUE)
Salary= salarydata[,2]
Age= salarydata[,3]
Genderfactor= as.factor(salarydata[,4]) # convert the column to a factor (categorical variable)
resl= lm(Salary~Age+ Genderfactor)
# interaction
res2= lm(Salary~Age+ Genderfactor+ Age: Genderfactor)
summary(res2)