@JoseRFJuniorLLMs
Created January 27, 2018 05:47
# Data Preprocessing

# Importing the dataset (expects Data.csv in the working directory)
dataset <- read.csv('Data.csv')

# Taking care of missing data: replace each NA in Age and Salary with the
# mean of that column's non-missing values
dataset$Age <- ifelse(is.na(dataset$Age),
                      ave(dataset$Age, FUN = function(x) mean(x, na.rm = TRUE)),
                      dataset$Age)
dataset$Salary <- ifelse(is.na(dataset$Salary),
                         ave(dataset$Salary, FUN = function(x) mean(x, na.rm = TRUE)),
                         dataset$Salary)
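
# A possible generalization of the two imputation steps above, shown only as
# a sketch: impute_mean() is a hypothetical helper (not part of the original
# script) that replaces NA values in a numeric vector with the mean of the
# observed values. The toy vector is made up for illustration.
impute_mean <- function(x) {
  # mean() with na.rm = TRUE ignores the missing entries
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}
toy_age <- c(20, NA, 40)
toy_age_filled <- impute_mean(toy_age)  # the NA becomes 30, the mean of 20 and 40
# Under the same assumption, the Age step could be written as:
# dataset$Age <- impute_mean(dataset$Age)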

# Encoding categorical data: map each category to a numeric label
# (the result is still a factor, not a numeric column)
dataset$Country <- factor(dataset$Country,
                          levels = c('France', 'Spain', 'Germany'),
                          labels = c(1, 2, 3))
dataset$Purchased <- factor(dataset$Purchased,
                            levels = c('No', 'Yes'),
                            labels = c(0, 1))
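
# A note on the encoding above, sketched on made-up values: factor() returns
# a factor whose labels are text ("1", "2", "3"), not numbers, and any value
# missing from `levels` becomes NA. toy_country is a hypothetical vector used
# only for illustration.
toy_country <- c('France', 'Germany', 'Spain', 'Italy')
toy_encoded <- factor(toy_country,
                      levels = c('France', 'Spain', 'Germany'),
                      labels = c(1, 2, 3))
# Convert via character to recover the label values as numbers;
# 'Italy' is not a listed level, so it maps to NA
toy_numeric <- as.numeric(as.character(toy_encoded))
print(toy_numeric)  # 1 3 2 NA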