@MNoorFawi
Created May 29, 2020 21:33
One-Hot Encoding in R
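### one-hot encoding
vars <- colnames(data)
## one-hot encode factor, character, and logical columns
## (numeric columns are not touched here; normalize them separately if needed)
is_cat <- vapply(data[vars], function(x) is.factor(x) || is.character(x) || is.logical(x), logical(1))
cat_vars <- vars[is_cat]
## drop = FALSE keeps a data frame even when there is only one categorical column
data2 <- data[, cat_vars, drop = FALSE]
for (i in cat_vars) {
  ## distinct non-missing values of the column, compared as strings
  values <- as.character(data2[[i]])
  dict <- unique(values[!is.na(values)])
  for (key in dict) {
    ## 1 where the row equals key, 0 otherwise (rows with NA get 0 in every indicator)
    data2[[paste0(i, "_", key)]] <- 1.0 * (values %in% key)
  }
}
## to remove the original categorical variables from data
## (data2 still holds them alongside the new indicator columns)
# data[, which(colnames(data) %in% cat_vars)] <- NULL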
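
A minimal usage sketch, assuming a small toy data frame. The column names, the values, and the `one_hot` helper are hypothetical and only illustrate the loop above; they are not part of the original gist. The helper wraps the same per-value comparison and returns the non-categorical columns followed by one 0/1 column per distinct value.

```r
# Hypothetical toy data; names and values are assumptions for illustration.
data <- data.frame(
  color = c("red", "blue", "red"),
  size = factor(c("S", "M", "S")),
  price = c(1.5, 2.0, 3.25),
  stringsAsFactors = FALSE
)

# Hypothetical helper wrapping the same idea as the loop above.
one_hot <- function(df) {
  is_cat <- vapply(df, function(x) is.factor(x) || is.character(x) || is.logical(x), logical(1))
  out <- df[!is_cat]  # keep non-categorical columns unchanged
  for (i in names(df)[is_cat]) {
    values <- as.character(df[[i]])
    for (key in unique(values[!is.na(values)])) {
      # 1 where the row equals key, 0 otherwise
      out[[paste0(i, "_", key)]] <- 1.0 * (values %in% key)
    }
  }
  out
}

encoded <- one_hot(data)
print(names(encoded))
# columns: price, color_red, color_blue, size_S, size_M
```

Indicator columns appear in order of first occurrence of each value, so the column order can differ between data sets that contain the same categories.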