Last active
December 21, 2018 02:46
One-hot encodes a single variable and returns a data.table with the new indicator columns
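# A helper function to one-hot encode a _single_ variable
# Assumes data.table column or vector input
# Returns a data.table with one 0/1 column per level
# (one fewer when drop_ref = TRUE, which drops the first level seen)
library(data.table)

ohev <- function(vars, drop_ref = TRUE) {
  # Capture the argument expression to build column names (e.g. x -> x_a)
  vname <- deparse(substitute(vars))
  # Distinct values, in order of first appearance
  lvls <- unlist(unique(vars))
  tmp_list <- list()
  for (lev in lvls) {
    # Indicator column: 1 where the value equals this level, 0 otherwise
    tmp_list[[make.names(paste0(vname, '_', lev))]] <- as.numeric(vars == lev)
  }
  res <- as.data.table(tmp_list)
  if (drop_ref) {
    # Drop the first level's column so it serves as the reference category
    return(res[, -1, with = FALSE])
  } else {
    return(res)
  }
}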
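As a cross-check on what the helper produces, the same kind of encoding can be sketched with base R's `model.matrix`. This is a different technique from the loop in `ohev`, shown only for comparison; the variable `colour` and its values are made up for illustration, and the column names (`fred`, `fgreen`, ...) follow `model.matrix` conventions rather than the `vname_level` scheme used above.

```r
# Toy input; `colour` is a hypothetical variable used only for illustration.
colour <- c("red", "green", "red", "blue")

# Fix the level order to order of first appearance, mirroring unique() in ohev().
f <- factor(colour, levels = unique(colour))

# Full one-hot matrix: "- 1" removes the intercept so every level gets a column.
full <- model.matrix(~ f - 1)

# With an intercept, the default treatment contrasts omit the first level.
# Dropping the intercept column leaves the same shape as drop_ref = TRUE.
reduced <- model.matrix(~ f)[, -1, drop = FALSE]

print(full)
print(reduced)
```

Here `full` has one indicator column per level and `reduced` has one fewer, with the first level seen (`red`) acting as the reference category.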