@mrdwab
Last active December 25, 2015 18:19
The `factor` function in R does not handle duplicated levels gracefully, but there is a workaround: factor the vector first, then assign a named list with the `levels` replacement function. `Factor` is a wrapper that combines those two steps into one call.
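As a minimal sketch of the two-step workaround that the wrapper combines, using the same sample vector as the examples further down (the variable name `fac` is only illustrative):

```r
# Sample input: several spellings of "yes" and "no", plus an unmapped value.
x <- c("Y", "Y", "Yes", "N", "No", "H")

# Step 1: factor the vector as-is; every distinct value becomes a level.
fac <- factor(x)

# Step 2: assign a named list of levels. Each name is a new level and its
# values are the old levels merged into it. Old levels that are not listed
# (here "H") become NA.
levels(fac) <- list(Yes = c("Yes", "Y"), No = c("No", "N"))

fac
```

Printing `fac` should show the merged levels `Yes` and `No`, with the unmapped `"H"` reported as `<NA>`.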
#' Factor vectors with multiple values per level
#'
#' \code{\link{factor}} does not let you use duplicated levels nicely. It results
#' in an ugly warning message and you need to use \code{\link{droplevels}} to get
#' the desired output.
#'
#' The "solution" is to first factor the vector, and then use a named \code{list}
#' with the \code{\link{levels}} function. This function is a wrapper around
#' those steps.
#'
#' @param invec A \code{vector} that needs to be factored.
#' @param levels A named \code{list} of the levels. Each name is a level,
#' and its values are the input values that should be mapped to that level.
#' @param store Logical. Should the input values be stored as an attribute?
#' @param \dots Additional arguments to \code{factor}.
#' @return A factored variable with \code{class} of \code{factor} and
#' \code{Factor}, optionally with an \code{attribute} of \code{"Input"}
#' which stores the original input values.
#' @author Ananda Mahto
#' @seealso \code{\link{factor}}, \code{\link{levels}}
#' @references \url{http://stackoverflow.com/a/19410249/1270695}
#' @examples
#'
#' x <- c("Y", "Y", "Yes", "N", "No", "H")
#' Factor(x, list(Yes = c("Yes", "Y"), No = c("No", "N")))
#' Factor(x, list(Yes = c("Yes", "Y"), No = c("No", "N")), FALSE)
#'
#' @export Factor
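Factor <- function(invec, levels = list(), store = TRUE, ...) {
  # Step 1: factor the input as-is.
  Fac <- factor(invec, ...)
  # Step 2: merge levels via the named list; unlisted values become NA.
  # With an empty list there is nothing to map, so the levels are left alone.
  if (length(levels) > 0) levels(Fac) <- levels
  # Optionally keep the original values so that Restore() can return them.
  if (isTRUE(store)) attr(Fac, "Input") <- invec
  class(Fac) <- c("Factor", class(Fac))
  Fac
}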
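# Print method: show the stored input (if any), then the factor itself.
print.Factor <- function(x, ...) {
  input <- attr(x, "Input")
  if (!is.null(input)) {
    cat("Input values:\n")
    print(input)
    cat("\n")
  }
  cat("Factored output:\n")
  # Drop the stored input on a copy so it is not printed with the factor.
  out <- x
  attr(out, "Input") <- NULL
  print.factor(out, ...)
  # Return the original object invisibly, as print methods conventionally do.
  invisible(x)
}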
# Return the original input values stored by Factor(..., store = TRUE).
Restore <- function(invec) {
  if (!inherits(invec, "Factor")) stop("Wrong class of input.")
  if (is.null(attr(invec, "Input"))) stop("No attribute named 'Input' found.")
  attr(invec, "Input")
}

mrdwab commented Oct 17, 2013

Example usage:

x <- c("Y", "Y", "Yes", "N", "No", "H")

y1 <- Factor(x, list(Yes = c("Yes", "Y"), No = c("No", "N")), FALSE)
y1
# Factored output:
# [1] Yes  Yes  Yes  No   No   <NA>
# Levels: Yes No
Restore(y1)
# Error in Restore(y1) : No attribute named 'Input' found.

y2 <- Factor(x, list(Yes = c("Yes", "Y"), No = c("No", "N")), TRUE)
y2
# Input values:
# [1] "Y"   "Y"   "Yes" "N"   "No"  "H"  
# 
# Factored output:
# [1] Yes  Yes  Yes  No   No   <NA>
# Levels: Yes No
Restore(y2)
# [1] "Y"   "Y"   "Yes" "N"   "No"  "H" 
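
For comparison, a hedged sketch of an alternative that does not need the wrapper: as far as I know, since R 3.4.0 `factor` accepts duplicated `labels` (though still not duplicated `levels`) and merges them. This assumes a sufficiently recent R version and is not part of the gist's own code.

```r
# Assumes R >= 3.4.0, where duplicated labels are merged into one level.
x <- c("Y", "Y", "Yes", "N", "No", "H")

# List every accepted input value once in `levels`, and give the values
# that belong together the same entry in `labels`.
fac <- factor(x,
              levels = c("Yes", "Y", "No", "N"),
              labels = c("Yes", "Yes", "No", "No"))

fac
```

Values missing from `levels` (here `"H"`) become `NA`, as in the examples above. Unlike `Factor`, this approach does not store the original input values.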
