@timcdlucas
Last active July 23, 2019 07:43
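library(dplyr)

# What if a description such as 'TOYOTA PRADO' is not in the lookup set?
# Substring matching still assigns it to 'Toyota'; descriptions with no
# match at all (e.g. 'NISSAN') get NA.
test <- c('ROVER', 'CRUISER', 'TOYOTA', 'TOYOTA PRADO', 'NISSAN')

# Lookup table: pattern to search for (old) and the make it maps to (new).
old <- c('ROVER', 'CRUISER', 'TOYOTA')
new <- c('LandRover', 'LandRover', 'Toyota')
lookup <- data.frame(old, new, stringsAsFactors = FALSE)

# Example data: 30 descriptions sampled at random from `test`.
data <- data.frame(DESCRIPTION = sample(test, 30, replace = TRUE),
                   stringsAsFactors = FALSE)

# Return the row index of the first lookup pattern found in x, or NA if none.
categorise <- function(x) {
  i <- which(sapply(lookup$old, function(y) grepl(y, x)))
  if (length(i) == 0) return(NA_integer_)
  i[[1]]  # keep only the first match so the result is always length 1
}

data <- data %>%
  mutate(make = lookup$new[vapply(DESCRIPTION, categorise, integer(1))])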