@Oreotrephes
Last active March 16, 2019 16:36
library(dplyr)
#Goal - collapse information from duplicate rows into one row per name
name<-c("John","John","Frank","Frank","Sue","Sue","Bob")
rank<-c("","sergeant",""," ","lieutenant","major","")
force<-c("army","","navy","navy","army","army","navy")
testdf<-data.frame(name,rank,force)
testdf
#Test 1 dplyr with normal paste
output1 <- testdf %>%
  group_by(name) %>%
  summarize_all(paste, collapse=", ")
#Test 1 - it did what we want: John's rows are collapsed, but we'd rather not see duplicates and orphaned separators
output1
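#Define a conditional paste function
#Note: only the first two distinct values are checked for blanks
pastefun <- function(x){
  #Drop repeated values first, so x[1] and x[2] always differ
  x<-unique(x)
  if(is.na(x[2])) {
    #Only one distinct value
    x[1]
  } else if(x[1]==x[2]) {
    #Cannot be TRUE after unique(); kept as a safeguard
    x[1]
  } else if(x[2]=="") {
    #Second value is blank, keep the first
    x[1]
  } else if(x[1]=="") {
    #First value is blank, keep the second
    x[2]
  } else {
    paste(x,collapse=", ")
  }
}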
#test it on various inputs
test1<-"major"
test2<-c("major")
test3<-c("major","")
test4<-c("major","major")
test5<-c("major","lieutenant")
test6<-c("major","major","lieutenant")
#these all work
pastefun(test1)
pastefun(test2)
pastefun(test3)
pastefun(test4)
pastefun(test5)
pastefun(test6)
#this works too, although I had expected it to fail
#as.character() turns any factor columns into strings before pastefun compares them
output2 <- testdf %>%
  group_by(name) %>%
  mutate_all(as.character) %>%
  summarize_all(pastefun)
output2
@tylerritchie

pastefun <- function(x){
  x<-unique(x)
  if(is.na(x[2])) {
    x[1]
  } else if(x[2]=="") {
    x[1]
  } else if(x[1]==x[2]) {
    x[1]
  } else {
    paste(x,collapse=", ")
  } 
}
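
Both versions of `pastefun` above only inspect the first two distinct values, so a group with three or more distinct entries can lose a value or keep a trailing separator. One possible generalization is sketched below. The name `collapse_unique` and the `sep` argument are hypothetical and do not come from the gist, and the choice to treat whitespace-only strings such as `" "` as blank is an assumption (the original `pastefun` returns `" "` for Frank's rank, where this sketch returns `""`).

```r
# Hypothetical helper, not part of the original gist.
# Collapses any number of values into one string, skipping blanks and repeats.
collapse_unique <- function(x, sep = ", ") {
  # Work on trimmed strings so factors and " " are handled like "".
  x <- trimws(as.character(x))
  # Drop NA and empty strings, then remove repeated values.
  x <- unique(x[!is.na(x) & x != ""])
  # Return "" when nothing is left, otherwise join what remains.
  if (length(x) == 0) "" else paste(x, collapse = sep)
}

collapse_unique(c("", "sergeant"))                  # "sergeant"
collapse_unique(c("major", "major", "lieutenant"))  # "major, lieutenant"
collapse_unique(c("", "a", "b"))                    # "a, b"
```

Assuming the same `testdf`, it should drop into the pipeline in place of `pastefun`, for example `testdf %>% group_by(name) %>% summarize_all(collapse_unique)`, and the `mutate_all(as.character)` step is then no longer needed because the conversion happens inside the helper.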
