@aammd
Created April 4, 2016 04:38
how to filter out rarities in R
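library(dplyr)
## toy data: group "a" has 5 rows, group "c" has only 2
## (tibble() replaces the deprecated data_frame())
d <- tibble(A = c(rep("a", 5), rep("c", 2)))
d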
## one way to do it: keep only groups with more than 2 rows
d %>%
  group_by(A) %>%
  filter(n() > 2)
## longer, but easier on the eyes.
rare_groups <- d %>%
  group_by(A) %>%
  tally() %>%
  filter(n <= 2)
## examine rare_groups to see which groups count as "rare"
## remove them from the original data with an "anti join":
d %>%
  anti_join(rare_groups, by = "A")
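
The two approaches above can also be collapsed into a single pipeline. The following is a minimal sketch, not part of the original gist: it assumes a dplyr version that provides `add_count()`, and the names `common` and `common_base` are made up for illustration. A base R variant using `ave()` is included for comparison.

```r
library(dplyr)

# same toy data as above: 5 rows of "a", 2 rows of "c"
d <- tibble(A = c(rep("a", 5), rep("c", 2)))

# add_count() attaches each group's size as a column n without collapsing rows,
# so rare groups can be dropped with a plain filter() and n removed afterwards
common <- d %>%
  add_count(A) %>%
  filter(n > 2) %>%
  select(-n)

# base R sketch: ave() returns the group size for every row of d
group_size <- ave(seq_along(d$A), d$A, FUN = length)
common_base <- d[group_size > 2, ]
```

Both `common` and `common_base` should contain only the rows of the frequent group. The threshold `2` is the same cutoff used in the examples above and can be changed as needed.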