Last active
June 7, 2016 00:16
Simpler logical subsetting by strings with `grep`
```{r, echo=FALSE}
library(magrittr)
```
# `%grep%` operator
`R`'s built-in `grep` function is awkward for interactive work: by default it returns match positions rather than a logical vector.
The convenient `%in%` operator already tests membership in a vector, but it only finds exact matches.
Real data analysis rarely presents a well-defined set of values.
Free-form strings are much more common.
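## Implementation
```{r}
# Pattern on the left, character vector to search on the right
`%grep%` <- function(pattern, x) grepl(pattern, x)
```
That's all.
The result is a logical vector with one element per element of `x`.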
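To see what the wrapper changes, it may help to compare it with the base functions directly. The sketch below uses a made-up vector, `fruits`, which is not part of the original example. `grep` returns integer positions, while `grepl` (and therefore `%grep%`) returns a logical vector that can be passed straight to `[` or combined with `&` and `!`.

```{r}
# Re-defined here so the chunk runs on its own
`%grep%` <- function(pattern, x) grepl(pattern, x)

# Hypothetical example vector, for illustration only
fruits <- c('apple', 'banana', 'cherry')

grep('an', fruits)    # integer positions of the matching elements
grepl('an', fruits)   # logical vector, one element per input string
'an' %grep% fruits    # same logical vector as grepl()
```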
## Usage
The `%grep%` operator reads much like `%in%`, but the pattern on the left is looked up within each string on the right, and the result has one element per string.
Direct string matching:
```{r}
# Select 'setosa' entries
iris['setosa' %grep% iris$Species, ] %>% head
```
Regular expressions:
```{r}
# Select all entries where species ends with 'a'
X <- iris['a$' %grep% iris$Species, ]
unique(X$Species)
```
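One difference from `%in%` may be worth keeping in mind: the operands play different roles. The sketch below, which introduces a helper variable `species` purely for illustration, contrasts the two operators on the same data. `%in%` answers once per element on its left, whereas `%grep%` answers once per element on its right, and only `%grep%` accepts partial matches.

```{r}
# Re-defined here so the chunk runs on its own
`%grep%` <- function(pattern, x) grepl(pattern, x)

# Illustrative helper: the species column as plain strings
species <- as.character(iris$Species)

# %in% asks "is the left value an element of the right vector?": one answer
'setosa' %in% species
# %grep% asks "which elements on the right match the pattern?": one answer per row
length('setosa' %grep% species)
# A partial string is not an element of the vector ...
'set' %in% species
# ... but it does match some of the strings
any('set' %grep% species)
```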
## More complicated example
Because the result is a logical vector, matches can be combined into more complicated selections.
I always forget how to write a regular expression for "`one` but not `two`".
With the `%grep%` operator, we can use `R`'s logical operators instead.
Let's create some sample data:
```{r}
# Create a toy data set: each `fb` entry is one of
# 'foo foo', 'foo bar', 'bar foo' or 'bar bar'
X <- data.frame(
  value = rnorm(100),
  fb = paste(
    sample(c('foo', 'bar'), 100, replace=TRUE),
    sample(c('foo', 'bar'), 100, replace=TRUE)))
X %>% head
```
And apply `%grep%`:
```{r}
# Select foo without bar (same as `foo foo`)
X['foo' %grep% X$fb & !'bar' %grep% X$fb, ] %>% head
```
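For comparison, here is one way the same "`foo` but not `bar`" selection could be written as a single regular expression. This is only a sketch: it relies on a Perl-style negative lookahead (`perl = TRUE` in `grepl`), and the vector `fb` is a hypothetical stand-in covering the four possible combinations. It suggests why composing two simple patterns with `&` and `!` is easier to remember.

```{r}
# Re-defined here so the chunk runs on its own
`%grep%` <- function(pattern, x) grepl(pattern, x)

# Hypothetical strings covering the four combinations
fb <- c('foo foo', 'foo bar', 'bar foo', 'bar bar')

# Logical composition, as in the example above
by_logic <- 'foo' %grep% fb & !'bar' %grep% fb

# Single pattern: the negative lookahead rejects any string containing 'bar',
# then the rest of the pattern requires 'foo' somewhere in the string
by_regex <- grepl('^(?!.*bar).*foo', fb, perl = TRUE)

# Check that both approaches select the same elements
identical(by_logic, by_regex)
```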