@tmastny
Created June 7, 2018 20:33
---
title: "Standard Non-Standard Evaluation"
output: html_notebook
---
```{r}
library(tidyverse)
```
```{r}
mtcars
```
```{r}
mtcars %>%
summarise(avg = mean(mpg))
```
```{r}
mtcars %>%
summarise(avg = mean(hp))
```
```{r}
mtcars %>%
summarise(avg = mean(wt))
```
## Hadley's Rule
Repeat yourself three times? Time for a function
```{r}
# First attempt: pass the column like any other argument
meaner <- function(data, column) {
  data %>%
    summarise(avg = mean(column))
}
```
```{r}
meaner(mtcars, hp)
```
```{r}
hp
```
```{r}
meaner(mtcars, "hp")
```
```{r}
mtcars %>%
summarise(avg = mean("hp"))
```
```{r}
meaner2 <- function(data, column) {
  # enquo() captures what the user typed for `column`
  column <- enquo(column)
  data %>%
    # !! inserts the captured name into the call
    summarise(avg = mean(!!column))
}
```
```{r}
meaner2(mtcars, hp)
```
## Enquo Magic!!!
`!!` and `enquo` magic worked!
Why?
## Names and Variables
```{r}
foo <- 4
```
The name of the variable is `foo`, and its value is `4`.
But what about `hp`?
```{r}
mtcars %>%
summarise(avg = mean(hp))
```
`hp` is not a variable in our environment. On its own, it has no value.
```{r}
hp
```
But `hp` has meaning in the context of `mtcars`:
```{r}
with(mtcars, mean(hp))
```
```{r}
mean(mtcars$hp)
```
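Base R lets us see this mechanism directly. The sketch below is only an illustration, not how dplyr is implemented: `quote()` stores the expression without running it, and `eval()` runs it with `mtcars` as the place to look up names. `expr_hp` and `in_context` are names made up for this example.

```{r}
# quote() captures the expression without evaluating it
expr_hp <- quote(mean(hp))

# eval() looks up `hp` inside mtcars because the data frame is
# supplied as the evaluation context
in_context <- eval(expr_hp, mtcars)

# Same result as spelling out the column explicitly
all.equal(in_context, mean(mtcars$hp))
```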
So in the tidyverse, names have special meaning.
So why doesn't this work?
```{r}
meaner <- function(data, column) {
  data %>%
    summarise(avg = mean(column))
}
```
```{r}
meaner(mtcars, hp)
```
Because `summarise` sees the *name* `column`, not the `hp` the user typed, and `mtcars` has no column named `column`.
Tidyverse functions work with the *name* you type, not the *value* behind it.
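To look at the failure without stopping the notebook, we can wrap the call in `tryCatch()`. This is a minimal sketch that assumes no object called `hp` exists in your global environment; `result` is a name made up for this example.

```{r}
library(dplyr)

meaner <- function(data, column) {
  data %>%
    summarise(avg = mean(column))
}

# tryCatch() turns the failure into a value we can inspect
result <- tryCatch(
  meaner(mtcars, hp),
  error = function(e) e
)

# TRUE: the call failed instead of returning a summary
inherits(result, "error")
```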
```{r}
meaner2 <- function(data, column) {
  column <- enquo(column)
  data %>%
    summarise(avg = mean(!!column))
}
```
```{r}
meaner2(mtcars, hp)
```
`enquo` captures the *name* typed in by the user: `hp`.
`!!` tells tidyverse functions to use the *value* of `column` (which is the name `hp`) instead of the name `column`.
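One way to see what `enquo` hands back is to return it from a function instead of using it. In this sketch, `capture` and `captured` are hypothetical names, and the `rlang::` helpers are only used to peek inside the result.

```{r}
library(dplyr)

# Hypothetical helper: return the captured argument instead of using it
capture <- function(column) {
  enquo(column)
}

captured <- capture(hp)

# A quosure: the expression the user typed plus where it came from
captured

# The expression inside is the name `hp`, not a value
rlang::quo_get_expr(captured)
is.name(rlang::quo_get_expr(captured))
```

Notice that `capture(hp)` works even though `hp` does not exist as a variable: nothing is evaluated until a tidyverse function receives the name through `!!`.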
## Why is it so complicated???
* Most other programming languages don't use non-standard evaluation:
> Zen of Python: Explicit is better than implicit.
* Programmers are taught to use variable *values*, not *names*.
## What are the benefits?
* Less typing during interactive programming
```{r}
mtcars %>%
filter(cyl == 6, mpg > 20, am == 1)
```
vs.
```{r}
mtcars[mtcars$cyl == 6 & mtcars$mpg > 20 & mtcars$am == 1, ]
```
* Can use complicated expressions in functions
```{r}
mtcars %>%
  # Standardize: subtract the mean first, then divide by the standard deviation
  transmute(standardized = (hp - mean(hp)) / sd(hp))
```
```{r}
standardizer <- function(data, col) {
  col <- enquo(col)
  data %>%
    # Subtract the mean first, then divide by the standard deviation
    transmute(standardized = (!!col - mean(!!col)) / sd(!!col))
}
```
```{r}
standardizer(mtcars, hp)
```
* Can use `!!` + `enquo` like normal variables:
    - multiply, divide
    - built-in functions
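As a small illustration of the arithmetic case, here is a sketch that divides one captured column by another. `ratio`, `num`, `den`, and `hp_per_wt` are hypothetical names for this example. The parentheses around each `!!` are optional in current rlang, but they make the precedence explicit.

```{r}
library(dplyr)

# Hypothetical helper: divide one column by another
ratio <- function(data, num, den) {
  num <- enquo(num)
  den <- enquo(den)
  data %>%
    transmute(ratio = (!!num) / (!!den))
}

hp_per_wt <- ratio(mtcars, hp, wt)
head(hp_per_wt)
```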
## What about strings?
I want this:
```{r}
mtcars %>%
group_by(cyl) %>%
summarise(count = n())
```
By using a string:
```{r}
column <- "cyl"
mtcars %>%
group_by(column) %>%
summarise(count = n())
```
Same problem as before:
- tidyverse uses the *name*, not the *value*
- there is no column with the *name* `column`
Does `enquo` + `!!` work?
```{r}
enquoed_column <- enquo(column)
mtcars %>%
  # Unquote the result of enquo(); this still does not group by cyl
  group_by(!!enquoed_column) %>%
  summarise(count = n())
```
No.
`enquo` only works for *names*.
We want the *value* `"cyl"` to be converted to a *name*.
Introducing: `sym`
```{r}
symed_column <- sym(column)
mtcars %>%
group_by(!!symed_column) %>%
summarise(count = n())
```
`!!` stays the same.
We use `sym` instead of `enquo` if we want to turn a *value* into a *name*.
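To make the difference between the string and the name concrete, this short sketch checks the types before and after `sym`. It reuses the `column` and `symed_column` names from above.

```{r}
library(dplyr)

column <- "cyl"
symed_column <- sym(column)

# A string is a value; sym() turns it into a name (a symbol)
is.character(column)
is.name(symed_column)

# The result is the same object you would get by typing the name directly
identical(symed_column, quote(cyl))
```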
We can also put this into a function:
```{r}
grouper <- function(data, col) {
  # sym() turns the string the user passed into a name
  col <- sym(col)
  data %>%
    group_by(!!col) %>%
    summarise(count = n())
}
```
```{r}
grouper(mtcars, "cyl")
```
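As a quick sanity check, the string version should give the same counts as typing the column name directly. The sketch repeats the `grouper` definition so it runs on its own; `direct` and `via_string` are names made up for this example.

```{r}
library(dplyr)

grouper <- function(data, col) {
  col <- sym(col)
  data %>%
    group_by(!!col) %>%
    summarise(count = n())
}

# Typing the column name directly
direct <- mtcars %>%
  group_by(cyl) %>%
  summarise(count = n())

# Passing the column name as a string
via_string <- grouper(mtcars, "cyl")

identical(direct$count, via_string$count)
```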
## Any questions?
### Any functions you want to see?