Created
June 7, 2018 20:33
---
title: "Standard Non-Standard Evaluation"
output: html_notebook
---

```{r}
library(tidyverse)
```

```{r}
mtcars
```

```{r}
mtcars %>%
  summarise(avg = mean(mpg))
```

```{r}
mtcars %>%
  summarise(avg = mean(hp))
```

```{r}
mtcars %>%
  summarise(avg = mean(wt))
```

## Hadley's Rule

Repeated yourself three times? Time to write a function.

```{r}
# First attempt: pass the column straight through to summarise()
meaner <- function(data, column) {
  data %>%
    summarise(avg = mean(column))
}
```

```{r}
# Fails: object 'hp' not found
meaner(mtcars, hp)
```

```{r}
# Fails for the same reason: there is no variable called hp
hp
```

```{r}
# Runs, but returns NA with a warning: mean() is given the string "hp"
meaner(mtcars, "hp")
```

```{r}
# Equivalent to the call above
mtcars %>%
  summarise(avg = mean("hp"))
```

```{r}
# Second attempt: capture the argument with enquo(), unquote it with !!
meaner2 <- function(data, column) {
  column <- enquo(column)
  data %>%
    summarise(avg = mean(!!column))
}
```

```{r}
meaner2(mtcars, hp)
```

## Enquo Magic!!!

`!!` and `enquo` magic worked!

Why?

## Names and Variables

```{r}
foo <- 4
```

The name of the variable is `foo`, and its value is 4.

But what about `hp`?

```{r}
mtcars %>%
  summarise(avg = mean(hp))
```

`hp` is not a variable in our environment. On its own it has no value.

```{r}
# Fails: object 'hp' not found
hp
```

But `hp` has meaning in the context of `mtcars`:

```{r}
with(mtcars, mean(hp))
```

```{r}
mean(mtcars$hp)
```

So in the tidyverse, names have special meaning: they are looked up as columns of the data frame first.
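
A minimal base R sketch of the same idea, assuming only the built-in `mtcars` data: an expression that mentions `hp` can be stored without being run, and it only gets a value once it is evaluated inside the data frame. The variable names `stored_call` and `masked_mean` are made up for illustration.

```{r}
# quote() stores the expression without evaluating it,
# so it does not matter that there is no variable called hp.
stored_call <- quote(mean(hp))

# eval() accepts a data frame as its environment,
# so hp is found among the columns of mtcars.
masked_mean <- eval(stored_call, mtcars)

# Same answer as spelling the column out explicitly.
all.equal(masked_mean, mean(mtcars$hp))
```

Tidyverse functions build on this lookup-in-the-data idea, which is why a bare `hp` works inside `summarise`.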

So why doesn't this work?

```{r}
meaner <- function(data, column) {
  data %>%
    summarise(avg = mean(column))
}
```

```{r}
# Fails: object 'hp' not found
meaner(mtcars, hp)
```

Because `summarise` looks for a column named `column`, not a column named `hp`. `mtcars` has no such column, so R falls back to the function argument, tries to evaluate `hp` as an ordinary variable, and fails.

Tidyverse functions care about the *name* you typed, not the *value* behind it.
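
A small sketch to check that explanation, assuming no variable called `hp` exists in the session. The data frame `toy` and the argument `anything_at_all` are hypothetical names used only for this demonstration.

```{r}
library(dplyr)

meaner <- function(data, column) {
  data %>%
    summarise(avg = mean(column))
}

# tryCatch() turns the failure into a value we can look at.
outcome <- tryCatch(
  meaner(mtcars, hp),
  error = function(e) "failed: hp was looked up as an ordinary variable"
)
outcome

# A data frame that really has a column called `column` works,
# no matter what is passed as the second argument.
toy <- data.frame(column = c(1, 2, 3))
toy_result <- meaner(toy, anything_at_all)
toy_result
```

The second call succeeds because `summarise` finds `column` in the data and never needs to evaluate the argument at all.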

```{r}
meaner2 <- function(data, column) {
  column <- enquo(column)
  data %>%
    summarise(avg = mean(!!column))
}
```

```{r}
meaner2(mtcars, hp)
```

`enquo` captures the *name* the user typed: `hp`.

`!!` tells tidyverse functions to use the captured *value* (which is itself a name) in place of `column`.
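
To see what `enquo` holds, here is a rough sketch with a hypothetical helper, `what_was_typed`, that returns the captured name as a string using `rlang::as_label`. It assumes the rlang package is installed (it comes with dplyr).

```{r}
library(dplyr)

# Hypothetical helper: report what enquo() captured, as text.
what_was_typed <- function(column) {
  column <- enquo(column)
  rlang::as_label(column)
}

typed <- what_was_typed(hp)
typed

# The same capture, unquoted with !! inside summarise().
meaner2 <- function(data, column) {
  column <- enquo(column)
  data %>%
    summarise(avg = mean(!!column))
}

all.equal(meaner2(mtcars, hp)$avg, mean(mtcars$hp))
```

Note that `hp` is never evaluated inside `what_was_typed`; only the text the user typed is recorded.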

## Why is it so complicated???

* Most other programming languages don't use non-standard evaluation:

  > Zen of Python: Explicit is better than implicit

* Programmers are taught to use variable *values*, not *names*.

## What are the benefits?

* Less typing during interactive programming

```{r}
mtcars %>%
  filter(cyl == 6, mpg > 20, am == 1)
```

vs.

```{r}
mtcars[mtcars$cyl == 6 & mtcars$mpg > 20 & mtcars$am == 1, ]
```

* Can use complicated expressions in functions

```{r}
# Standardize: subtract the mean, then divide by the standard deviation
mtcars %>%
  transmute(standardized = (hp - mean(hp)) / sd(hp))
```

```{r}
standardizer <- function(data, col) {
  col <- enquo(col)
  data %>%
    transmute(standardized = ((!!col) - mean(!!col)) / sd(!!col))
}
```

```{r}
standardizer(mtcars, hp)
```

* Can use `!!` + `enquo` like normal variables:
  - multiply, divide
  - built-in functions
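
As a quick sketch of the points above: `enquo` captures whatever expression the user typed, not just a single column name, so the `meaner2` function from earlier (repeated here so the chunk stands alone) also accepts arithmetic on columns. The name `ratio` is just for illustration.

```{r}
library(dplyr)

meaner2 <- function(data, column) {
  column <- enquo(column)
  data %>%
    summarise(avg = mean(!!column))
}

# The whole expression hp / wt is captured and evaluated inside mtcars.
ratio <- meaner2(mtcars, hp / wt)
ratio

# Matches the base R spelling.
all.equal(ratio$avg, mean(mtcars$hp / mtcars$wt))
```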

## What about strings?

I want this:

```{r}
mtcars %>%
  group_by(cyl) %>%
  summarise(count = n())
```

By using a string:

```{r}
column <- "cyl"
# Fails: group_by() looks for a column called `column`
mtcars %>%
  group_by(column) %>%
  summarise(count = n())
```

Same problem as before:

- tidyverse uses the *name*, not the *value*

There is no column with *name* `column`.

Does `enquo` + `!!` work?

```{r}
enquoed_column <- enquo(column)
# Does not give the per-cylinder counts we want
mtcars %>%
  group_by(!!enquoed_column) %>%
  summarise(count = n())
```

No.

`enquo` only captures *names* typed as function arguments. Here `column` holds a string *value*.

We want the *value* `"cyl"` to be converted to a *name*.

Introducing: `sym`

```{r}
symed_column <- sym(column)
mtcars %>%
  group_by(!!symed_column) %>%
  summarise(count = n())
```

`!!` stays the same.

We use `sym` instead of `enquo` if we want to turn a *value* into a *name*.
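
A short check of what `sym` produces, assuming the rlang package is installed (it comes with dplyr): the string goes in as a value and comes out as the same kind of object you get from typing the bare name.

```{r}
library(rlang)

column <- "cyl"
symed_column <- sym(column)

# The input is an ordinary string value.
is.character(column)

# The output is a name (R calls it a symbol).
is.symbol(symed_column)

# It is the same object that quoting the bare name gives.
identical(symed_column, quote(cyl))
```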

We can also put this into a function:

```{r}
grouper <- function(data, col) {
  col <- sym(col)
  data %>%
    group_by(!!col) %>%
    summarise(count = n())
}
```

```{r}
grouper(mtcars, "cyl")
```
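
One possible extension, not part of the original notes: if the column names arrive as a character vector, rlang's `syms` and the `!!!` operator do the same job for several strings at once. The function name `grouper_many` is hypothetical.

```{r}
library(dplyr)

# Hypothetical variant of grouper() that accepts several column names.
grouper_many <- function(data, cols) {
  cols <- rlang::syms(cols)   # list of names, one per string
  data %>%
    group_by(!!!cols) %>%     # !!! splices the list in as separate arguments
    summarise(count = n())
}

counts <- grouper_many(mtcars, c("cyl", "am"))
counts
```

Recent dplyr versions may print a message about how the result is grouped; the counts themselves are unaffected.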

## Any questions?

### Any functions you want to see?