Created
July 2, 2020 12:55
Created on Skills Network Labs
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Conditional piping in the `tidyverse`" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# load the tidyverse (provides dplyr and the %>% pipe)\n", | |
"library(tidyverse)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"You may sometimes need to include or exclude a step of a `tidyverse` pipeline conditionally. This is especially likely when the pipeline lives inside\n", | |
"a function that should be as generic as possible. The code below briefly demonstrates how to do this." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Suppose you want a function that returns the median quarter-mile time (`qsec`) and the median fuel economy in miles per (US) gallon (`mpg`)\n", | |
"per cylinder count (`cyl`) for the `mtcars` dataset. You want these medians for each engine shape (`vs`: 0 = V-shaped, 1 = straight) separately\n", | |
"as well as for the entire dataset. Instead of writing the pipeline three times with minor variations, you can write one function with a conditional\n", | |
"step." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# `cat` selects the engine shape: 0 or 1 filters on `vs`, \"all\" keeps every row\n", | |
"mtcars_averaging <- function(cat = \"all\") {\n", | |
"  mtcars %>%\n", | |
"    # the braces let the if/else decide whether the filter step runs;\n", | |
"    # `.` is the data frame coming in from the left-hand side of the pipe\n", | |
"    { if (cat != \"all\") filter(., vs == cat) else . } %>%\n", | |
"    group_by(cyl) %>%\n", | |
"    summarize(median_qsec = median(qsec),\n", | |
"              median_mpg = median(mpg))\n", | |
"}" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
"<caption>A tibble: 3 × 3</caption>\n", | |
"<thead>\n", | |
"\t<tr><th scope=col>cyl</th><th scope=col>median_qsec</th><th scope=col>median_mpg</th></tr>\n", | |
"\t<tr><th scope=col><dbl></th><th scope=col><dbl></th><th scope=col><dbl></th></tr>\n", | |
"</thead>\n", | |
"<tbody>\n", | |
"\t<tr><td>4</td><td>18.900</td><td>26.0</td></tr>\n", | |
"\t<tr><td>6</td><td>18.300</td><td>19.7</td></tr>\n", | |
"\t<tr><td>8</td><td>17.175</td><td>15.2</td></tr>\n", | |
"</tbody>\n", | |
"</table>\n" | |
], | |
"text/latex": [ | |
"A tibble: 3 × 3\n", | |
"\\begin{tabular}{lll}\n", | |
" cyl & median\\_qsec & median\\_mpg\\\\\n", | |
" <dbl> & <dbl> & <dbl>\\\\\n", | |
"\\hline\n", | |
"\t 4 & 18.900 & 26.0\\\\\n", | |
"\t 6 & 18.300 & 19.7\\\\\n", | |
"\t 8 & 17.175 & 15.2\\\\\n", | |
"\\end{tabular}\n" | |
], | |
"text/markdown": [ | |
"\n", | |
"A tibble: 3 × 3\n", | |
"\n", | |
"| cyl <dbl> | median_qsec <dbl> | median_mpg <dbl> |\n", | |
"|---|---|---|\n", | |
"| 4 | 18.900 | 26.0 |\n", | |
"| 6 | 18.300 | 19.7 |\n", | |
"| 8 | 17.175 | 15.2 |\n", | |
"\n" | |
], | |
"text/plain": [ | |
" cyl median_qsec median_mpg\n", | |
"1 4 18.900 26.0 \n", | |
"2 6 18.300 19.7 \n", | |
"3 8 17.175 15.2 " | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"mtcars_averaging(\"all\")" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
"<caption>A tibble: 2 × 3</caption>\n", | |
"<thead>\n", | |
"\t<tr><th scope=col>cyl</th><th scope=col>median_qsec</th><th scope=col>median_mpg</th></tr>\n", | |
"\t<tr><th scope=col><dbl></th><th scope=col><dbl></th><th scope=col><dbl></th></tr>\n", | |
"</thead>\n", | |
"<tbody>\n", | |
"\t<tr><td>4</td><td>19.185</td><td>25.85</td></tr>\n", | |
"\t<tr><td>6</td><td>19.170</td><td>18.65</td></tr>\n", | |
"</tbody>\n", | |
"</table>\n" | |
], | |
"text/latex": [ | |
"A tibble: 2 × 3\n", | |
"\\begin{tabular}{lll}\n", | |
" cyl & median\\_qsec & median\\_mpg\\\\\n", | |
" <dbl> & <dbl> & <dbl>\\\\\n", | |
"\\hline\n", | |
"\t 4 & 19.185 & 25.85\\\\\n", | |
"\t 6 & 19.170 & 18.65\\\\\n", | |
"\\end{tabular}\n" | |
], | |
"text/markdown": [ | |
"\n", | |
"A tibble: 2 × 3\n", | |
"\n", | |
"| cyl <dbl> | median_qsec <dbl> | median_mpg <dbl> |\n", | |
"|---|---|---|\n", | |
"| 4 | 19.185 | 25.85 |\n", | |
"| 6 | 19.170 | 18.65 |\n", | |
"\n" | |
], | |
"text/plain": [ | |
" cyl median_qsec median_mpg\n", | |
"1 4 19.185 25.85 \n", | |
"2 6 19.170 18.65 " | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"mtcars_averaging(1)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<table>\n", | |
"<caption>A tibble: 3 × 3</caption>\n", | |
"<thead>\n", | |
"\t<tr><th scope=col>cyl</th><th scope=col>median_qsec</th><th scope=col>median_mpg</th></tr>\n", | |
"\t<tr><th scope=col><dbl></th><th scope=col><dbl></th><th scope=col><dbl></th></tr>\n", | |
"</thead>\n", | |
"<tbody>\n", | |
"\t<tr><td>4</td><td>16.700</td><td>26.0</td></tr>\n", | |
"\t<tr><td>6</td><td>16.460</td><td>21.0</td></tr>\n", | |
"\t<tr><td>8</td><td>17.175</td><td>15.2</td></tr>\n", | |
"</tbody>\n", | |
"</table>\n" | |
], | |
"text/latex": [ | |
"A tibble: 3 × 3\n", | |
"\\begin{tabular}{lll}\n", | |
" cyl & median\\_qsec & median\\_mpg\\\\\n", | |
" <dbl> & <dbl> & <dbl>\\\\\n", | |
"\\hline\n", | |
"\t 4 & 16.700 & 26.0\\\\\n", | |
"\t 6 & 16.460 & 21.0\\\\\n", | |
"\t 8 & 17.175 & 15.2\\\\\n", | |
"\\end{tabular}\n" | |
], | |
"text/markdown": [ | |
"\n", | |
"A tibble: 3 × 3\n", | |
"\n", | |
"| cyl <dbl> | median_qsec <dbl> | median_mpg <dbl> |\n", | |
"|---|---|---|\n", | |
"| 4 | 16.700 | 26.0 |\n", | |
"| 6 | 16.460 | 21.0 |\n", | |
"| 8 | 17.175 | 15.2 |\n", | |
"\n" | |
], | |
"text/plain": [ | |
" cyl median_qsec median_mpg\n", | |
"1 4 16.700 26.0 \n", | |
"2 6 16.460 21.0 \n", | |
"3 8 17.175 15.2 " | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"mtcars_averaging(0)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"TRUE" | |
], | |
"text/latex": [ | |
"TRUE" | |
], | |
"text/markdown": [ | |
"TRUE" | |
], | |
"text/plain": [ | |
"[1] TRUE" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"TRUE" | |
], | |
"text/latex": [ | |
"TRUE" | |
], | |
"text/markdown": [ | |
"TRUE" | |
], | |
"text/plain": [ | |
"[1] TRUE" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
}, | |
{ | |
"data": { | |
"text/html": [ | |
"TRUE" | |
], | |
"text/latex": [ | |
"TRUE" | |
], | |
"text/markdown": [ | |
"TRUE" | |
], | |
"text/plain": [ | |
"[1] TRUE" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"# manual tests (should all evaluate to TRUE)\n", | |
"median(mtcars$mpg[which(mtcars$cyl == 6)]) == 19.7\n", | |
"median(mtcars$qsec[which(mtcars$vs == 1 & mtcars$cyl == 4)]) == 19.185\n", | |
"median(mtcars$qsec[which(mtcars$vs == 0 & mtcars$cyl == 6)]) == 16.46" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"R version 3.5.1 (2018-07-02)\n", | |
"Platform: x86_64-conda_cos6-linux-gnu (64-bit)\n", | |
"Running under: Debian GNU/Linux 10 (buster)\n", | |
"\n", | |
"Matrix products: default\n", | |
"BLAS/LAPACK: /home/jupyterlab/conda/envs/r/lib/R/lib/libRlapack.so\n", | |
"\n", | |
"locale:\n", | |
" [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 \n", | |
" [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 \n", | |
" [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C \n", | |
"[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C \n", | |
"\n", | |
"attached base packages:\n", | |
"[1] stats graphics grDevices utils datasets methods base \n", | |
"\n", | |
"other attached packages:\n", | |
"[1] forcats_0.5.0 stringr_1.4.0 dplyr_0.8.5 purrr_0.3.4 \n", | |
"[5] readr_1.3.1 tidyr_1.0.2 tibble_3.0.1 ggplot2_3.3.0 \n", | |
"[9] tidyverse_1.3.0\n", | |
"\n", | |
"loaded via a namespace (and not attached):\n", | |
" [1] pbdZMQ_0.3-3 tidyselect_1.0.0 repr_1.1.0 haven_2.2.0 \n", | |
" [5] lattice_0.20-41 colorspace_1.4-1 vctrs_0.2.4 generics_0.0.2 \n", | |
" [9] htmltools_0.4.0 base64enc_0.1-3 rlang_0.4.5 pillar_1.4.3 \n", | |
"[13] withr_2.2.0 glue_1.4.0 DBI_1.1.0 dbplyr_1.4.3 \n", | |
"[17] modelr_0.1.7 readxl_1.3.1 uuid_0.1-4 lifecycle_0.2.0 \n", | |
"[21] munsell_0.5.0 gtable_0.3.0 cellranger_1.1.0 rvest_0.3.5 \n", | |
"[25] evaluate_0.14 fansi_0.4.1 broom_0.5.6 IRdisplay_0.7.0 \n", | |
"[29] Rcpp_1.0.4.6 backports_1.1.6 scales_1.1.0 IRkernel_0.8.12 \n", | |
"[33] jsonlite_1.6.1 fs_1.4.1 hms_0.5.3 digest_0.6.25 \n", | |
"[37] stringi_1.4.6 grid_3.5.1 cli_2.0.2 tools_3.5.1 \n", | |
"[41] magrittr_1.5 crayon_1.3.4 pkgconfig_2.0.3 ellipsis_0.3.0 \n", | |
"[45] xml2_1.3.2 reprex_0.3.0 lubridate_1.7.8 assertthat_0.2.1\n", | |
"[49] httr_1.4.1 rstudioapi_0.11 R6_2.4.1 nlme_3.1-147 \n", | |
"[53] compiler_3.5.1 " | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"sessionInfo()" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "R", | |
"language": "R", | |
"name": "conda-env-r-r" | |
}, | |
"language_info": { | |
"codemirror_mode": "r", | |
"file_extension": ".r", | |
"mimetype": "text/x-r-source", | |
"name": "R", | |
"pygments_lexer": "r", | |
"version": "3.5.1" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
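
The conditional step in the notebook above can be varied in several ways. The following is a minimal sketch of one such variation, not the author's method: the function name `median_by_cyl` and its arguments `data` and `vs_value` are hypothetical and do not appear in the original notebook. It assumes `NULL` is an acceptable way to say "do not filter", which avoids comparing a number with the string `"all"` and avoids naming an argument after base R's `cat()`.

```r
library(dplyr)

# Hypothetical variant of the notebook's function (names are assumptions):
# `data` is any data frame with `vs`, `cyl`, `qsec` and `mpg` columns,
# `vs_value = NULL` means "keep every row".
median_by_cyl <- function(data, vs_value = NULL) {
  data %>%
    # run the filter step only when a value was supplied;
    # otherwise pass the incoming data frame (`.`) through unchanged
    { if (!is.null(vs_value)) filter(., vs == vs_value) else . } %>%
    group_by(cyl) %>%
    summarize(median_qsec = median(qsec),
              median_mpg  = median(mpg))
}

all_cars <- median_by_cyl(mtcars)                # whole dataset
straight <- median_by_cyl(mtcars, vs_value = 1)  # straight engines only
```

Passing the data frame as an argument keeps the conditional step the same as in the notebook while letting the function be reused on other data frames that have the same columns.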