Created on Skills Network Labs
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conditional piping in the `tidyverse`"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# load the tidyverse (provides dplyr and the %>% pipe)\n",
"library(tidyverse)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You may need to conditionally include or exclude certain steps of a `tidyverse` pipeline. This is especially likely when the pipeline lives inside a\n",
"function that should be as generic as possible. The code below briefly demonstrates how this can be achieved."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's assume you would like a function that returns the median 1/4 mile time (i.e. `qsec`) and the median miles per (US) gallon (i.e. `mpg`)\n",
"per cylinder count (i.e. `cyl`) for the `mtcars` dataset. However, you would like to do this for each engine shape (i.e. `vs`, 0 = V-shaped, 1 = straight) individually\n",
"as well as for the entire dataset. Instead of rewriting the pipeline three times with minor variations, you can write a single function that includes a conditional\n",
"step."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
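"# cat selects the engine shape: 0 (V-shaped), 1 (straight) or \"all\" (no filtering)\n",
"mtcars_averaging <- function(cat = \"all\") {\n",
"  mtcars %>%\n",
"    # inside braces %>% does not insert the data as first argument, so `.` can be filtered or passed on unchanged\n",
"    { if (cat != \"all\") filter(., vs == cat) else . } %>%\n",
"    group_by(cyl) %>%\n",
"    summarize(median_qsec = median(qsec),\n",
"              median_mpg = median(mpg))\n",
"}"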
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<caption>A tibble: 3 × 3</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cyl</th><th scope=col>median_qsec</th><th scope=col>median_mpg</th></tr>\n",
"\t<tr><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>4</td><td>18.900</td><td>26.0</td></tr>\n",
"\t<tr><td>6</td><td>18.300</td><td>19.7</td></tr>\n",
"\t<tr><td>8</td><td>17.175</td><td>15.2</td></tr>\n",
"</tbody>\n",
"</table>\n"
],
"text/latex": [
"A tibble: 3 × 3\n",
"\\begin{tabular}{lll}\n",
" cyl & median\\_qsec & median\\_mpg\\\\\n",
" <dbl> & <dbl> & <dbl>\\\\\n",
"\\hline\n",
"\t 4 & 18.900 & 26.0\\\\\n",
"\t 6 & 18.300 & 19.7\\\\\n",
"\t 8 & 17.175 & 15.2\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A tibble: 3 × 3\n",
"\n",
"| cyl &lt;dbl&gt; | median_qsec &lt;dbl&gt; | median_mpg &lt;dbl&gt; |\n",
"|---|---|---|\n",
"| 4 | 18.900 | 26.0 |\n",
"| 6 | 18.300 | 19.7 |\n",
"| 8 | 17.175 | 15.2 |\n",
"\n"
],
"text/plain": [
" cyl median_qsec median_mpg\n",
"1 4 18.900 26.0 \n",
"2 6 18.300 19.7 \n",
"3 8 17.175 15.2 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mtcars_averaging(\"all\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<caption>A tibble: 2 × 3</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cyl</th><th scope=col>median_qsec</th><th scope=col>median_mpg</th></tr>\n",
"\t<tr><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>4</td><td>19.185</td><td>25.85</td></tr>\n",
"\t<tr><td>6</td><td>19.170</td><td>18.65</td></tr>\n",
"</tbody>\n",
"</table>\n"
],
"text/latex": [
"A tibble: 2 × 3\n",
"\\begin{tabular}{lll}\n",
" cyl & median\\_qsec & median\\_mpg\\\\\n",
" <dbl> & <dbl> & <dbl>\\\\\n",
"\\hline\n",
"\t 4 & 19.185 & 25.85\\\\\n",
"\t 6 & 19.170 & 18.65\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A tibble: 2 × 3\n",
"\n",
"| cyl &lt;dbl&gt; | median_qsec &lt;dbl&gt; | median_mpg &lt;dbl&gt; |\n",
"|---|---|---|\n",
"| 4 | 19.185 | 25.85 |\n",
"| 6 | 19.170 | 18.65 |\n",
"\n"
],
"text/plain": [
" cyl median_qsec median_mpg\n",
"1 4 19.185 25.85 \n",
"2 6 19.170 18.65 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mtcars_averaging(1)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<caption>A tibble: 3 × 3</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cyl</th><th scope=col>median_qsec</th><th scope=col>median_mpg</th></tr>\n",
"\t<tr><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>4</td><td>16.700</td><td>26.0</td></tr>\n",
"\t<tr><td>6</td><td>16.460</td><td>21.0</td></tr>\n",
"\t<tr><td>8</td><td>17.175</td><td>15.2</td></tr>\n",
"</tbody>\n",
"</table>\n"
],
"text/latex": [
"A tibble: 3 × 3\n",
"\\begin{tabular}{lll}\n",
" cyl & median\\_qsec & median\\_mpg\\\\\n",
" <dbl> & <dbl> & <dbl>\\\\\n",
"\\hline\n",
"\t 4 & 16.700 & 26.0\\\\\n",
"\t 6 & 16.460 & 21.0\\\\\n",
"\t 8 & 17.175 & 15.2\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A tibble: 3 × 3\n",
"\n",
"| cyl &lt;dbl&gt; | median_qsec &lt;dbl&gt; | median_mpg &lt;dbl&gt; |\n",
"|---|---|---|\n",
"| 4 | 16.700 | 26.0 |\n",
"| 6 | 16.460 | 21.0 |\n",
"| 8 | 17.175 | 15.2 |\n",
"\n"
],
"text/plain": [
" cyl median_qsec median_mpg\n",
"1 4 16.700 26.0 \n",
"2 6 16.460 21.0 \n",
"3 8 17.175 15.2 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mtcars_averaging(0)"
]
},
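{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a variation on the brace-wrapped `if` used in `mtcars_averaging()`, the optional step can be moved into a small helper. The sketch below is only one possible way to do this:\n",
"`pipe_if()` and `mtcars_averaging_alt()` are hypothetical names introduced here for illustration, not functions from the `tidyverse`. It assumes `dplyr` and the `mtcars` dataset\n",
"are available and is expected to return the same tables as `mtcars_averaging()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"library(dplyr)\n",
"\n",
"# hypothetical helper: apply `step` to `data` only if `condition` is TRUE,\n",
"# otherwise hand the data on unchanged\n",
"pipe_if <- function(data, condition, step) {\n",
"  if (condition) step(data) else data\n",
"}\n",
"\n",
"# same summary as mtcars_averaging(), with the conditional filter expressed via pipe_if()\n",
"mtcars_averaging_alt <- function(cat = \"all\") {\n",
"  mtcars %>%\n",
"    pipe_if(cat != \"all\", function(d) filter(d, vs == cat)) %>%\n",
"    group_by(cyl) %>%\n",
"    summarize(median_qsec = median(qsec),\n",
"              median_mpg = median(mpg))\n",
"}\n",
"\n",
"mtcars_averaging_alt(1)"
]
},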
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"TRUE"
],
"text/latex": [
"TRUE"
],
"text/markdown": [
"TRUE"
],
"text/plain": [
"[1] TRUE"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"TRUE"
],
"text/latex": [
"TRUE"
],
"text/markdown": [
"TRUE"
],
"text/plain": [
"[1] TRUE"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"TRUE"
],
"text/latex": [
"TRUE"
],
"text/markdown": [
"TRUE"
],
"text/plain": [
"[1] TRUE"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# manual tests (should all evaluate to TRUE)\n",
"median(mtcars$mpg[which(mtcars$cyl == 6)]) == 19.7\n",
"median(mtcars$qsec[which(mtcars$vs == 1 & mtcars$cyl == 4)]) == 19.185\n",
"median(mtcars$qsec[which(mtcars$vs == 0 & mtcars$cyl == 6)]) == 16.46"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"R version 3.5.1 (2018-07-02)\n",
"Platform: x86_64-conda_cos6-linux-gnu (64-bit)\n",
"Running under: Debian GNU/Linux 10 (buster)\n",
"\n",
"Matrix products: default\n",
"BLAS/LAPACK: /home/jupyterlab/conda/envs/r/lib/R/lib/libRlapack.so\n",
"\n",
"locale:\n",
" [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 \n",
" [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 \n",
" [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C \n",
"[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C \n",
"\n",
"attached base packages:\n",
"[1] stats graphics grDevices utils datasets methods base \n",
"\n",
"other attached packages:\n",
"[1] forcats_0.5.0 stringr_1.4.0 dplyr_0.8.5 purrr_0.3.4 \n",
"[5] readr_1.3.1 tidyr_1.0.2 tibble_3.0.1 ggplot2_3.3.0 \n",
"[9] tidyverse_1.3.0\n",
"\n",
"loaded via a namespace (and not attached):\n",
" [1] pbdZMQ_0.3-3 tidyselect_1.0.0 repr_1.1.0 haven_2.2.0 \n",
" [5] lattice_0.20-41 colorspace_1.4-1 vctrs_0.2.4 generics_0.0.2 \n",
" [9] htmltools_0.4.0 base64enc_0.1-3 rlang_0.4.5 pillar_1.4.3 \n",
"[13] withr_2.2.0 glue_1.4.0 DBI_1.1.0 dbplyr_1.4.3 \n",
"[17] modelr_0.1.7 readxl_1.3.1 uuid_0.1-4 lifecycle_0.2.0 \n",
"[21] munsell_0.5.0 gtable_0.3.0 cellranger_1.1.0 rvest_0.3.5 \n",
"[25] evaluate_0.14 fansi_0.4.1 broom_0.5.6 IRdisplay_0.7.0 \n",
"[29] Rcpp_1.0.4.6 backports_1.1.6 scales_1.1.0 IRkernel_0.8.12 \n",
"[33] jsonlite_1.6.1 fs_1.4.1 hms_0.5.3 digest_0.6.25 \n",
"[37] stringi_1.4.6 grid_3.5.1 cli_2.0.2 tools_3.5.1 \n",
"[41] magrittr_1.5 crayon_1.3.4 pkgconfig_2.0.3 ellipsis_0.3.0 \n",
"[45] xml2_1.3.2 reprex_0.3.0 lubridate_1.7.8 assertthat_0.2.1\n",
"[49] httr_1.4.1 rstudioapi_0.11 R6_2.4.1 nlme_3.1-147 \n",
"[53] compiler_3.5.1 "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sessionInfo()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "R",
"language": "R",
"name": "conda-env-r-r"
},
"language_info": {
"codemirror_mode": "r",
"file_extension": ".r",
"mimetype": "text/x-r-source",
"name": "R",
"pygments_lexer": "r",
"version": "3.5.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}