@PatrickRWright
Created April 30, 2020 12:54
Created on Skills Network Labs
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Calculate a summary table with percentages per group"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"R ships with the `mtcars` dataset. From it we use the number of cylinders (`cyl`) and gears (`gear`) as grouping variables and produce a summary table with counts and percentages. This requires loading the `tidyverse`."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"library(tidyverse)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, `group_by` selects the grouping variables. Then `tally` counts the rows in each group as `n` and drops the last grouping level (`gear`), so the result remains grouped by `cyl`. Finally, `mutate` adds a column with each count as a percentage of its `cyl` group."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table>\n",
"<caption>A grouped_df: 8 × 4</caption>\n",
"<thead>\n",
"\t<tr><th scope=col>cyl</th><th scope=col>gear</th><th scope=col>n</th><th scope=col>percent</th></tr>\n",
"\t<tr><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;dbl&gt;</th><th scope=col>&lt;int&gt;</th><th scope=col>&lt;chr&gt;</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"\t<tr><td>4</td><td>3</td><td> 1</td><td>9.09 % </td></tr>\n",
"\t<tr><td>4</td><td>4</td><td> 8</td><td>72.73 %</td></tr>\n",
"\t<tr><td>4</td><td>5</td><td> 2</td><td>18.18 %</td></tr>\n",
"\t<tr><td>6</td><td>3</td><td> 2</td><td>28.57 %</td></tr>\n",
"\t<tr><td>6</td><td>4</td><td> 4</td><td>57.14 %</td></tr>\n",
"\t<tr><td>6</td><td>5</td><td> 1</td><td>14.29 %</td></tr>\n",
"\t<tr><td>8</td><td>3</td><td>12</td><td>85.71 %</td></tr>\n",
"\t<tr><td>8</td><td>5</td><td> 2</td><td>14.29 %</td></tr>\n",
"</tbody>\n",
"</table>\n"
],
"text/latex": [
"A grouped\\_df: 8 × 4\n",
"\\begin{tabular}{llll}\n",
" cyl & gear & n & percent\\\\\n",
" <dbl> & <dbl> & <int> & <chr>\\\\\n",
"\\hline\n",
"\t 4 & 3 & 1 & 9.09 \\% \\\\\n",
"\t 4 & 4 & 8 & 72.73 \\%\\\\\n",
"\t 4 & 5 & 2 & 18.18 \\%\\\\\n",
"\t 6 & 3 & 2 & 28.57 \\%\\\\\n",
"\t 6 & 4 & 4 & 57.14 \\%\\\\\n",
"\t 6 & 5 & 1 & 14.29 \\%\\\\\n",
"\t 8 & 3 & 12 & 85.71 \\%\\\\\n",
"\t 8 & 5 & 2 & 14.29 \\%\\\\\n",
"\\end{tabular}\n"
],
"text/markdown": [
"\n",
"A grouped_df: 8 × 4\n",
"\n",
"| cyl &lt;dbl&gt; | gear &lt;dbl&gt; | n &lt;int&gt; | percent &lt;chr&gt; |\n",
"|---|---|---|---|\n",
"| 4 | 3 | 1 | 9.09 % |\n",
"| 4 | 4 | 8 | 72.73 % |\n",
"| 4 | 5 | 2 | 18.18 % |\n",
"| 6 | 3 | 2 | 28.57 % |\n",
"| 6 | 4 | 4 | 57.14 % |\n",
"| 6 | 5 | 1 | 14.29 % |\n",
"| 8 | 3 | 12 | 85.71 % |\n",
"| 8 | 5 | 2 | 14.29 % |\n",
"\n"
],
"text/plain": [
" cyl gear n percent\n",
"1 4 3 1 9.09 % \n",
"2 4 4 8 72.73 %\n",
"3 4 5 2 18.18 %\n",
"4 6 3 2 28.57 %\n",
"5 6 4 4 57.14 %\n",
"6 6 5 1 14.29 %\n",
"7 8 3 12 85.71 %\n",
"8 8 5 2 14.29 %"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mtcars %>%\n",
" group_by(cyl, gear) %>%\n",
" tally() %>%\n",
" mutate(percent = paste(round((n/sum(n)) * 100, digits = 2), \"%\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The fractions sum to 1 within each `cyl` group, so the whole column should add up to `3`, since there are three cylinder groups (4, 6, and 8). Called without arguments, `pull` returns the last column (here `percentage`)."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"3"
],
"text/latex": [
"3"
],
"text/markdown": [
"3"
],
"text/plain": [
"[1] 3"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"mtcars %>% group_by(cyl, gear) %>% tally() %>% mutate(percentage = n/sum(n)) %>% pull() %>% sum()"
]
}
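,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The total of `3` follows from the grouping that `tally` leaves behind. As a minimal sketch (the name `counts` is only illustrative and not part of the original notebook), the cell below first checks that the fractions within each `cyl` group should total 1, and then shows one possible variant that calls `ungroup` so that each percentage refers to all cars instead of a single `cyl` group."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"library(dplyr)  # already attached if library(tidyverse) was run above\n",
"\n",
"# Counts per cyl/gear combination. tally() drops the last grouping\n",
"# level (gear), so `counts` (an illustrative name) stays grouped by cyl.\n",
"counts <- mtcars %>%\n",
"  group_by(cyl, gear) %>%\n",
"  tally()\n",
"\n",
"# Check: one row per cyl group, each holding the sum of its fractions.\n",
"counts %>%\n",
"  mutate(percentage = n / sum(n)) %>%\n",
"  summarise(total = sum(percentage))\n",
"\n",
"# Variant: remove the grouping first, so sum(n) covers the whole table\n",
"# and each percentage is a share of all cars.\n",
"counts %>%\n",
"  ungroup() %>%\n",
"  mutate(percent = paste(round(n / sum(n) * 100, digits = 2), \"%\"))"
]
}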
],
"metadata": {
"kernelspec": {
"display_name": "R",
"language": "R",
"name": "conda-env-r-r"
},
"language_info": {
"codemirror_mode": "r",
"file_extension": ".r",
"mimetype": "text/x-r-source",
"name": "R",
"pygments_lexer": "r",
"version": "3.5.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}