@dalejbarr
Created September 9, 2016 13:42
RMarkdown document for generating a report on Scottish baby names
---
title: "Scottish Babynames"
author: "Dale Barr"
date: "7 September 2016"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Background
This report analyses historical trends in Scottish baby names, following the five most popular boys' names and the five most popular girls' names among babies born in 1998.
The dataset comes from National Records of Scotland: <http://www.nrscotland.gov.uk/statistics-and-data/statistics/statistics-by-theme/vital-events/names/babies-first-names/babies-first-names-summary-records-comma-separated-value-csv-format>
## Setting the scene for the analysis
First, we are going to load in the add-on packages that we need.
```{r}
library(dplyr)
library(readr)
library(ggplot2)
library(cowsay)
```
Now, let's load in the data. Note that before reading the file we deleted a stray line of copyright information at the end of it, because that line was causing a parsing error.
```{r}
dat <- read_csv("babies-first-names-all-names-all-years2.csv")
```
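As an alternative to editing the file by hand, the trailing line could be dropped in code. The chunk below is only a sketch on a made-up temporary file: the footer wording, the toy rows, and the names `toy_path`, `raw_lines`, and `toy_dat` are all assumptions for illustration, not the contents of the real dataset.

```{r}
library(readr)

# Hypothetical stand-in for the downloaded file: a header, two data rows,
# and a trailing non-CSV line (the footer wording is invented).
toy_path <- tempfile(fileext = ".csv")
writeLines(c("yr,sex,FirstForename,number,rank",
             "1998,B,NameA,100,1",
             "1998,G,NameB,90,1",
             "Copyright notice (made-up footer)"),
           toy_path)

# Read the raw lines, drop the last one, and parse what remains as CSV.
# read_csv() treats a string containing a newline as literal data.
raw_lines <- read_lines(toy_path)
toy_dat <- read_csv(paste(head(raw_lines, -1), collapse = "\n"))
toy_dat
```

This keeps the downloaded file untouched, at the cost of assuming that the stray text is always exactly one line at the end.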
Let's pull out the top 5 boys' names and the top 5 girls' names for babies born in 1998.
```{r}
dat98 <- filter(dat, yr == 1998) %>%
  group_by(sex) %>%
  filter(rank <= 5)
```
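To see what this filter does, here is a minimal sketch on invented rows. It assumes, as the code above seems to, that `rank` is already computed separately for each sex within each year; the table `toy` and its values are made up and only borrow the real column names.

```{r}
library(dplyr)

# Made-up rows with the same column names as the real data (values invented).
toy <- tibble(
  yr            = c(1997, 1998, 1998, 1998, 1998),
  sex           = c("B", "B", "B", "G", "G"),
  FirstForename = c("NameA", "NameA", "NameB", "NameC", "NameD"),
  number        = c(50, 60, 10, 70, 5),
  rank          = c(1, 1, 6, 1, 7)
)

# Keep only 1998 rows whose rank is 5 or better.
toy98 <- toy %>%
  filter(yr == 1998, rank <= 5)
toy98
```

Only the 1998 rows for `NameA` and `NameC` survive. If several names tie for a rank, a condition like `rank <= 5` may keep more than five names per sex.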
OK, next let's get the full historical information for these 10 names:
```{r}
top10 <- semi_join(dat, dat98 %>% select(sex, FirstForename),
                   by = c("sex", "FirstForename"))
```
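A semi-join keeps the rows of the first table that have a match in the second, without adding any columns. The sketch below uses invented tables (`toy_all` and `toy_keys` are hypothetical names and values) to show why matching on both `sex` and `FirstForename` matters: a name that is in the key table for boys is not pulled in for girls.

```{r}
library(dplyr)

# Made-up history table: NameA appears for both sexes.
toy_all <- tibble(
  yr            = c(1997, 1998, 1997, 1998, 1998),
  sex           = c("B", "B", "G", "G", "B"),
  FirstForename = c("NameA", "NameA", "NameA", "NameC", "NameB"),
  number        = c(50, 60, 2, 70, 10)
)

# Made-up key table: the (sex, name) pairs we want to follow over time.
toy_keys <- tibble(sex = c("B", "G"), FirstForename = c("NameA", "NameC"))

# Keep every row of toy_all whose (sex, name) pair appears in toy_keys.
toy_hist <- semi_join(toy_all, toy_keys, by = c("sex", "FirstForename"))
toy_hist
```

The result has all years for the matching pairs and the same columns as `toy_all`; the girls' row for `NameA` is dropped because only the boys' pair is in `toy_keys`.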
## Finally, let's make the plot!
```{r}
ggplot(top10, aes(yr, number, colour = FirstForename)) +
  geom_line() +
  facet_wrap(~sex)
```
## Celebration!
```{r}
say("#w00t! We did it!", "signbunny")
```