# UK map v. quick example

## Libraries
library(tidyverse)
library(sf)
#> Linking to GEOS 3.8.1, GDAL 3.1.4, PROJ 6.3.1
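## Both packages are on CRAN; note that sf links to the GEOS, GDAL, and PROJ
## system libraries (versions shown above), which may need to be installed
## separately depending on your platform.
# install.packages(c("tidyverse", "sf"))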

## Get the map data
## An authoritative source for UK map files is the [ONS Open Geography Portal](https://geoportal.statistics.gov.uk).
## https://geoportal.statistics.gov.uk/search?collection=Dataset&sort=name&tags=all(BDY_LAD)
## Look under the `Data` tab for the link to the geojson file. We are going to directly grab a boundary file of UK
## local authority districts:
uk_lads <- read_sf("https://opendata.arcgis.com/datasets/69cd46d7d2664e02b30c2f8dcc2bfaf7_0.geojson")

uk_lads
#> Simple feature collection with 382 features and 10 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -8.649996 ymin: 49.88234 xmax: 1.763571 ymax: 60.84575
#> geographic CRS: WGS 84
#> # A tibble: 382 x 11
#> OBJECTID LAD19CD LAD19NM LAD19NMW BNG_E BNG_N LONG LAT Shape__Area
#> <int> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl>
#> 1 1 E060000… Hartlepool " " 447157 531476 -1.27 54.7 96512311.
#> 2 2 E060000… Middlesbro… " " 451141 516887 -1.21 54.5 55229150.
#> 3 3 E060000… Redcar and… " " 464359 519597 -1.01 54.6 248409004.
#> 4 4 E060000… Stockton-o… " " 444937 518183 -1.31 54.6 205231500.
#> 5 5 E060000… Darlington " " 428029 515648 -1.57 54.5 198812771.
#> 6 6 E060000… Halton " " 354246 382146 -2.69 53.3 82869023.
#> 7 7 E060000… Warrington " " 362744 388456 -2.56 53.4 178742907.
#> 8 8 E060000… Blackburn … " " 369490 422806 -2.46 53.7 139386373.
#> 9 9 E060000… Blackpool " " 332763 436633 -3.02 53.8 33677587.
#> 10 10 E060000… Kingston u… " " 511894 431716 -0.304 53.8 70882300.
#> # … with 372 more rows, and 2 more variables: Shape__Length <dbl>,
#> # geometry <MULTIPOLYGON [°]>
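
## Optionally, cache the boundary file locally rather than re-downloading it on
## every run. A minimal sketch; the local file name is just an assumption:
# lad_file <- "uk_lad_boundaries_2019.geojson"
# if (!file.exists(lad_file)) {
#   download.file("https://opendata.arcgis.com/datasets/69cd46d7d2664e02b30c2f8dcc2bfaf7_0.geojson",
#                 destfile = lad_file, mode = "wb")
# }
# uk_lads <- read_sf(lad_file)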

## Get some sample COVID data from https://coronavirus.data.gov.uk
## https://api.coronavirus.data.gov.uk/v2/data?areaType=utla&metric=cumDeaths60DaysByDeathDate&format=csv
covid <- read_csv("https://api.coronavirus.data.gov.uk/v2/data?areaType=utla&metric=cumDeaths60DaysByDeathDate&format=csv")
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   date = col_date(format = ""),
#>   areaType = col_character(),
#>   areaCode = col_character(),
#>   areaName = col_character(),
#>   cumDeaths60DaysByDeathDate = col_double()
#> )

covid
#> # A tibble: 55,239 x 5
#> date areaType areaCode areaName cumDeaths60DaysByDeathDate
#> <date> <chr> <chr> <chr> <dbl>
#> 1 2021-03-22 utla E10000007 Derbyshire 2084
#> 2 2021-03-21 utla E10000007 Derbyshire 2084
#> 3 2021-03-20 utla E10000007 Derbyshire 2083
#> 4 2021-03-19 utla E10000007 Derbyshire 2080
#> 5 2021-03-18 utla E10000007 Derbyshire 2078
#> 6 2021-03-17 utla E10000007 Derbyshire 2075
#> 7 2021-03-16 utla E10000007 Derbyshire 2074
#> 8 2021-03-15 utla E10000007 Derbyshire 2072
#> 9 2021-03-14 utla E10000007 Derbyshire 2068
#> 10 2021-03-22 utla E08000002 Bury 587
#> # … with 55,229 more rows
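
## A couple of quick sanity checks on coverage (output not shown here): how many
## distinct areas and dates does the table contain?
covid %>%
  summarize(n_areas = n_distinct(areaCode), n_dates = n_distinct(date))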

### We rename the `areaCode` column to `LAD19CD` so we can join the data to the map file in a moment.
### Then we filter the table so we just look at a single day, i.e. just one observation per LAD.
covid_sm <- covid %>%
  rename(LAD19CD = areaCode) %>%
  filter(date == "2021-03-17")

covid_sm
#> # A tibble: 149 x 5
#> date areaType LAD19CD areaName cumDeaths60DaysByDeathDate
#> <date> <chr> <chr> <chr> <dbl>
#> 1 2021-03-17 utla E10000007 Derbyshire 2075
#> 2 2021-03-17 utla E08000002 Bury 585
#> 3 2021-03-17 utla E10000019 Lincolnshire 1865
#> 4 2021-03-17 utla E08000016 Barnsley 903
#> 5 2021-03-17 utla E08000008 Tameside 757
#> 6 2021-03-17 utla E08000019 Sheffield 1288
#> 7 2021-03-17 utla E06000030 Swindon 303
#> 8 2021-03-17 utla E06000033 Southend-on-Sea 694
#> 9 2021-03-17 utla E10000014 Hampshire 2695
#> 10 2021-03-17 utla E10000031 Warwickshire 1230
#> # … with 139 more rows
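
## Rather than hard-coding "2021-03-17", you could filter on the most recent date
## reported by every area. A sketch; the object names latest_common and covid_alt
## are just illustrative:
latest_common <- covid %>%
  group_by(areaCode) %>%
  summarize(last_date = max(date), .groups = "drop") %>%
  pull(last_date) %>%
  min()

covid_alt <- covid %>%
  rename(LAD19CD = areaCode) %>%
  filter(date == latest_common)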

### Looks like we have 149 unique places in this data. So there will be a bunch of missing LADs
### in the map.

### We merge this with our map data:
uk_lads_covid <- uk_lads %>%
  left_join(covid_sm, by = "LAD19CD")

# looks OK
uk_lads_covid %>%
  select(LAD19CD, LAD19NM, geometry, cumDeaths60DaysByDeathDate)
#> Simple feature collection with 382 features and 3 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -8.649996 ymin: 49.88234 xmax: 1.763571 ymax: 60.84575
#> geographic CRS: WGS 84
#> # A tibble: 382 x 4
#> LAD19CD LAD19NM geometry cumDeaths60DaysBy…
#> <chr> <chr> <MULTIPOLYGON [°]> <dbl>
#> 1 E060000… Hartlepool (((-1.177633 54.69919, -1.173981 54… 287
#> 2 E060000… Middlesbrou… (((-1.282626 54.56528, -1.262559 54… 393
#> 3 E060000… Redcar and … (((-1.149131 54.61433, -1.154624 54… 328
#> 4 E060000… Stockton-on… (((-1.282626 54.56528, -1.270612 54… 512
#> 5 E060000… Darlington (((-1.696926 54.53601, -1.705274 54… 296
#> 6 E060000… Halton (((-2.674641 53.35366, -2.630622 53… 304
#> 7 E060000… Warrington (((-2.576743 53.44606, -2.57039 53.… 548
#> 8 E060000… Blackburn w… (((-2.551298 53.75639, -2.465808 53… 455
#> 9 E060000… Blackpool (((-3.04795 53.87573, -3.01975 53.8… 515
#> 10 E060000… Kingston up… (((-0.2414035 53.75491, -0.2516817 … 714
#> # … with 372 more rows
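
## Which local authorities have no matching COVID record? (Output not shown.)
## Some mismatch is expected because the COVID table is reported at UTLA level
## while the boundaries are LADs; st_drop_geometry() just drops the geometry
## column, which we don't need for this check.
uk_lads %>%
  st_drop_geometry() %>%
  anti_join(covid_sm, by = "LAD19CD") %>%
  select(LAD19CD, LAD19NM)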

### Now we can draw a rough and ready map
uk_lads_covid %>%
  ggplot(mapping = aes(fill = cumDeaths60DaysByDeathDate)) +
  geom_sf() +
  labs(fill = "60-Day COVID") +
  theme_void() +
  theme(legend.position = "top")
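
## Optionally, project to the British National Grid (EPSG:27700) before plotting
## and use a continuous viridis palette. A sketch of one variant, not the only
## way to style this:
uk_lads_covid %>%
  st_transform(27700) %>%
  ggplot(mapping = aes(fill = cumDeaths60DaysByDeathDate)) +
  geom_sf(color = "gray90", size = 0.1) +
  scale_fill_viridis_c(option = "plasma", na.value = "gray80") +
  labs(fill = "60-Day COVID") +
  theme_void() +
  theme(legend.position = "top")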

## The basic sequence will always be the same:
## 1. Find the boundaries you want and get the map file for them. Import it with read_sf().
## 2. Find the data you want. Make sure it is at the same level of observation as your map.
## 3. Make sure there is a column you can merge/join on---usually the unique official ID of the spatial unit.
## 4. Join the tables. Usually this will be a left_join(), with the map data on the left and the data of interest joined to it.
## 5. Now you have a table with spatial data and your measures of interest as columns.
## 6. Draw your map with the `fill` aesthetic assigned to your statistic of interest (a compact template is sketched below).
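
## As a compact template of that sequence (all names here are placeholders, not
## real files or columns):
# map_sf  <- read_sf("path/to/boundaries.geojson")           # step 1
# my_data <- read_csv("path/to/measures.csv")                # step 2
# merged  <- map_sf %>%
#   left_join(my_data, by = "unit_id")                       # steps 3 and 4
# merged %>%                                                 # steps 5 and 6
#   ggplot(mapping = aes(fill = my_measure)) +
#   geom_sf() +
#   theme_void()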