Last active
July 1, 2023 04:56
-
-
Save jmclawson/c982b4e9763f456ab1b09e5034beb6f9 to your computer and use it in GitHub Desktop.
using sf to map from external shapefiles and from packages
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--- | |
title: "Mapping with R" | |
output: | |
html_document: | |
df_print: paged | |
toc: true | |
toc_float: true | |
date: "2023-06-30" | |
--- | |
```{r setup, include=FALSE} | |
knitr::opts_chunk$set(echo = TRUE) | |
``` | |
## Load packages | |
The `sf` package is one common way to map with `ggplot2.` | |
```{r message=FALSE, warning=FALSE} | |
library(sf) | |
library(tidyverse) | |
``` | |
## Get shape data | |
The hardest part about mapping is to find good mapping data. This is stored in a special data type that defines borders and locations on the globe. Shape data can be imported into R, or it can be provided by packages. | |
### From files | |
The UN collects shape data in "shapefiles" for some countries here: https://salb.un.org/en/data. Most countries are lacking geospatial assets in this list. For countries with data, shapefiles are presented in compressed zip format that should be uncompressed after downloading. Then read it with `read_sf()`. | |
The resulting object will have a special column called `geometry` (*the very last column!*) of type `MULTIPOLYGON`. | |
```{r cyprus-un-data} | |
# hosted zip file from this location: https://geoportal.un.org/arcgis/sharing/rest/content/items/d50cdce398af4712b6e31042c085e5ba/data | |
# unzip it and locate the file with the SHP suffix | |
cyprus_un <- read_sf("gis/BNDA_CYP_1960-08-01_lastupdate.shp") | |
cyprus_un | |
``` | |
Additionally, it is a special kind of data frame, with `sf` as the first of its classes: | |
```{r} | |
class(cyprus_un) | |
``` | |
Any object with `sf` as its class and this special `geometry` column can be mapped simply with `ggplot()` and `geom_sf()`---no need to specify X- or Y-axis: | |
```{r cyprus-map-1} | |
cyprus_un %>% | |
ggplot() + | |
geom_sf() | |
``` | |
As with any `ggplot` figure, we can map columns to aesthetics: | |
```{r cyprus-map-2} | |
cyprus_un %>% | |
ggplot() + | |
geom_sf(aes(fill = adm1nm)) | |
``` | |
For comparison, here's a similar process with Mexico. Its legibility suffers because of too much detail. | |
```{r mexico-un} | |
# hosted zip file from this location: https://geoportal.un.org/arcgis/sharing/rest/content/items/14e7cef499aa42bdb773511d4be516b2/data | |
# unzip it and locate the file with the SHP suffix | |
mexico_un <- read_sf("gis/BNDA_MEX_2018-02-01_lastupdate.shp") | |
mexico_un %>% | |
mutate(region = fct_collapse(adm1nm, | |
Aguascalientes = "Aguascalientes", | |
Zacatecas = "Zacatecas", | |
Jalisco = "Jalisco", | |
"Ciudad de México" = "Ciudad de México", | |
other_level = "other")) %>% | |
ggplot() + | |
geom_sf(aes(fill = region)) + | |
labs(fill = NULL) + | |
theme_minimal() + | |
scale_fill_manual( | |
values = c(Aguascalientes = "orange", | |
Zacatecas = "lightblue", | |
Jalisco = "lightgreen", | |
"Ciudad de México" = "pink", | |
other = "gray")) | |
``` | |
### From a package | |
Packages that include shapefiles make it possible to avoid the task of finding and importing shapefiles manually. One such package is `rnaturalearth`. | |
As always, there will be differences in quality and methods of encoding the data. Cyprus shows less detail than the UN data, and it is divided differently. | |
```{r cyprus-ne, message=FALSE, warning=FALSE} | |
library("rnaturalearth") | |
library("rnaturalearthdata") | |
cyprus_ne <- ne_states(country = "cyprus", returnclass = "sf") | |
cyprus_northern_ne <- ne_states(country = "northern cyprus", returnclass = "sf") | |
cyprus_combined_ne <- rbind(cyprus_ne, cyprus_northern_ne) | |
cyprus_combined_ne %>% | |
ggplot() + | |
geom_sf(aes(fill = name)) + | |
theme_minimal() + | |
labs(fill = NULL) | |
``` | |
Mexico also shows some differences in the level of detail and the names. | |
```{r mexico-ne} | |
mexico_ne <- ne_states(country = "mexico", returnclass = "sf") | |
mexico_ne %>% | |
mutate(region = fct_collapse(name, | |
Aguascalientes = "Aguascalientes", | |
Zacatecas = "Zacatecas", | |
Jalisco = "Jalisco", | |
"Distrito Federal" = "Distrito Federal", | |
other_level = "other")) %>% | |
ggplot() + | |
geom_sf(aes(fill = region)) + | |
theme_minimal() + | |
scale_fill_manual( | |
values = c(Aguascalientes = "orange", | |
Zacatecas = "lightblue", | |
Jalisco = "lightgreen", | |
"Distrito Federal" = "pink", | |
other = "gray")) | |
``` | |
The biggest difference with the data from the package is that it covers the world: | |
```{r world-ne} | |
world_ne <- ne_countries(scale = "small", returnclass = "sf") | |
world_ne %>% | |
ggplot() + | |
geom_sf(aes(fill = economy)) + | |
theme_minimal() | |
world_ne %>% | |
ggplot() + | |
geom_sf(aes(fill = pop_est)) + | |
theme_minimal() + | |
scale_fill_viridis_c(option = "F", | |
labels = scales::label_comma()) | |
``` | |
## Join shape data with other sources | |
Shape data can be combined with other data to map it. Be careful when using joins with data frames of class `sf`. The order used for joins will make a difference and could remove the special `sf` class if that data frame isn't given priority in the join. | |
```{r message=FALSE, warning=FALSE} | |
j2sr <- read_csv("data/j2sr.csv") | |
j2sr_latest <- j2sr %>% | |
filter(!is.na(child_health)) %>% | |
group_by(country) %>% | |
slice_max(order_by = year, n = 1) %>% | |
ungroup() | |
sf_success <- | |
left_join(world_ne, j2sr_latest, | |
by = join_by("geounit"=="country")) | |
class(sf_success) | |
sf_fail <- | |
left_join(j2sr_latest, world_ne, | |
by = join_by("country"=="geounit")) | |
class(sf_fail) | |
``` | |
Mapping from the joined data set can provide new insights about familiar information. | |
```{r} | |
sf_success %>% | |
ggplot() + | |
geom_sf(aes(fill = child_health)) + | |
theme_minimal() | |
``` | |
Limiting the data can be a good way to explore more detail within a local context. | |
```{r} | |
ne_countries(continent = "South America", returnclass = "sf", scale = "large") %>% | |
left_join(j2sr_latest, | |
by = join_by("geounit"=="country")) %>% | |
ggplot() + | |
geom_sf(aes(fill = child_health)) + | |
theme_void() | |
``` |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment