This gist shows how to create an animated world map of life expectancy using R. The data comes the UN World Population Prospects, 2015 Revision, and it brings life expectancy data from 1950 untill 2015 and projeceted data up to 2100. Thanks Topi Tjukanov, who reminded me of the UN DESA data portal, where you can find this dataset and many others
The idea is to use open data to create a GIF, much like the ones created by Aron Strandberg but his maps look much nicer. The output of this script is a map like this one:
Now diving into the code. First, let's load the necessary libraries and get the data. As of this writing, the current version of gganimate
has a bug that messes the aesthetics of the .gif
file. As a temporary solution, I've intalled an older version of the package, as recommended by the author of the gganimate
, David Robinson.
# load libraries
library(curl)
library(readxl)
library(data.table)
library(rworldmap)
library(ggplot2)
library(dplyr)
library(tweenr)
library(ggthemes)
library(viridis)
library(rgeos)
library(countrycode)
library(devtools)
install_github("dgrtwo/gganimate", ref = "26ec501")
library(gganimate)
# DATA: Download data
URL <- "https://esa.un.org/unpd/wpp/DVD/Files/1_Indicators%20(Standard)/EXCEL_FILES/3_Mortality/WPP2015_MORT_F07_1_LIFE_EXPECTANCY_0_BOTH_SEXES.XLS"
curl_download(url=URL, destfile="LIFE_EXPECTANCY_0_BOTH_SEXES.xls", quiet=FALSE, mode="wb")
The dataset comes in an Excel file. One spreadsheet brings life expectancy data for years before 2015 and another spreadsheet brings projected life expectancy up to 2100.
Now we need to read and merge these two datasets. We are also going to use country codes (iso3) instead of country names names to amke it easier latter to join the life expect. info with spatial data. This can be done using the countrycode
library, recommended by Noam Ross
# Read historic data
df1 <- read_excel( path= "LIFE_EXPECTANCY_0_BOTH_SEXES.xls", sheet="ESTIMATES", skip = 16)
# Read projected data
df2 <- read_excel( path= "LIFE_EXPECTANCY_0_BOTH_SEXES.xls", sheet="MEDIUM VARIANT", skip = 16)
# Merge historic and projected data
setDT(df2)[, Notes := NULL][, Variant := NULL]
df <- left_join(df1, df2, by=c("Index", "Major area, region, country or area *", "Country code"))
setDT(df)
# change name of column
colnames(df)[3] <- c("country")
# drop first 14 rows with information aggregated for continents etc
df <- df[-c(1:14),]
# get standard country codes for latter merge with spatial data
df[, country_iso3c := countrycode(country, 'country.name', 'iso3c')]
# quick note: these UN data don't bring life expectancy data for Greenland, Kosovo, San Marino, Somaliland and Taiwan.
Note that the data is organized in wide format, where each year is a separete column. In order to make our animated map, we are going to reshape the data to long format, piling up all observations. melt{data.table}
makes this task really simple and fast.
# data in wide format
head(df)
#> Index Variant country Notes Country code 1950-1955 1955-1960 1960-1965 ...
#> 1: 15 Estimates Burundi NA 108 39.031 40.529 42.036 ...
#> 2: 16 Estimates Comoros NA 174 38.720 40.465 42.472 ...
#> 3: 17 Estimates Djibouti NA 262 41.038 42.949 45.176 ...
#> 4: 18 Estimates Eritrea NA 232 35.840 36.846 37.983 ...
#> 5: 19 Estimates Ethiopia NA 231 34.080 36.675 40.080 ...
# Reshape to long format
# get name of columns with years of reference
year_cols <- colnames(df)[6:35]
# Reshape data
dt <- melt(df, id.vars = c("country", "country_iso3c"),
measure.vars = year_cols,
variable.name= "year",
value.name= "life_expect")
# data in long format
head(dt)
#> country country_iso3c year life_expect
#> 1: Burundi BDI 1950-1955 39.031
#> 2: Comoros COM 1950-1955 38.720
#> 3: Djibouti DJI 1950-1955 41.038
#> 4: Eritrea ERI 1950-1955 35.840
#> 5: Ethiopia ETH 1950-1955 34.080
#> 6: ... ... ... ...
# get Min and Max values of life expectancy
vmax <- max(dt$life_expect, na.rm=T)
vmin <- min(dt$life_expect, na.rm=T)
# get world map
wmap <- getMap(resolution="low")
# small edits
wmap <- spTransform(wmap, CRS("+proj=robin")) # reproject
wmap <- subset(wmap, !(NAME %like% "Antar")) # Remove Antarctica
We also need to get a pair of coordinates for each country, which will indicate on the map where we will place the tex with life expectancy values. I adopted a simple solution using the centroids of each country. This is far from perfct because some strings will look a bit misaligned. See for example Norway, Canada, Peru or Chile. If you know a simple way to fix this, I would be glad to hear from you!
# get centroids of countries
centroids <- gCentroid( wmap , byid=TRUE, id = wmap@data$ISO3)
centroids <- data.frame(centroids)
setDT(centroids, keep.rownames = TRUE)[]
setnames(centroids, "rn", "country_iso3c")
Now we just need to join the life expectancy data with the world map, and...
# join data to map
wmap_df <- fortify(wmap, region = "ISO3")
wmap_df <- left_join(wmap_df, dt, by = c('id'='country_iso3c')) # data
wmap_df <- left_join(wmap_df, centroids, by = c('id'='country_iso3c')) # centroids
# plot
o <- ggplot(data=wmap_df) +
geom_polygon(aes(x = long, y = lat, group = group, fill=life_expect, frame = year), color="gray90") +
geom_text(aes(x = x, y = y, label = round(life_expect), frame = year), hjust=0, vjust=0, size = 4.5) +
scale_fill_viridis(name="Life Expectancy", begin = 0, end = 1, limits = c(vmin,vmax), na.value="gray99") +
theme_void() +
guides(fill = guide_colorbar(title.position = "top")) +
labs(title = "Life Expectancy, ") +
labs(caption = "Map by Rafael H M Pereira, @UrbanDemog\nsource: UN World Population Prospects 2015 Revision") +
theme(plot.title = element_text(hjust = 0.5, vjust = 0.05, size=25)) +
theme(plot.caption = element_text(hjust = 0, color="gray40", size=15)) +
coord_cartesian(xlim = c(-11807982, 14807978)) +
theme( legend.position = c(.5, .08),
legend.direction = "horizontal",
legend.title.align = 0,
legend.key.size = unit(1.3, "cm"),
legend.title=element_text(size=17),
legend.text=element_text(size=13) )
# save gif
gg_animate(o, "output4020_old.gif", title_frame =T,
ani.width=1600, ani.height=820, dpi=800, interval = .4)
As you can see, the plot is still a bit clunky with too many numbers, poor resolution, and legend too small. I will be updating this code in the future as I improve the map. If you have any suggestion to improve the map or the code, please feel free to get in touch!
You can set number of digits to show. You just need to run this line of code at the beginning of your script:
options(digits=0)