Created
September 1, 2016 15:38
-
-
Save cvitolo/aa3bc6f08a8394f653442e276568f9b3 to your computer and use it in GitHub Desktop.
Geoprocessing based on user-defined areas using the rnrfa package
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--- | |
title: "NRFA stations in EU-NUTS1 regions" | |
author: "Claudia Vitolo" | |
date: "13 March 2016" | |
output: html_document | |
--- | |
```{r setup, include=FALSE} | |
knitr::opts_chunk$set(echo = FALSE, | |
message = FALSE, | |
cache = FALSE, | |
warning = FALSE) | |
``` | |
The Nomenclature of Territorial Units for Statistics (NUTS), is a standard for | |
referencing the administrative divisions of European countries. There are three | |
levels of NUTS defined. The United Kingdom is divided, at the first level of | |
NUTS, into 12 regions. | |
It is possible to identify which NRFA station falls within each region using | |
simple geo-processing operations. The procedure makes use of the __catalogue()__ | |
function to get the list of stations and the __rgdal__ package to identify the | |
name of the region each station falls within. Finally, a new column, containing | |
the name of the region, is appended to the list of stations. The resulting | |
object is also available in the data folder of the __rnrfa__ package, under the | |
name __stationSummary__. | |
```{r regions} | |
library(rgdal) | |
# Download Eurostats - Administrative units from: | |
# http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2013_01M_SH.zip | |
temp <- tempfile() | |
download.file("http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2013_01M_SH.zip",temp) | |
unzip(temp) | |
# Read shapefiles and set CRS | |
NUTS1regions <- readOGR(dsn = "NUTS_2013_01M_SH/data", | |
layer = "NUTS_RG_01M_2013", | |
verbose=FALSE) | |
rows <- which(NUTS1regions@data$STAT_LEVL_ == 1 & | |
(substr(NUTS1regions@data$NUTS_ID, 1, 2) == "UK")) | |
NUTS1regions <- NUTS1regions[rows,] | |
NUTS1regions <- spTransform(NUTS1regions, CRS("+init=epsg:4326")) | |
# Lookup table for names and IDs | |
df <- data.frame('NUTS_ID' = c('UKC', 'UKD', 'UKE', 'UKF', | |
'UKG', 'UKH', 'UKI', 'UKJ', 'UKK', | |
'UKL', 'UKM', 'UKN'), | |
'Region' = c('North East (England)', 'North West (England)', | |
'Yorkshire and the Humber', 'East Midlands', | |
'West Midlands', 'East of England', 'Greater London', | |
'South East (England)', 'South West (England)', | |
'Wales', 'Scotland', 'Northern Ireland')) | |
# Join the two tables | |
library(dplyr) | |
NUTS1regions@data <- NUTS1regions@data %>% left_join(df) | |
# Load stations and set CRS | |
library(rnrfa) | |
stations <- catalogue() | |
coordinates(stations) <- ~lon + lat | |
proj4string(stations) <- CRS("+init=epsg:4326") | |
# Add empty ID column | |
stations@data$NUTS_ID <- NA | |
# Populate the ID column | |
for (id in NUTS1regions@data$NUTS_ID){ | |
temp <- stations %over% NUTS1regions[NUTS1regions@data$NUTS_ID == id,] | |
rows <- which(!is.na(temp$NUTS_ID)) | |
stations@data$NUTS_ID[rows] <- id | |
} | |
# Join stations and df to add the Region column | |
stations@data <- stations@data %>% left_join(df) | |
# Compute number of recorded years | |
stations$RecordedYears <- stations$gdfEnd - stations$gdfStart | |
stations <- as.data.frame(stations) | |
``` | |
The table below summarises the number of stations per region, the area of each | |
region (in Km2), and the density of stations (number of stations/Km2). | |
```{r density} | |
df$`# stations` <- lapply(df$Region, | |
function(x) length(which(stations$Region == x))) | |
# Re-project NUTS1regions to British National Grid to get areas in Km2 | |
NUTS1regions = spTransform(NUTS1regions, CRS("+init=epsg:27700")) | |
df$`Area (Km2)` <- round(sapply(slot(NUTS1regions, "polygons"), | |
slot, "area")/10^6, 4) | |
df$Density <- round(as.numeric(df$`# stations`)/ | |
as.numeric(as.character(df$`Area (Km2)`)), 3) | |
library(knitr) | |
kable(df) | |
NI <- stations$RecordedYears[which(stations$Region == "Northern Ireland")] | |
``` | |
It is trivial to summarise the metada by region, for instance the boxplot below | |
shows the distribution of years of recording. Northern Ireland seem to have the | |
most homogeneous network, with recording years in the following range: | |
[`r summary(NI)[[1]]`, `r summary(NI)[[6]]`]. | |
Only three regions have stations with more than 100 years of recordings: | |
East of England, London and Wales. | |
```{r boxplot} | |
# Simplify labels | |
stations$Region <- gsub(' and the Humber', '', stations$Region) | |
stations$Region <- gsub(' \\(England\\)', '', stations$Region) | |
# Customise margins | |
par(mar=c(7.5,4,1,1)) | |
boxplot(RecordedYears ~ Region, data = stations, las = 2, | |
ylab="Years of recordings") | |
abline(h = 100, col="red", lty = 2) | |
``` |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment