cvitolo · September 1, 2016 15:38
diff --git a/rnrfa_stationsEU-NUTS1regions.Rmd b/rnrfa_stationsEU-NUTS1regions.Rmd
 ---
 title: "NRFA stations in EU-NUTS1 regions"
 author: "Claudia Vitolo"
 date: "13 March 2016"
 output: html_document
 ---

 ```{r setup, include=FALSE}
 knitr::opts_chunk$set(echo = FALSE, 
                      message = FALSE,
                      cache = FALSE,
                      warning = FALSE)
 ```

 The Nomenclature of Territorial Units for Statistics (NUTS), is a standard for 
 referencing the administrative divisions of European countries. There are three 
 levels of NUTS defined. The United Kingdom is divided, at the first level of 
 NUTS, into 12 regions. 

 It is possible to identify which NRFA station falls within each region using 
 simple geo-processing operations. The procedure makes use of the __catalogue()__
 function to get the list of stations and the __rgdal__ package to identify the 
 name of the region each station falls within. Finally, a new column, containing 
 the name of the region, is appended to the list of stations. The resulting 
 object is also available in the data folder of the __rnrfa__ package, under the 
 name __stationSummary__. 

 ```{r regions}
 library(rgdal)
 # Download Eurostats - Administrative units from: 
 # http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2013_01M_SH.zip
 temp <- tempfile()
 download.file("http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2013_01M_SH.zip",temp)
 unzip(temp)

 # Read shapefiles and set CRS
 NUTS1regions <- readOGR(dsn = "NUTS_2013_01M_SH/data", 
                        layer = "NUTS_RG_01M_2013", 
                        verbose=FALSE)
 rows <- which(NUTS1regions@data$STAT_LEVL_ == 1 &
                (substr(NUTS1regions@data$NUTS_ID, 1, 2) == "UK"))
 NUTS1regions <- NUTS1regions[rows,]
 NUTS1regions <- spTransform(NUTS1regions, CRS("+init=epsg:4326"))

 # Lookup table for names and IDs
 df <- data.frame('NUTS_ID' = c('UKC', 'UKD', 'UKE', 'UKF', 
                               'UKG', 'UKH', 'UKI', 'UKJ', 'UKK',
                               'UKL', 'UKM', 'UKN'),
                 'Region' = c('North East (England)', 'North West (England)', 
                              'Yorkshire and the Humber', 'East Midlands',
                              'West Midlands', 'East of England', 'Greater London', 
                              'South East (England)', 'South West (England)', 
                              'Wales', 'Scotland', 'Northern Ireland'))

 # Join the two tables
 library(dplyr)
 NUTS1regions@data <- NUTS1regions@data %>% left_join(df)

 # Load stations and set CRS
 library(rnrfa)
 stations <- catalogue()
 coordinates(stations) <- ~lon + lat
 proj4string(stations) <- CRS("+init=epsg:4326")

 # Add empty ID column
 stations@data$NUTS_ID <- NA

 # Populate the ID column
 for (id in NUTS1regions@data$NUTS_ID){
  temp <- stations %over% NUTS1regions[NUTS1regions@data$NUTS_ID == id,]
  rows <- which(!is.na(temp$NUTS_ID))
  stations@data$NUTS_ID[rows] <- id
 }

 # Join stations and df to add the Region column
 stations@data <- stations@data %>% left_join(df)

 # Compute number of recorded years
 stations$RecordedYears <- stations$gdfEnd - stations$gdfStart
 stations <- as.data.frame(stations)
 ```

 The table below summarises the number of stations per region, the area of each 
 region (in Km2), and the density of stations (number of stations/Km2).

 ```{r density}
 df$`# stations` <- lapply(df$Region,
 function(x) length(which(stations$Region == x)))
 # Re-project NUTS1regions to British National Grid to get areas in Km2
 NUTS1regions = spTransform(NUTS1regions, CRS("+init=epsg:27700"))
 df$`Area (Km2)` <- round(sapply(slot(NUTS1regions, "polygons"), 
 slot, "area")/10^6, 4)
 df$Density <- round(as.numeric(df$`# stations`)/
 as.numeric(as.character(df$`Area (Km2)`)), 3)
 library(knitr)
 kable(df)

 NI <- stations$RecordedYears[which(stations$Region == "Northern Ireland")]
 ```

 It is trivial to summarise the metada by region, for instance the boxplot below
 shows the distribution of years of recording. Northern Ireland seem to have the 
 most homogeneous network, with recording years in the following range: 
 [`r summary(NI)[[1]]`, `r summary(NI)[[6]]`]. 
 Only three regions have stations with more than 100 years of recordings: 
 East of England, London and Wales.

 ```{r boxplot}
 # Simplify labels
 stations$Region <- gsub(' and the Humber', '', stations$Region)
 stations$Region <- gsub(' \\(England\\)', '', stations$Region)

 # Customise margins
 par(mar=c(7.5,4,1,1))
 boxplot(RecordedYears ~ Region, data = stations, las = 2, 
        ylab="Years of recordings")
 abline(h = 100, col="red", lty = 2)
 ```
	---
	title: "NRFA stations in EU-NUTS1 regions"
	author: "Claudia Vitolo"
	date: "13 March 2016"
	output: html_document
	---

	```{r setup, include=FALSE}
	knitr::opts_chunk$set(echo = FALSE,
	message = FALSE,
	cache = FALSE,
	warning = FALSE)
	```

	The Nomenclature of Territorial Units for Statistics (NUTS), is a standard for
	referencing the administrative divisions of European countries. There are three
	levels of NUTS defined. The United Kingdom is divided, at the first level of
	NUTS, into 12 regions.

	It is possible to identify which NRFA station falls within each region using
	simple geo-processing operations. The procedure makes use of the __catalogue()__
	function to get the list of stations and the __rgdal__ package to identify the
	name of the region each station falls within. Finally, a new column, containing
	the name of the region, is appended to the list of stations. The resulting
	object is also available in the data folder of the __rnrfa__ package, under the
	name __stationSummary__.

	```{r regions}
	library(rgdal)
	# Download Eurostats - Administrative units from:
	# http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2013_01M_SH.zip
	temp <- tempfile()
	download.file("http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2013_01M_SH.zip",temp)
	unzip(temp)

	# Read shapefiles and set CRS
	NUTS1regions <- readOGR(dsn = "NUTS_2013_01M_SH/data",
	layer = "NUTS_RG_01M_2013",
	verbose=FALSE)
	rows <- which(NUTS1regions@data$STAT_LEVL_ == 1 &
	(substr(NUTS1regions@data$NUTS_ID, 1, 2) == "UK"))
	NUTS1regions <- NUTS1regions[rows,]
	NUTS1regions <- spTransform(NUTS1regions, CRS("+init=epsg:4326"))

	# Lookup table for names and IDs
	df <- data.frame('NUTS_ID' = c('UKC', 'UKD', 'UKE', 'UKF',
	'UKG', 'UKH', 'UKI', 'UKJ', 'UKK',
	'UKL', 'UKM', 'UKN'),
	'Region' = c('North East (England)', 'North West (England)',
	'Yorkshire and the Humber', 'East Midlands',
	'West Midlands', 'East of England', 'Greater London',
	'South East (England)', 'South West (England)',
	'Wales', 'Scotland', 'Northern Ireland'))

	# Join the two tables
	library(dplyr)
	NUTS1regions@data <- NUTS1regions@data %>% left_join(df)

	# Load stations and set CRS
	library(rnrfa)
	stations <- catalogue()
	coordinates(stations) <- ~lon + lat
	proj4string(stations) <- CRS("+init=epsg:4326")

	# Add empty ID column
	stations@data$NUTS_ID <- NA

	# Populate the ID column
	for (id in NUTS1regions@data$NUTS_ID){
	temp <- stations %over% NUTS1regions[NUTS1regions@data$NUTS_ID == id,]
	rows <- which(!is.na(temp$NUTS_ID))
	stations@data$NUTS_ID[rows] <- id
	}

	# Join stations and df to add the Region column
	stations@data <- stations@data %>% left_join(df)

	# Compute number of recorded years
	stations$RecordedYears <- stations$gdfEnd - stations$gdfStart
	stations <- as.data.frame(stations)
	```

	The table below summarises the number of stations per region, the area of each
	region (in Km2), and the density of stations (number of stations/Km2).

	```{r density}
	df$`# stations` <- lapply(df$Region,
	function(x) length(which(stations$Region == x)))
	# Re-project NUTS1regions to British National Grid to get areas in Km2
	NUTS1regions = spTransform(NUTS1regions, CRS("+init=epsg:27700"))
	df$`Area (Km2)` <- round(sapply(slot(NUTS1regions, "polygons"),
	slot, "area")/10^6, 4)
	df$Density <- round(as.numeric(df$`# stations`)/
	as.numeric(as.character(df$`Area (Km2)`)), 3)
	library(knitr)
	kable(df)

	NI <- stations$RecordedYears[which(stations$Region == "Northern Ireland")]
	```

	It is trivial to summarise the metada by region, for instance the boxplot below
	shows the distribution of years of recording. Northern Ireland seem to have the
	most homogeneous network, with recording years in the following range:
	[`r summary(NI)[[1]]`, `r summary(NI)[[6]]`].
	Only three regions have stations with more than 100 years of recordings:
	East of England, London and Wales.

	```{r boxplot}
	# Simplify labels
	stations$Region <- gsub(' and the Humber', '', stations$Region)
	stations$Region <- gsub(' \\(England\\)', '', stations$Region)

	# Customise margins
	par(mar=c(7.5,4,1,1))
	boxplot(RecordedYears ~ Region, data = stations, las = 2,
	ylab="Years of recordings")
	abline(h = 100, col="red", lty = 2)
	```