@boshek
Last active October 11, 2017 23:48
Correlation Network Plots by Group with purrr, corrr and ggraph

Correlation Network Plots by Groups

Packages

To run this you will need the following packages:

library(corrr)
library(dplyr)
library(purrr)
library(tidyr)
library(nycflights13)
library(igraph)
library(ggraph)

nycflights13 Weather data

Data Munging

The goal here is to visualize a correlation matrix as a network plot, with one panel per level of a grouping variable. In this case we use the weather data from the nycflights13 package, grouped by airport (origin). The first step is some basic data munging: drop the date and time columns and keep only the numeric variables. Because the data frame is grouped by origin first, the grouping column is carried along even though it is not numeric:

weather_sub <- weather %>%
  group_by(origin) %>%
  select(-(year:hour)) %>%
  select_if(is.numeric)

Correlations with corrr and purrr

We use the corrr package to compute the correlations. The trick is that they must be computed separately for each group, in this case origin. To do that we nest the data by origin and use the map function from the purrr package to apply correlate and then stretch to each group's data frame. We then keep only the pairs whose correlation coefficient exceeds 0.3 in absolute value and convert the result into a graph object that ggraph can plot. Note that compose has to be called as purrr::compose, because igraph is loaded after purrr and exports its own compose function, which masks the purrr one:

weather_cor <- weather_sub %>%
  group_by(origin) %>% ## redundant but worth it for illustration
  nest() %>%
  mutate(data = map(data, purrr::compose(stretch, correlate))) %>% 
  unnest(data) %>% ## name the column; a bare unnest() is deprecated in tidyr >= 1.0
  select(x, y, r, origin) %>%
  filter(abs(r) > .3) %>%
  graph_from_data_frame(directed = FALSE)
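
For readers who want to see what the nest/map step computes, here is a minimal base R sketch of the same idea: split by group, correlate within each group, and reshape each matrix to long format. It is not the corrr implementation. The built-in iris data stands in for the weather data, Species plays the role of origin, the helper name cor_long_by_group is hypothetical, and the use of pairwise-complete observations is an assumption made for this sketch.

```r
# Base R sketch of "correlate each group, then stretch to long format".
# iris stands in for the weather data; Species plays the role of origin.
# cor_long_by_group is a hypothetical helper, not part of any package.
cor_long_by_group <- function(df, group_col) {
  # One data frame of numeric columns per group level
  groups <- split(df[setdiff(names(df), group_col)], df[[group_col]])
  pieces <- lapply(names(groups), function(g) {
    m <- cor(groups[[g]], use = "pairwise.complete.obs")
    # Take the upper triangle only, so each pair of variables appears once
    idx <- which(upper.tri(m), arr.ind = TRUE)
    data.frame(
      x = rownames(m)[idx[, "row"]],
      y = colnames(m)[idx[, "col"]],
      r = m[idx],
      group = g,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, pieces)
}

iris_cor <- cor_long_by_group(iris, "Species")

# Same threshold as above: keep pairs with |r| > 0.3
strong <- iris_cor[abs(iris_cor$r) > 0.3, ]
head(strong)
```

Unlike this sketch, the pipeline above filters the full stretched output, so a pair of variables can appear in both orders; taking the upper triangle is one way to keep each pair once.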

Plot with ggraph

ggraph(weather_cor, layout = "kk") +
  geom_edge_link(aes(edge_alpha = abs(r), color = r), edge_width = 5) +
  guides(edge_alpha = "none") +
  scale_edge_colour_gradientn(limits = c(-1, 1), colors = heat.colors(5)) +
  geom_node_point(color = "black", size = 4) +
  geom_node_text(aes(label = name), repel = TRUE) +
  facet_edges(~origin) +
  theme_minimal() 
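
The plot relies on how graph_from_data_frame reads its input: the first two columns are taken as the endpoints of each edge, and any remaining columns are stored as edge attributes. That is why x and y are selected first, and why r and origin are available to aes() and facet_edges(). A small sketch with a made-up edge list (the values below are hypothetical, not results from the weather data) shows this:

```r
library(igraph)

# Hypothetical toy edge list in the same shape as the filtered correlations
edges <- data.frame(
  x = c("temp", "temp", "humid"),
  y = c("dewp", "humid", "visib"),
  r = c(0.9, -0.4, -0.6),
  origin = c("EWR", "EWR", "JFK"),
  stringsAsFactors = FALSE
)

g <- graph_from_data_frame(edges, directed = FALSE)

edge_attr_names(g)  # the extra columns, stored on the edges
V(g)$name           # node names, taken from the x and y columns
E(g)$origin         # the attribute that facet_edges(~origin) splits on
```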