Created September 10, 2014 20:42
RSiteCatalyst Sankey Diagram - Single Page to Multiple Pages
library("RSiteCatalyst")
library("d3Network")

#### Authentication
SCAuth("key", "secret")

#### Get Pathing data: Single page, then ::anything:: pattern
pathpattern <- c("http://randyzwitch.com/big-data-hadoop-amazon-ec2-cloudera-part-1", "::anything::")
next_page <- QueuePathing("zwitchdev",
                          "2014-01-01",
                          "2014-08-31",
                          metric = "pageviews",
                          element = "page",
                          pathpattern,
                          top = 50000)

#Optional step: Cleaning my pagename URLs to remove the domain for clarity
next_page$step.1 <- sub("http://randyzwitch.com/", "",
                        next_page$step.1, ignore.case = TRUE)
next_page$step.2 <- sub("http://randyzwitch.com/", "",
                        next_page$step.2, ignore.case = TRUE)

#Get unique values of page name to create nodes df
#Create an index value, starting at 0
nodes <- as.data.frame(unique(c(next_page$step.1, next_page$step.2)))
names(nodes) <- "name"
nodes$nodevalue <- as.numeric(row.names(nodes)) - 1

#Convert string to numeric nodeid
links <- merge(next_page, nodes, by.x = "step.1", by.y = "name")
names(links) <- c("step.1", "step.2", "value", "source")
links <- merge(links, nodes, by.x = "step.2", by.y = "name")
#merge() puts the key column (step.2) first, so the labels follow that order
names(links) <- c("step.2", "step.1", "value", "source", "target")

#Create next page Sankey chart
d3output <- "C:/Users/rzwitc200/Desktop/sankey.html"
d3Sankey(Links = links, Nodes = nodes, Source = "source",
         Target = "target", Value = "value", NodeID = "name",
         fontsize = 12, nodeWidth = 100, file = d3output, width = 750, height = 600)
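
The node and link preparation above can be tried without an Adobe Analytics account. The following is a minimal sketch in base R, not the gist author's method: the page names and counts are made up, the `count` column name is an assumption about what the pathing result holds, and `match()` is swapped in for the two `merge()` calls so that row and column order stay predictable.

```r
# Toy stand-in for the QueuePathing() result (hypothetical pages and counts;
# the "count" column name is an assumption)
next_page <- data.frame(
  step.1 = c("home", "home", "home"),
  step.2 = c("about", "blog", "contact"),
  count  = c(30, 50, 20),
  stringsAsFactors = FALSE
)

# One row per unique page name, with a zero-based index for the Sankey chart
nodes <- data.frame(
  name = unique(c(next_page$step.1, next_page$step.2)),
  stringsAsFactors = FALSE
)
nodes$nodevalue <- seq_len(nrow(nodes)) - 1

# Look up each page's index by name; rows stay in their original order
links <- data.frame(
  source = nodes$nodevalue[match(next_page$step.1, nodes$name)],
  target = nodes$nodevalue[match(next_page$step.2, nodes$name)],
  value  = next_page$count
)

print(nodes)
print(links)
```

The resulting `links` and `nodes` data frames have the same `source`, `target`, `value`, and `name` columns that the `d3Sankey()` call above refers to.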
Oh, and you probably saw my LinkedIn notification, because I couldn't find a way to send my gratitude when browsing your site. My name is Kalen Daniel; I do web analytics, work on implementations, and create reports/visualizations (dumbed-down versions compared to R, at least, built with Excel/Google Sheets and scripts from clickstream data, which has been painful to say the least). I'm just mentioning that because you may have seen me poking around to find out more about your work.
Your work is fascinating, Randy. It has inspired me to take some programming and data/statistics classes (which I enrolled in after finding these gems last month). Thank you so much for everything you have done with RSiteCatalyst. I could swear I saw a link earlier this year for an Adobe Summit 2017 session that you held or were associated with. I'm kicking myself that I didn't get a chance to attend, if you did in fact hold a session at the Summit.