@thoughtfulbloke
Created April 12, 2021 03:04
library(rvest)

# Page to monitor for newly published links
target_url <- "https://www.publicservice.govt.nz/resources/proactive-releases/"
webpage <- read_html(target_url)

# Collect every anchor once, then pull its href and visible text
anchors <- webpage %>% html_nodes("a")
links <- anchors %>% html_attr("href")
link_text <- anchors %>% html_text()
link_set <- data.frame(links, link_text, stringsAsFactors = FALSE)

# Compare against the links saved on the previous run, if there was one
if (file.exists("linkcurrent.csv")) {
  linkcurrent <- read.csv("linkcurrent.csv", stringsAsFactors = FALSE)
  newlinks <- link_set[!link_set$links %in% linkcurrent$links, ]
} else {
  # First run: every link counts as new
  newlinks <- link_set
}

# Save the current snapshot for next time, plus the links new since the last run
write.csv(link_set, file = "linkcurrent.csv", row.names = FALSE)
write.csv(newlinks, file = "newlinks.csv", row.names = FALSE)