@taddallas
Last active April 27, 2022 13:02
R script to scrape citation counts from the Web of Science. Used for a project involving host-parasite interactions, but may be useful to someone out there. It's finicky, and I take no responsibility for it. Best of luck.
# RSelenium method to scrape citation counts from WOS
**Updates**: The RSelenium package has changed parts of its interface, which broke this old gist. Web of Science has also changed its layout, which broke it further. The current code is a hacky patch that allows some WoS interactivity and data acquisition.
### Currently set up to work with the RSelenium docker image
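The script below starts Selenium through `rsDriver()`. If a Selenium server running in a Docker container is used instead, the connection step might look like the following minimal sketch. The helper name `connect_docker`, the host port `4445`, and the `selenium/standalone-firefox` image are assumptions, not part of the original gist.

```r
# Sketch only: connect to a Selenium server that is already running in Docker.
# Assumed container start command (host port 4445 is an arbitrary choice):
#   docker run -d -p 4445:4444 selenium/standalone-firefox
connect_docker <- function(host = "localhost", port = 4445L) {
  # remoteDriver() only describes the connection; open() starts the session
  remDr <- RSelenium::remoteDriver(remoteServerAddr = host,
                                   port = port,
                                   browserName = "firefox")
  remDr$open(silent = TRUE)
  remDr
}
```

With this approach, `browser <- connect_docker()` would stand in for `driver$client`, and there would be no local `driver$server` to stop at the end; the container is stopped with Docker instead.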
library(RSelenium)
library(plyr); library(dplyr)
# start a local Selenium server and open a Firefox session
driver <- rsDriver(port = 4567L,
                   browser = "firefox",
                   version = "latest")
# https://resulumit.com/teaching/scrp_workshop.html#190
browser <- driver$client
server <- driver$server
browser$navigate("https://www.webofscience.com/wos/woscc/advanced-search")
query <- c('mus musculus', 'gorilla gorilla')
ret <- vector()
for (i in seq_along(query)) {
  # find the search box, type the query, and press Enter ("\uE007")
  webElem <- browser$findElement(using = 'css', value = "textarea.search-criteria-input")
  # paste0() keeps the quoted terms free of padding spaces
  webElem$sendKeysToElement(list(paste0("TS=('", query[i], "' AND 'parasite')"), "\uE007"))
  Sys.sleep(5)  # wait for the results page to load
  # return to the search page and clear the box for the next query
  browser$goBack()
  Sys.sleep(5)
  webElem <- browser$findElement(using = 'css', value = "textarea.search-criteria-input")
  webElem$clearElement()
  print(i)
}
webElem <- browser$findElement(using = 'css', value = "app-history-entries-list.ng-star-inserted")
ret <- webElem$getElementText()
# close the browser session, then shut down the Selenium server
browser$close()
driver[['server']]$stop()
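At this point `ret` holds the raw text of the search-history panel, not a table of counts. The following is a minimal sketch of how that text might be turned into one row per query. It assumes each history entry shows the `TS=(...)` query on one line and the number of results on a later line of its own. That layout is an assumption about the WoS page, and `parse_history_text` is a hypothetical helper, so the patterns will likely need adjusting.

```r
# Hypothetical helper: pair each "TS=(...)" line with the first purely
# numeric line that follows it (and precedes the next query line).
# The real layout of the WoS history panel may differ.
parse_history_text <- function(txt) {
  lines <- trimws(unlist(strsplit(as.character(unlist(txt)), "\n", fixed = TRUE)))
  lines <- lines[nzchar(lines)]
  query_idx <- which(grepl("^TS=", lines))
  count_idx <- which(grepl("^[0-9][0-9,]*$", lines))
  # each query "owns" the lines up to the next query (or the end of the text)
  bounds <- c(query_idx[-1], length(lines) + 1)
  hits <- vapply(seq_along(query_idx), function(k) {
    cand <- count_idx[count_idx > query_idx[k] & count_idx < bounds[k]]
    if (length(cand) == 0) return(NA_real_)
    as.numeric(gsub(",", "", lines[cand[1]]))  # drop thousands separators
  }, numeric(1))
  data.frame(query = lines[query_idx], hits = hits, stringsAsFactors = FALSE)
}

# Made-up text, only to show the assumed input shape (not real WoS output)
example_txt <- "TS=('mus musculus' AND 'parasite')\n1,234\nTS=('gorilla gorilla' AND 'parasite')\n56"
parse_history_text(example_txt)
```

Applied to the scraped text, this would be `parse_history_text(ret)`. Queries with no numeric line before the next query get `NA` rather than borrowing a neighbour's count.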