@timcdlucas
Created July 23, 2015 21:56
Scrape the number of references found on Google Scholar.
library(rvest)
library(magrittr)
# Our search string
sp <- "Myotis myotis"
spString <- tolower(gsub(' ', '+', sp))
url <- paste0('https://scholar.google.co.uk/scholar?hl=en&q=%22',
              spString, '%22&btnG=&as_sdt=1%2C5&as_sdtp=')
# Download the web page (read_html() replaces the deprecated html())
page <- read_html(url)
# Extract the number of references.
refs <- page %>%
  html_node('#gs_ab_md') %>%
  html_text() %>%
  gsub('About\\s(.*)\\sresults.*', '\\1', .) %>%
  gsub(',', '', .) %>%
  as.numeric()
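
# A possible extension (a sketch, not part of the original gist): wrap the
# parsing step in a function so it can be checked without a network request.
# `parse_result_count` is a hypothetical name, and the sample strings below are
# assumed examples of the text found in '#gs_ab_md'. The 'About' prefix is
# treated as optional, on the assumption that small counts are shown without it.
parse_result_count <- function(txt) {
  # Keep only the run of digits and commas that precedes "result(s)".
  count <- sub('^(About\\s)?([0-9,]+)\\sresults?.*$', '\\2', txt)
  # Drop thousands separators before converting to a number.
  as.numeric(gsub(',', '', count))
}

# Example calls on the assumed sample strings.
parse_result_count('About 12,300 results (0.05 sec)')
parse_result_count('3 results (0.02 sec)')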