@abikoushi
Last active May 21, 2024 06:50
Search NCBI using R
library(rvest)
library(dplyr)
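qt <- "ID3" # search query (gene symbol)
# To open the search page in a browser instead:
# browseURL(paste0("https://www.ncbi.nlm.nih.gov/gene/?term=", qt))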
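# Search NCBI Gene for `qt` and return the "Summary" text of the first human hit.
NCBI_search <- function(qt){
  url_ncbi <- "https://www.ncbi.nlm.nih.gov/gene/?term="
  url_t <- paste0(url_ncbi, URLencode(qt, reserved = TRUE))
  html_t <- read_html(url_t)
  df_t <- html_table(html_t)
  # keep the results table, i.e. the first one with a "Description" column
  df_t <- df_t[[which(sapply(df_t, function(x) any(colnames(x) == "Description")))[1]]]
  # keep human entries and extract the numeric Gene ID row by row
  human <- dplyr::filter(df_t, grepl("human", Description)) %>%
    mutate(id = sub("^.*ID:[[:space:]]*([0-9]+).*$", "\\1", `Name/Gene ID`))
  if (nrow(human) == 0) return(NA_character_)
  # read_html() takes a single URL, so use the first human hit
  url_g <- paste0("https://www.ncbi.nlm.nih.gov/gene/", human$id[1])
  html_g <- read_html(url_g)
  txt_g <- html_element(html_g, xpath = '//*[@id="summaryDl"]') %>%
    html_text() %>%
    strsplit(split = "\n")
  # the summary text sits on the line after the "Summary" label
  i <- grep("Summary$", txt_g[[1]])
  res <- gsub("^ +", "", txt_g[[1]][i + 1])
  return(res)
}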
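The script defines NCBI_search() but never calls it. Below is a minimal usage sketch, together with an offline check of one way to pull the numeric Gene ID out of a "Name/Gene ID" cell. The helper name extract_gene_id and the sample cell strings are assumptions for illustration and are not part of the original gist; the real column text on the NCBI page may be laid out differently.

```r
# Hypothetical helper (not in the original gist): extract the digits that
# follow "ID:" in each cell, returning one value per input element.
extract_gene_id <- function(x) {
  sub("^.*ID:[[:space:]]*([0-9]+).*$", "\\1", x)
}

# Sample cells, assumed to resemble the scraped "Name/Gene ID" column.
cells <- c("ID3ID: 3399", "ID2 ID: 3398")
ids <- extract_gene_id(cells)
print(ids)

# The actual lookup needs network access, so run it only in an interactive
# session in which NCBI_search() has already been defined.
if (interactive() && exists("NCBI_search")) {
  print(NCBI_search("ID3"))
}
```

Extracting the ID per element avoids depending on the position of a whitespace-separated token, which shifts when the search returns more than one row.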