@dwinter
Created February 11, 2012 01:19
Dealing with entrez in R
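library(RCurl)  # getURL(), curlEscape()
library(XML)    # xmlParse(), getNodeSet(), xmlValue()

# Search an NCBI database and return the IDs of the matching records
entrez_search <- function(dbase, term, retmax=6, ...){
    base_url <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=%s&term=%s&retmax=%i"
    # escape spaces and brackets in the search term so the URL is valid
    search <- sprintf(base_url, dbase, curlEscape(term), retmax)
    raw_result <- getURL(search)
    ids <- unlist(getNodeSet(xmlParse(raw_result), "//Id", fun=xmlValue))
    return(as.integer(ids))
}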
# Fetch one or more records from an NCBI database as text in the given format
entrez_fetch <- function(dbase, ids, format, ...){
    base_url <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=%s&id=%s&rettype=%s&retmode=text"
    # multiple ids are sent as a single comma-separated list
    url_string <- sprintf(base_url, dbase, paste(ids, collapse=","), format)
    records <- getURL(url_string)
    return(records)
}
# I like spiders, so let's find NCBI popsets containing the katipo, a local
# relative of the redback and black widow
katipo <- entrez_search("popset", "Latrodectus katipo[Organism]", retmax=5)
# for each id, fetch the dataset as a FASTA-formatted string
records <- sapply(katipo, function(pop_id) entrez_fetch("popset", pop_id, "fasta") )