David Winter dwinter

 library(ape)
 data(woodmouse)
 #Just the third codon position (assuming alignment is in-frame)
 woodmouse[, seq(3, ncol(woodmouse), by=3) ]

## 15 DNA sequences in binary format stored in a matrix.
##
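
The same indexing idea extends to all three codon positions. The sketch below is only an illustration: `codon_cols` is a hypothetical helper (not part of `ape`), and it assumes, like the snippet above, that the alignment starts in frame.

```r
library(ape)
data(woodmouse)

# Hypothetical helper: column indices for one codon position (1, 2 or 3),
# assuming the first column of the alignment is the first base of a codon.
codon_cols <- function(n_sites, position) {
  seq(position, n_sites, by = 3)
}

n_sites <- ncol(woodmouse)

# One sub-alignment per codon position
by_position <- lapply(1:3, function(p) woodmouse[, codon_cols(n_sites, p)])

# Number of variable (segregating) sites at each codon position
sapply(by_position, function(aln) length(seg.sites(aln)))
```

If the in-frame assumption holds, third positions would usually be expected to show the most variation, which gives a quick sanity check on the frame.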

#Finding the number of sequence records by taxonomic group

Ricardo wants to know how to find the number of sequence records associated with sub-groups within a given taxon. This example grew a bit too big to make into a comment, so here it is in gist form.

So, let's find out how many DNA sequences are present in genbank for each

For loops can (but don't have to) be really slow in R, so I wanted to compare the answers provided to this question: a straightforward for-loop approach versus various workarounds that speed the process up.
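
As a rough illustration of why loops "can but don't have to" be slow, here is a minimal sketch using a made-up task (squaring a vector). These are not the functions compared in this post; the names `square_grow`, `square_prealloc` and `square_vec` are hypothetical. The slow version grows its result with `c()`, which copies the vector on every iteration.

```r
# Hypothetical task: square every element of a numeric vector.

# Growing the result with c() copies the whole vector on each iteration.
square_grow <- function(x) {
  res <- c()
  for (xi in x) {
    res <- c(res, xi^2)
  }
  res
}

# Preallocating the result keeps the loop but avoids the repeated copying.
square_prealloc <- function(x) {
  res <- numeric(length(x))
  for (i in seq_along(x)) {
    res[i] <- x[i]^2
  }
  res
}

# Vectorised arithmetic runs the loop in compiled code.
square_vec <- function(x) x^2

x <- runif(1e4)

# All three approaches should give the same answer
stopifnot(isTRUE(all.equal(square_grow(x), square_prealloc(x))),
          isTRUE(all.equal(square_grow(x), square_vec(x))))

# Timings will vary by machine; compare the elapsed column
system.time(square_grow(x))
system.time(square_prealloc(x))
system.time(square_vec(x))
```

The gap between the first two versions tends to widen quickly as the input gets longer, which suggests the cost comes from growing the result rather than from the loop itself.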

First, the functions

f_for <- function(a,b){
  res <- c()

#Demo for com call

library(XML)
devtools::load_all("~/src/rentrez")

	```{r, echo=FALSE}
	knitr::opts_knit$set(upload.fun = knitr::imgur_upload, base.url = NULL)
	```


	# and now, all animals and with nested taxonomic ranks

	So, a few people liked [this example](https://gist.github.com/dwinter/8d7bde0579daf7466508)
	of using `rentrez` to investigate the taxonomic distribution of sequences in
	Genbank. I thought it might be fun to extend it a little. Specifically:

	library(rcrossref)

	add_to_bib <- function(doi, bib="ms.bib", print_ref=TRUE){
	  ref <- cr_cn(doi)
	  cat("\n", ref, "\n\n", file=bib, append=TRUE)
	  if(print_ref){
	    cat(ref, "\n")
	  }
	}

	traits <- read.csv("flag_traits.csv")
	dist_mat <- dist(traits, "manhattan")
	tr <- ape::nj(dist_mat)
	plot(tr)

	##Rooting the tree (arbitrarily) will make a nicer plot
	#rooted <- ape::root(tr, node=50)
	#plot(rooted)
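
Since `flag_traits.csv` is not included here, the same steps can be tried on made-up data. The sketch below is only an illustration: the trait matrix is random and hypothetical (`toy_traits`, `flag_1` ... `flag_8`), and it roots on an arbitrary tip rather than on a node number.

```r
library(ape)

set.seed(1)

# Made-up binary trait matrix standing in for flag_traits.csv (hypothetical data)
toy_traits <- matrix(
  rbinom(8 * 6, size = 1, prob = 0.5),
  nrow = 8,
  dimnames = list(paste0("flag_", 1:8), paste0("trait_", 1:6))
)

# Manhattan distance counts the traits at which two rows differ
toy_dist <- dist(toy_traits, method = "manhattan")

# Neighbour-joining tree from the distance matrix
toy_tree <- nj(toy_dist)

# Root (arbitrarily) on the first tip to get a nicer plot
rooted <- root(toy_tree, outgroup = "flag_1", resolve.root = TRUE)
plot(rooted)
```

Rooting on a named tip avoids having to look up a node number, which would differ from one data set to the next.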

	#Rentrez 1.0 released

	A new version of `rentrez`, our package for the NCBI's EUtils API, is making
	its way around the CRAN mirrors. This release represents a substantial
	improvement to `rentrez`, including a [new vignette](https://cran.r-project.org/web/packages/rentrez/vignettes/rentrez_tutorial.html)
	that documents the whole package.

	This post describes some of the new things in `rentrez`, and gives us a chance
	to thank some of the people that have contributed to this package's development.

	#Won't work by cross linking
	```r
	tax_search <- entrez_search(db="taxonomy", term="Acetobacter[ORGN] AND genus[RANK]")
	linked_recs <- entrez_link(dbfrom="taxonomy", db="genome", id=tax_search$ids)
	linked_recs$links$taxonomy_genome
	```

	```
	#[1] "18005"
	```