@crazyhottommy
Created October 22, 2018 03:11
get mouse gene length
# Use BiocManager instead of biocLite: https://twitter.com/strnr/status/1022451016736927745?lang=en
# More details: https://cran.r-project.org/web/packages/BiocManager/vignettes/BiocManager.html
# BiocManager requires R >= 3.5.0. With an older version of R you can still use biocLite
# to install Bioconductor packages.

install.packages("BiocManager")
BiocManager::install("TxDb.Mmusculus.UCSC.mm9.knownGene")
BiocManager::install("org.Mm.eg.db")
# for the mm10 assembly, install this package instead:
# BiocManager::install("TxDb.Mmusculus.UCSC.mm10.knownGene")
library(TxDb.Mmusculus.UCSC.mm9.knownGene)
library(org.Mm.eg.db)
library(readr)

txdb <- TxDb.Mmusculus.UCSC.mm9.knownGene
mm9_genes<- genes(txdb)
mm9_genes

## note that dplyr and AnnotationDbi both have a function called select;
## call it with an explicit namespace (dplyr::select or AnnotationDbi::select)

# map the Entrez ID to gene symbol
gene_symbol<- AnnotationDbi::select(org.Mm.eg.db, keys=mm9_genes$gene_id, 
                                    columns="SYMBOL", keytype="ENTREZID")

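# the IDs must line up one-to-one before the symbols are attached; stop if they do not
stopifnot(isTRUE(all.equal(mm9_genes$gene_id, gene_symbol$ENTREZID)))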

mm9_genes$symbol<- gene_symbol$SYMBOL
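
# A hedged alternative sketch, not part of the original gist: AnnotationDbi::mapIds()
# returns exactly one value per key, so its result always has the same length and
# order as the keys, even if a key were to map to more than one symbol.
# multiVals = "first" is an assumption about which symbol to keep in that case;
# symbol_by_id is an illustrative name.
symbol_by_id <- AnnotationDbi::mapIds(org.Mm.eg.db, keys = mm9_genes$gene_id,
                                      column = "SYMBOL", keytype = "ENTREZID",
                                      multiVals = "first")
# mapIds() returns a vector named by the keys; compare it with the select() result
all.equal(unname(symbol_by_id), gene_symbol$SYMBOL)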

mm9_genes
# length of each gene (genomic span from start to end, introns included)
width(mm9_genes)

df<- data.frame(EntrezID = mm9_genes$gene_id, Symbol = mm9_genes$symbol, Gene_length = width(mm9_genes))

write_tsv(df, "~/mm9_gene_length.txt")
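
# A minimal sketch, not part of the original gist, of one common alternative measure:
# the exonic length per gene, i.e. the total width of the union of all exons
# annotated to the gene, which excludes introns (unlike width(mm9_genes)).
# The names exons_by_gene, exonic_length and df_exonic are illustrative.
library(GenomicFeatures)
exons_by_gene <- exonsBy(txdb, by = "gene")
# reduce() merges overlapping exons within each gene so shared bases are counted once
exonic_length <- sum(width(reduce(exons_by_gene)))
df_exonic <- data.frame(EntrezID = names(exonic_length),
                        Exonic_length = as.integer(exonic_length))
head(df_exonic)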
