@tomsing1
Created December 28, 2022 04:27
Retrieving Gene Ontology (GO) annotations using Bioconductor annotation packages
library(AnnotationDbi)
library(org.Hs.eg.db)
library(GO.db)
kTerm <- "GO:0007265"
# retrieve all genes annotated directly with the GO term
df <- AnnotationDbi::select(org.Hs.eg.db, keys = c(kTerm),
                            columns = c("ENTREZID", "ENSEMBL"),
                            keytype = "GO")
nrow(df) # 117 (some duplicate EntrezIds because we requested Ensembl ids, too)
length(unique(df$ENTREZID)) # 89
# retrieve all genes annotated with the term (kTerm) or any of its descendant (child) terms
df <- AnnotationDbi::select(org.Hs.eg.db, keys = c(kTerm),
                            columns = c("ENTREZID", "ENSEMBL"),
                            keytype = "GOALL")
length(unique(df$ENTREZID)) # 353
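# Sketch (not part of the original gist): one way to see where the extra GOALL genes
# come from is to look up the term's descendants explicitly and query them with
# keytype = "GO". This assumes kTerm is a Biological Process term (hence GOBPOFFSPRING)
# and that GO.db and org.Hs.eg.db were built from the same GO release; if they were not,
# the counts may differ slightly from the GOALL result. The names `offspring`, `goKeys`
# and `dfOffspring` are illustrative.
offspring <- GO.db::GOBPOFFSPRING[[kTerm]]  # character vector of descendant GO ids
# keep only the ids that actually carry direct annotations in org.Hs.eg.db
goKeys <- intersect(c(kTerm, offspring),
                    AnnotationDbi::keys(org.Hs.eg.db, keytype = "GO"))
dfOffspring <- AnnotationDbi::select(org.Hs.eg.db, keys = goKeys,
                                     columns = "ENTREZID",
                                     keytype = "GO")
length(unique(dfOffspring$ENTREZID)) # expected to be close to the GOALL count above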
# retrieve all GO-term-to-gene mappings, including genes each term inherits from its descendant terms
tbl <- AnnotationDbi::toTable(org.Hs.egGO2ALLEGS)
nrow(tbl) # 3.4 million rows
table(tbl$Evidence) # see http://geneontology.org/docs/guide-go-evidence-codes/
# retrieve the TERM and DEFINITION (and all other GO.db columns) for one or more GO terms
AnnotationDbi::select(GO.db, keys=kTerm, keytype="GOID", columns=columns(GO.db))