Skip to content

Instantly share code, notes, and snippets.

@vjcitn
Created December 7, 2019 18:16
Show Gist options
  • Save vjcitn/3911446d62db3fbb1c15be54116ba989 to your computer and use it in GitHub Desktop.
Save vjcitn/3911446d62db3fbb1c15be54116ba989 to your computer and use it in GitHub Desktop.
highly_vbl R function from Brussels talk
# genesym is a gene symbol as a character(1)
# compend is a SummarizedExperiment 'like' result of HumanTranscriptomeCompendium::htx_load()
# stat is a function that will compute a statistic on the log(selected gene's expression+1) over samples
# pctile is the percentile for selecting studies
highly_vbl = function(genesym, compend, stat=mad, pctile=.9) {
stopifnot("gene_name" %in% colnames(rowData(compend)))
stopifnot("study_accession" %in% colnames(colData(compend)))
stopifnot("study_title" %in% colnames(colData(compend)))
ind = which(rowData(compend)$gene_name == genesym)[1]
if (is.na(ind)) stop("could not find genesym in compend")
thr = quantile(study_stats <-
vapply(split(as.numeric(log(assay(compend[ind,])+1)),
compend$study_accession), stat, numeric(1)),pctile)
kp = which(study_stats > thr)
as.data.frame(colData(compend)) %>% filter(study_accession %in% names(kp)) %>%
group_by(study_title) %>% summarise(n=n()) %>% arrange(desc(n))
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment