@ATpoint
Last active October 15, 2024 13:38
Efficient calculation of the percentage of cells expressing each gene per cluster, for example for scRNA-seq.
#' Given a matrix-like object with genes as rows and samples/cells as columns,
#' plus a group label for each column, calculate the percentage of columns per group
#' in which each gene is expressed.
#' No dependencies other than base R. The CsparseMatrix format (Matrix package),
#' commonly used in scRNA-seq, is supported.
#'
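#' @param data numeric matrix or data.frame-like object
#' @param group a vector of group labels (character, factor or numeric), one per column of data
#' @param threshold values strictly above this threshold are considered expressed, must be >= 0
#' @param digits round results to this number of digits
#'
#' @return numeric matrix of percentages with genes as rows and groups as columns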
#'
#' @examples
#' data <- matrix(rnorm(100, 2, 2), ncol=10)
#' group <- rep(c(1,2), each=5)
#' getPercentExpression(data, group, 0, 2)
#'
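getPercentExpression <- function(data, group, threshold=0, digits=2){
  if(ncol(data)!=length(group)) stop("ncol(data) != length(group)")
  if(!is.numeric(threshold) || threshold < 0) stop("threshold must be numeric and >= 0")
  # Binary indicator matrix: 1 where a value exceeds the threshold, else 0
  datar <- (data > threshold) * 1
  # Count expressing cells per group; result has groups as rows, genes as columns
  a <- base::rowsum(x=as.matrix(t(datar)), group=group)
  # Group sizes, ordered to match the rows of a
  b <- as.numeric(table(group)[rownames(a)])
  # Divide each row by its group size, convert to percent, return genes x groups
  f <- round(100*t(a/b), digits=digits)
  f
}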
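
The steps inside the function can be traced by hand on a tiny input. The sketch below is only an illustration: the matrix, gene names, cell names and group labels are made up, and the group sizes are deliberately unequal (three cells in `c1`, two in `c2`) to show that each group's count is divided by its own size.

```r
# Hypothetical toy data: 3 genes x 5 cells, values chosen so results are easy to verify
data <- matrix(c(0, 1, 2, 0, 0,
                 3, 0, 0, 0, 4,
                 1, 1, 1, 1, 1),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:5)))
group <- c("c1", "c1", "c1", "c2", "c2")

# 1 where the value is above the threshold (here 0), else 0
expressed <- (data > 0) * 1

# Sum the indicators per group: rows are groups, columns are genes
counts <- rowsum(t(expressed), group = group)

# Number of cells per group, in the same order as the rows of counts
sizes <- as.numeric(table(group)[rownames(counts)])

# Row i of counts is divided by sizes[i]; transpose back to genes x groups
pct <- round(100 * t(counts / sizes), digits = 2)
pct
#           c1  c2
# geneA  66.67   0
# geneB  33.33  50
# geneC 100.00 100
```

Calling `getPercentExpression(data, group, 0, 2)` on the same input should give the same matrix, since the sketch repeats the function's steps one by one.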