Created
November 9, 2018 16:30
Quick testing of the Jensen-Shannon implementation in the philentropy package
## Testing the JSD function in the philentropy package
## Documentation at:
## https://www.rdocumentation.org/packages/philentropy/versions/0.2.0/topics/JSD

## Load library
library(philentropy)
## We will test the Jensen-Shannon divergence (JSD) on the application of scoring gene regulatory networks, as described in this
## paper: https://www.cell.com/cell-reports/fulltext/S2211-1247(18)31634-6?dgcid=raven_jbs_etoc_email#secsectitle0070
## In this article, the authors used two distributions to represent regulon activity and cell identity:
## Distribution 1: Regulon activity scores (RAS), normalized so that they sum to 1
## Distribution 2: Cell identity, coded as 1 or 0 for a specific cell type and normalized so that it sums to 1

## Now let's use some very simple toy examples to see whether the JSD function works as we would expect and to learn how to
## run it correctly

## Positive control
## First, we will test perfect overlap between the two distributions, that is, we have high RAS values in all cells of that cell type
## and RAS values of 0 in all other cells. This is obviously not representative of biology but gives us a positive control
## to test the Jensen-Shannon divergence. With two identical distributions, the Jensen-Shannon divergence should be 0,
## and the regulon specificity score (RSS), which the authors compute as 1 - sqrt(JSD), should be 1.
## Note that sqrt(JSD) on its own is the Jensen-Shannon distance.
## Make RAS distribution
dist1 <- c(100, 0, 0, 100, 0, 100)
dist1_norm <- dist1 / sum(dist1)
## Make cell identity distribution
dist2 <- c(1, 0, 0, 1, 0, 1)
dist2_norm <- dist2 / sum(dist2)
## Stack the two distributions as the rows of a matrix (JSD expects one probability vector per row)
dist_df <- rbind(dist1_norm, dist2_norm)
## Calculate the Jensen-Shannon divergence
jsd_divergence <- philentropy::JSD(dist_df)
## Convert to the regulon specificity score: 1 - sqrt(JSD), where sqrt(JSD) is the Jensen-Shannon distance
jsd_distance <- 1 - sqrt(jsd_divergence)
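
## Optional cross-check for the positive control above (a sketch, not part of philentropy):
## compute the Jensen-Shannon divergence by hand in base R. This assumes log base 2
## (the default unit documented for philentropy's JSD) and the convention 0 * log(0) = 0.
## The names kl_log2, jsd_manual, p_check and q_check are made up for this illustration.
kl_log2 <- function(p, q) {
  nz <- p > 0                          # entries with p == 0 contribute nothing to the sum
  sum(p[nz] * log2(p[nz] / q[nz]))     # Kullback-Leibler divergence KL(p || q) in bits
}
jsd_manual <- function(p, q) {
  m <- (p + q) / 2                     # mixture of the two distributions
  0.5 * kl_log2(p, m) + 0.5 * kl_log2(q, m)
}
## Same toy vectors as the positive control, normalized so they sum to 1
p_check <- c(100, 0, 0, 100, 0, 100) / 300
q_check <- c(1, 0, 0, 1, 0, 1) / 3
jsd_check <- jsd_manual(p_check, q_check)
rss_check <- 1 - sqrt(jsd_check)
## For identical distributions jsd_check should be 0 and rss_check should be 1;
## if philentropy behaves as assumed, jsd_divergence and jsd_distance should agree with them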

## Negative control
## Next, we will test the opposite scenario, where RAS values are high only in cells outside the cell type,
## so the two distributions do not overlap at all. With the default log2 unit, the Jensen-Shannon divergence
## should be 1 and the RSS should be 0.
## Make RAS distribution
dist1 <- c(100, 0, 0, 100, 0, 100)
dist1_norm <- dist1 / sum(dist1)
## Make cell identity distribution
dist2 <- c(0, 1, 1, 0, 1, 0)
dist2_norm <- dist2 / sum(dist2)
## Stack the two distributions as the rows of a matrix (JSD expects one probability vector per row)
dist_df <- rbind(dist1_norm, dist2_norm)
## Calculate the Jensen-Shannon divergence
jsd_divergence <- philentropy::JSD(dist_df)
## Convert to the regulon specificity score: 1 - sqrt(JSD), where sqrt(JSD) is the Jensen-Shannon distance
jsd_distance <- 1 - sqrt(jsd_divergence)
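
## A possible refactoring (a sketch, not taken from the paper or the package): the two controls above
## repeat the same steps, so they could be wrapped in one helper. The name rss_score is hypothetical;
## it only combines calls already used in this script and sets unit = "log2" explicitly,
## which is the documented default of philentropy's JSD.
library(philentropy)
rss_score <- function(ras, identity) {
  ## Normalize both vectors so that each sums to 1, one distribution per row
  dist_mat <- rbind(ras / sum(ras), identity / sum(identity))
  ## Score = 1 - sqrt(JSD); unname() drops any name attached to the returned value
  1 - sqrt(unname(philentropy::JSD(dist_mat, unit = "log2")))
}
## Intermediate case with made-up RAS values: the regulon is mostly, but not exclusively,
## active in the cell type, so the score should fall strictly between the two controls
rss_partial <- rss_score(c(100, 20, 0, 80, 10, 90), c(1, 0, 0, 1, 0, 1))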