Skip to content

Instantly share code, notes, and snippets.

@kozo2
Created November 29, 2019 21:54
Show Gist options
  • Save kozo2/97086b76c6e009d91f6ae28384332c76 to your computer and use it in GitHub Desktop.
Save kozo2/97086b76c6e009d91f6ae28384332c76 to your computer and use it in GitHub Desktop.
---
title: "Functional Enrichment Analysis"
author: "Kozo Nishida, Kristina Hanspers and Alex Pico"
date: "`r Sys.Date()`"
output:
BiocStyle::html_document:
toc: true
toc_depth: 4
df_print: paged
---
```{r, echo = FALSE}
knitr::opts_chunk$set(
eval=FALSE
)
```
*The R markdown is available from the pulldown menu for* Code *at the upper-right, choose "Download Rmd", or [download the Rmd from GitHub](https://raw.githubusercontent.com/nrnb/gsod2019_kozo_nishida/master/functional-enrichment.Rmd).*
<hr />
This tutorial will take you through a functional enrichment workflow in Cytoscape. Key aspects covered in this tutorial are:
- Retrieve Networks and Pathways
- Integrate and Explore Your Data
- Perform Functional Enrichment Analysis
- Functional Interpretation and Display
<hr />
# Installation
```{r, eval = FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
if(!"RCy3" %in% installed.packages())
BiocManager::install("RCy3")
library(RCy3)
```
# Getting started
First, launch Cytoscape and keep it running whenever using RCy3. Confirm that you have everything installed and running:
```{r}
cytoscapePing()
cytoscapeVersionInfo()
```
# Setup
Install the [Functional Enrichment Collection](http://apps.cytoscape.org/apps/functionalenrichmentcollection).
(This installation takes time to complete. You can check the installation progress in the status bar of the Cytoscape desktop.)
```{r}
installApp('Functional Enrichment Collection')
```
And install the other R packages for this tutorial
```{r}
install.packages("dplyr")
```
# Load TCGA data
**Create the network (with no edges) with the data.**
- Read [GBM-TP.all.tsv](https://cytoscape.github.io/cytoscape-tutorials/protocols/data/GBM-TP-all.tsv).
```{r}
df <- read.table("https://cytoscape.github.io/cytoscape-tutorials/protocols/data/GBM-TP-all.tsv", sep = '\t', header = TRUE)
```
- Import the file to create a network. We set **GeneName** column as the **node ID** column.
```{r}
library(dplyr)
df %>% mutate(id = GeneName) -> df4createnet
```
```{r}
createNetworkFromDataFrames(df4createnet)
```
- You’ll see no edges, but that’s OK. This will create a grid of 1500 unconnected nodes, where each node represents a gene.
- This will import all of the data in the spreadsheet and associate each row with the corresponding node.
- You should be able to see this in the **Table Panel**.
# Find significant expression changes
We’ll use the `createColumnFilter` function to find the significant overall expression changes.
- In this case we’re going to add a Column Filter.
- This is for the "Mean.log2FC" column and we set the values to be between -5 and -1.
- This will select all genes that are significantly underexpressed on average, across all samples (only one gene should be selected).
```{r}
getTableColumnNames('node')
```
```{r}
createColumnFilter('filter1', 'Mean.log2FC', c(-5,-1), "BETWEEN")
```
- Repeat the same process as above, but set the values to be between 1 and 5.
```{r}
createColumnFilter('filter2', 'Mean.log2FC', c(1,5), "BETWEEN")
```
- You can combine filters to create a filter that matches any one of the filters.
```{r}
createCompositeFilter("Default filter", c("filter1", "filter2"), "ANY")
```
- No additional genes will be selected in this case (but this approach will be reused later).
Only one gene shows significant average expression changes across all samples, **RPS4Y1**, a ribosomal protein that has been shown to be linked to glioblastoma, but it’s still a relatively unsatisfying result that only a single protein changes expression in glioblastoma patients.
A closer examination might be in order...
# Cluster to see expression patterns
Let’s cluster all of the data to see if there are any significant patterns that have been averaged out by looking at all samples in the aggregate.
**clusterMaker2** is a Cytoscape app that provides a number of clustering algorithms including hierarchical and k-means.
- **Perform hierarchical clustering**.
```{r}
cluster.cmd = 'cluster hierarchical metric=Pearson nodeAttributeList= showUI='
commandsRun(cluster.cmd)
```
In Cytoscape, select Apps → clusterMaker → Hierarchical cluster. This will bring the settings for hierarchical clustering. Select Pearson correlation for Distance metric andSelect all of the samples (columns beginning with TCGA-) in the Node attributes for cluster list. The checkboxes Cluster attributes as well as nodes, Ignore nodes/edges with no data, and Show TreeView when complete should all be checked. Then click OK. The resulting TreeView should look like the figure on the next slide.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment