kozo2 · November 29, 2019 21:54
diff --git a/functional-enrichment.Rmd b/functional-enrichment.Rmd
 ---
 title: "Functional Enrichment Analysis"
 author: "Kozo Nishida, Kristina Hanspers and Alex Pico"
 date: "`r Sys.Date()`"
 output:
  BiocStyle::html_document:
    toc: true
    toc_depth: 4
    df_print: paged
 ---
 ```{r, echo = FALSE}
 knitr::opts_chunk$set(
  eval=FALSE
 )
 ```

 *The R markdown is available from the pulldown menu for* Code *at the upper-right, choose "Download Rmd", or [download the Rmd from GitHub](https://raw.githubusercontent.com/nrnb/gsod2019_kozo_nishida/master/functional-enrichment.Rmd).*

 <hr />

 This tutorial will take you through a functional enrichment workflow in Cytoscape. Key aspects covered in this tutorial are:

 - Retrieve Networks and Pathways
 - Integrate and Explore Your Data
 - Perform Functional Enrichment Analysis
 - Functional Interpretation and Display

 <hr />

 # Installation
 ```{r, eval = FALSE}
 if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

 if(!"RCy3" %in% installed.packages())
  BiocManager::install("RCy3")

 library(RCy3)
 ```

 # Getting started
 First, launch Cytoscape and keep it running whenever using RCy3. Confirm that you have everything installed and running:
 ```{r}
    cytoscapePing()
    cytoscapeVersionInfo()
 ```

 # Setup

 Install the [Functional Enrichment Collection](http://apps.cytoscape.org/apps/functionalenrichmentcollection).
 (This installation takes time to complete. You can check the installation progress in the status bar of the Cytoscape desktop.)

 ```{r}
 installApp('Functional Enrichment Collection')
 ```

 And install the other R packages for this tutorial

 ```{r}
 install.packages("dplyr")
 ```


 # Load TCGA data

 **Create the network (with no edges) with the data.**

 - Read [GBM-TP.all.tsv](https://cytoscape.github.io/cytoscape-tutorials/protocols/data/GBM-TP-all.tsv).

 ```{r}
 df <- read.table("https://cytoscape.github.io/cytoscape-tutorials/protocols/data/GBM-TP-all.tsv", sep = '\t', header = TRUE)
 ```

 - Import the file to create a network. We set **GeneName** column as the **node ID** column.

 ```{r}
 library(dplyr)
 df %>% mutate(id = GeneName) -> df4createnet
 ```

 ```{r}
 createNetworkFromDataFrames(df4createnet)
 ```

  - You’ll see no edges, but that’s OK. This will create a grid of 1500 unconnected nodes, where each node represents a gene.
  - This will import all of the data in the spreadsheet and associate each row with the corresponding node.
  - You should be able to see this in the **Table Panel**.

 # Find significant expression changes

 We’ll use the `createColumnFilter` function to find the significant overall expression changes.

 - In this case we’re going to add a Column Filter.
 - This is for the "Mean.log2FC" column and we set the values to be between -5 and -1.
 - This will select all genes that are significantly underexpressed on average, across all samples (only one gene should be selected).

 ```{r}
 getTableColumnNames('node')
 ```

 ```{r}
 createColumnFilter('filter1', 'Mean.log2FC', c(-5,-1), "BETWEEN")
 ```

 - Repeat the same process as above, but set the values to be between 1 and 5.

 ```{r}
 createColumnFilter('filter2', 'Mean.log2FC', c(1,5), "BETWEEN")
 ```

 - You can combine filters to create a filter that matches any one of the filters.

 ```{r}
 createCompositeFilter("Default filter", c("filter1", "filter2"), "ANY")
 ```

 - No additional genes will be selected in this case (but this approach will be reused later).

 Only one gene shows significant average expression changes across all samples, **RPS4Y1**, a ribosomal protein that has been shown to be linked to glioblastoma, but it’s still a relatively unsatisfying result that only a single protein changes expression in glioblastoma patients.
 A closer examination might be in order...

 # Cluster to see expression patterns

 Let’s cluster all of the data to see if there are any significant patterns that have been averaged out by looking at all samples in the aggregate.
 **clusterMaker2** is a Cytoscape app that provides a number of clustering algorithms including hierarchical and k-means.

 - **Perform hierarchical clustering**.

 ```{r}
 cluster.cmd = 'cluster hierarchical metric=Pearson nodeAttributeList= showUI='
 commandsRun(cluster.cmd)
 ```


 In Cytoscape, select Apps → clusterMaker → Hierarchical cluster. This will bring the settings for hierarchical clustering. Select Pearson correlation for Distance metric andSelect all of the samples (columns beginning with TCGA-) in the Node attributes for cluster list. The checkboxes Cluster attributes as well as nodes, Ignore nodes/edges with no data, and Show TreeView when complete should all be checked. Then click OK. The resulting TreeView should look like the figure on the next slide.
	---
	title: "Functional Enrichment Analysis"
	author: "Kozo Nishida, Kristina Hanspers and Alex Pico"
	date: "`r Sys.Date()`"
	output:
	BiocStyle::html_document:
	toc: true
	toc_depth: 4
	df_print: paged
	---
	```{r, echo = FALSE}
	knitr::opts_chunk$set(
	eval=FALSE
	)
	```

	The R markdown is available from the pulldown menu for Code at the upper-right, choose "Download Rmd", or [download the Rmd from GitHub](https://raw.githubusercontent.com/nrnb/gsod2019_kozo_nishida/master/functional-enrichment.Rmd).

	<hr />

	This tutorial will take you through a functional enrichment workflow in Cytoscape. Key aspects covered in this tutorial are:

	- Retrieve Networks and Pathways
	- Integrate and Explore Your Data
	- Perform Functional Enrichment Analysis
	- Functional Interpretation and Display

	<hr />

	# Installation
	```{r, eval = FALSE}
	if (!requireNamespace("BiocManager", quietly = TRUE))
	install.packages("BiocManager")

	if(!"RCy3" %in% installed.packages())
	BiocManager::install("RCy3")

	library(RCy3)
	```

	# Getting started
	First, launch Cytoscape and keep it running whenever using RCy3. Confirm that you have everything installed and running:
	```{r}
	cytoscapePing()
	cytoscapeVersionInfo()
	```

	# Setup

	Install the [Functional Enrichment Collection](http://apps.cytoscape.org/apps/functionalenrichmentcollection).
	(This installation takes time to complete. You can check the installation progress in the status bar of the Cytoscape desktop.)

	```{r}
	installApp('Functional Enrichment Collection')
	```

	And install the other R packages for this tutorial

	```{r}
	install.packages("dplyr")
	```


	# Load TCGA data

	Create the network (with no edges) with the data.

	- Read [GBM-TP.all.tsv](https://cytoscape.github.io/cytoscape-tutorials/protocols/data/GBM-TP-all.tsv).

	```{r}
	df <- read.table("https://cytoscape.github.io/cytoscape-tutorials/protocols/data/GBM-TP-all.tsv", sep = '\t', header = TRUE)
	```

	- Import the file to create a network. We set GeneName column as the node ID column.

	```{r}
	library(dplyr)
	df %>% mutate(id = GeneName) -> df4createnet
	```

	```{r}
	createNetworkFromDataFrames(df4createnet)
	```

	- You’ll see no edges, but that’s OK. This will create a grid of 1500 unconnected nodes, where each node represents a gene.
	- This will import all of the data in the spreadsheet and associate each row with the corresponding node.
	- You should be able to see this in the Table Panel.

	# Find significant expression changes

	We’ll use the `createColumnFilter` function to find the significant overall expression changes.

	- In this case we’re going to add a Column Filter.
	- This is for the "Mean.log2FC" column and we set the values to be between -5 and -1.
	- This will select all genes that are significantly underexpressed on average, across all samples (only one gene should be selected).

	```{r}
	getTableColumnNames('node')
	```

	```{r}
	createColumnFilter('filter1', 'Mean.log2FC', c(-5,-1), "BETWEEN")
	```

	- Repeat the same process as above, but set the values to be between 1 and 5.

	```{r}
	createColumnFilter('filter2', 'Mean.log2FC', c(1,5), "BETWEEN")
	```

	- You can combine filters to create a filter that matches any one of the filters.

	```{r}
	createCompositeFilter("Default filter", c("filter1", "filter2"), "ANY")
	```

	- No additional genes will be selected in this case (but this approach will be reused later).

	Only one gene shows significant average expression changes across all samples, RPS4Y1, a ribosomal protein that has been shown to be linked to glioblastoma, but it’s still a relatively unsatisfying result that only a single protein changes expression in glioblastoma patients.
	A closer examination might be in order...

	# Cluster to see expression patterns

	Let’s cluster all of the data to see if there are any significant patterns that have been averaged out by looking at all samples in the aggregate.
	clusterMaker2 is a Cytoscape app that provides a number of clustering algorithms including hierarchical and k-means.

	- Perform hierarchical clustering.

	```{r}
	cluster.cmd = 'cluster hierarchical metric=Pearson nodeAttributeList= showUI='
	commandsRun(cluster.cmd)
	```


	In Cytoscape, select Apps → clusterMaker → Hierarchical cluster. This will bring the settings for hierarchical clustering. Select Pearson correlation for Distance metric andSelect all of the samples (columns beginning with TCGA-) in the Node attributes for cluster list. The checkboxes Cluster attributes as well as nodes, Ignore nodes/edges with no data, and Show TreeView when complete should all be checked. Then click OK. The resulting TreeView should look like the figure on the next slide.