knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7.2,
  fig.height = 4.3,
  fig.retina = 2
)
aPEAR is designed to help you notice the most important biological themes in your enrichment analysis results. It analyses the gene lists of the pathways and detects clusters of redundant, overlapping gene sets.

Let's begin by performing a simple gene set enrichment analysis with clusterProfiler:
# Load all the packages:
library(data.table)
library(ggplot2)
library(dplyr)
library(stringr)
library(clusterProfiler)
library(DOSE)
library(org.Hs.eg.db)
library(aPEAR)

data(geneList)

# Perform enrichment using clusterProfiler
set.seed(42)
enrich <- gseGO(geneList, OrgDb = org.Hs.eg.db, ont = 'CC')
enrichmentNetwork()
enrichmentNetwork is the most important function exported by aPEAR. It detects clusters of similar pathways and generates a ggplot2 visualization. The only input it needs is your enrichment result:
set.seed(654824)
enrichmentNetwork(enrich@result)
Internally, enrichmentNetwork calls two functions, findPathClusters and plotPathClusters, which are described in more detail below.
What if my enrichment results do not come from clusterProfiler?
aPEAR currently recognizes input from clusterProfiler and gProfileR. However, if you have custom enrichment input, do not worry!
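aPEAR accepts any kind of enrichment input as long as it is formatted correctly. The only requirement is that the gene list of each pathway is known. You should format your data so that:
- It is a data.frame.
- It has a Description column naming each pathway.
- It has a pathwayGenes column holding the gene list of each pathway, formatted like the core_enrichment column of clusterProfiler (gene IDs separated by "/").
- It has a column for coloring the nodes, specified with colorBy.
- It has a column for sizing the nodes, specified with nodeSize.

For example, you might format your data like this: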
# Preview of the expected layout (first five pathways, gene lists truncated for display)
enrichmentData <- enrich@result %>%
  as.data.table() %>%
  .[ 1:5 ] %>%
  .[ , list(Description, pathwayGenes = core_enrichment, NES, Size = setSize) ] %>%
  .[ , pathwayGenes := str_trunc(pathwayGenes, 20) ]

enrichmentData[ 1:5 ]

# Full table with untruncated gene lists
enrichmentData <- enrich@result %>%
  as.data.table() %>%
  .[ , list(Description, pathwayGenes = core_enrichment, NES, Size = setSize) ]
Then tell enrichmentNetwork which columns to use for node color and node size:
p <- enrichmentNetwork(enrichmentData, colorBy = 'NES', nodeSize = 'Size', verbose = TRUE)
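If your results do not come from clusterProfiler at all, a table with the same columns can be assembled by hand. The following is a minimal sketch in base R using made-up pathways and values. The helper checkEnrichmentInput is hypothetical and not part of aPEAR; it only checks that the columns named in this section are present before the table is passed on.

```r
# Hypothetical enrichment results from some other tool (all values are made up).
customEnrichment <- data.frame(
  Description  = c("toy pathway A", "toy pathway B", "toy pathway C"),
  pathwayGenes = c("G1/G2/G3", "G2/G3/G4", "G7/G8"),  # gene IDs separated by "/"
  NES          = c(1.8, 1.5, -2.1),
  Size         = c(3L, 3L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not an aPEAR function): fail early if a column is missing.
checkEnrichmentInput <- function(df, colorBy, nodeSize) {
  stopifnot(is.data.frame(df))
  required <- c("Description", "pathwayGenes", colorBy, nodeSize)
  missingCols <- setdiff(required, names(df))
  if (length(missingCols) > 0) {
    stop("Missing column(s): ", paste(missingCols, collapse = ", "))
  }
  invisible(TRUE)
}

checkEnrichmentInput(customEnrichment, colorBy = "NES", nodeSize = "Size")
```

A real table would need far more than three pathways for clustering to be meaningful; the toy data only illustrates the expected layout.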
Good news: you can use the p-values to color the nodes! Just specify the colorBy column and colorType = 'pval':
set.seed(348934)
enrichmentNetwork(enrich@result, colorBy = 'pvalue', colorType = 'pval', pCutoff = -5)
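The negative value of pCutoff suggests that p-values are colored on a log10 scale. As a rough illustration only, and assuming that pCutoff acts as a lower bound on that scale (an assumption, not something this vignette states), the color values could be derived as follows in base R:

```r
# Made-up p-values for three pathways.
pvalues <- c(0.04, 1e-3, 1e-8)

# Assumed behaviour: log10-transform, then clamp at pCutoff so that a few
# extremely small p-values do not dominate the color scale.
pCutoff <- -5
colorValue <- pmax(log10(pvalues), pCutoff)

colorValue
```

Under this assumption, every pathway with a p-value below 1e-5 would receive the same, most intense color.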
findPathClusters()
If your goal is only to obtain the clusters of redundant pathways, the function findPathClusters is the way to go. It accepts a data.frame with the enrichment results and returns a list containing the pathway clusters and the similarity matrix:
clusters <- findPathClusters(enrich@result, cluster = 'hier', minClusterSize = 6)

clusters$clusters[ 1:5 ]

pathways <- clusters$clusters[ 1:5, Pathway ]
clusters$similarity[ pathways, pathways ]
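To make the idea of a pathway similarity matrix and hierarchical clustering concrete, here is a small base R sketch on made-up gene sets. It uses the Jaccard index and hclust purely for illustration; it is not aPEAR's implementation, and the metric and clustering settings that findPathClusters actually uses may differ.

```r
# Toy pathways: each entry is a "/"-separated gene list (made-up gene names).
pathwayGenes <- c(
  pathwayA = "G1/G2/G3/G4",
  pathwayB = "G2/G3/G4/G5",
  pathwayC = "G7/G8/G9",
  pathwayD = "G8/G9/G10"
)
geneSets <- strsplit(pathwayGenes, "/", fixed = TRUE)

# Jaccard index: shared genes divided by all genes in either set.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Pairwise similarity matrix, with pathway names as row and column names.
toySimilarity <- sapply(geneSets, function(a) {
  sapply(geneSets, function(b) jaccard(a, b))
})

# Hierarchical clustering on the dissimilarity (1 - similarity), cut into two groups.
tree <- hclust(as.dist(1 - toySimilarity), method = "average")
toyClusters <- cutree(tree, k = 2)

toySimilarity
toyClusters
```

In this toy example, pathwayA and pathwayB share three of five genes and land in one cluster, while pathwayC and pathwayD form the other.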
For more information about available similarity metrics, clustering methods, cluster naming conventions, and other available parameters, see ?aPEAR.theme.
plotPathClusters()
To visualize clustering results obtained with findPathClusters, use the function plotPathClusters:
set.seed(238923)
plotPathClusters(
  enrichment = enrich@result,
  sim = clusters$similarity,
  clusters = clusters$clusters,
  fontSize = 4,
  outerCutoff = 0.01, # Decrease cutoff between clusters and show some connections
  drawEllipses = TRUE
)
For more parameter options, see ?aPEAR.theme.
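The comment on outerCutoff in the call above hints at how connections between clusters are chosen. As a sketch only, and assuming that outerCutoff is a minimum similarity required for an edge between pathways from different clusters to be drawn, the selection could look like this in base R (toy data; all names are hypothetical and none of this is aPEAR code):

```r
# Toy similarity matrix and cluster assignment for three pathways.
toySim <- matrix(
  c(1,    0.6,   0.02,
    0.6,  1,     0.005,
    0.02, 0.005, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C"))
)
toyCluster <- c(A = 1, B = 1, C = 2)
outerCutoff <- 0.01

# List each pathway pair once (upper triangle of the matrix).
pairs <- which(upper.tri(toySim), arr.ind = TRUE)
edges <- data.frame(
  from = rownames(toySim)[pairs[, "row"]],
  to = colnames(toySim)[pairs[, "col"]],
  similarity = toySim[pairs],
  stringsAsFactors = FALSE
)

# Assumed rule: a between-cluster edge is shown only if it reaches outerCutoff.
edges$betweenClusters <- toyCluster[edges$from] != toyCluster[edges$to]
shownOuterEdges <- edges[edges$betweenClusters & edges$similarity >= outerCutoff, ]

shownOuterEdges
```

With the cutoff at 0.01, only the A to C connection survives; a higher cutoff such as 0.5 would hide every between-cluster edge in this toy example.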