An introduction to _aPEAR_

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7.2,
  fig.height = 4.3,
  fig.retina = 2
)

aPEAR is designed to help you notice the most important biological themes in your enrichment analysis results. It analyses the gene lists of the pathways and detects clusters of redundant overlapping gene sets.
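To make the idea of "redundant overlapping gene sets" concrete, here is a minimal base R sketch that scores the overlap between gene lists with the Jaccard index. The pathway names and gene symbols are made up, and Jaccard is assumed here purely for illustration; it is not meant to reproduce what aPEAR computes internally.

```r
# Toy gene sets (hypothetical pathway names and gene symbols).
geneSets <- list(
  pathwayA = c("TP53", "MDM2", "CDKN1A", "ATM"),
  pathwayB = c("TP53", "MDM2", "CDKN1A", "CHEK2"),
  pathwayC = c("ACTB", "MYH9", "TPM1")
)

# Jaccard index: number of shared genes divided by the number of genes in either set.
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Pairwise similarity matrix over all gene sets.
idx <- seq_along(geneSets)
similarity <- outer(
  idx, idx,
  Vectorize(function(i, j) jaccard(geneSets[[i]], geneSets[[j]]))
)
dimnames(similarity) <- list(names(geneSets), names(geneSets))

round(similarity, 2)
```

Pathways A and B share three of their five distinct genes, so they score high and would be natural candidates for the same cluster, while pathway C shares nothing with either.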

Let's begin by performing a simple gene set enrichment analysis with clusterProfiler:

# Load all the packages:
library(data.table)
library(ggplot2)
library(dplyr)
library(stringr)
library(clusterProfiler)
library(DOSE)
library(org.Hs.eg.db)
library(aPEAR)
# Load the example ranked gene list shipped with DOSE
data(geneList)

# Perform enrichment using clusterProfiler
set.seed(42)
enrich <- gseGO(geneList, OrgDb = org.Hs.eg.db, ont = 'CC')

Generate an enrichment network with enrichmentNetwork()

enrichmentNetwork is the most important function exported by aPEAR. It detects clusters of similar pathways and generates a ggplot2 visualization. The only thing it asks you to provide is your enrichment result:

set.seed(654824)
enrichmentNetwork(enrich@result)

Internally, enrichmentNetwork calls two functions, findPathClusters and plotPathClusters, which are described in more detail below.

What if I performed my enrichment analysis using another method, not clusterProfiler?

aPEAR currently recognizes input from clusterProfiler and gProfileR. However, if you have custom enrichment input, do not worry!

aPEAR accepts any kind of enrichment input as long as it is formatted correctly. The only requirement is that the gene list of each pathway is known.

For example, you might format your data like this, with a column of pathway descriptions, a column holding the genes of each pathway, and columns to use for node color and node size:

# Preview: first five pathways, with the gene strings truncated for display
enrichmentData <- enrich@result %>%
  as.data.table() %>%
  .[ 1:5 ] %>%
  .[ , list(Description, pathwayGenes = core_enrichment, NES, Size = setSize) ] %>%
  .[ , pathwayGenes := str_trunc(pathwayGenes, 20) ]
enrichmentData[ 1:5 ]

# Full table used for the network: all pathways, untruncated gene strings
enrichmentData <- enrich@result %>%
  as.data.table() %>%
  .[ , list(Description, pathwayGenes = core_enrichment, NES, Size = setSize) ]

Then tell enrichmentNetwork which columns to use for node color and node size:

p <- enrichmentNetwork(enrichmentData, colorBy = 'NES', nodeSize = 'Size', verbose = TRUE)
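If your results come from a tool that returns one gene vector per pathway, a table with the same four columns can be assembled in base R. The sketch below is only illustrative: the pathway names, genes, and scores are invented, and it assumes genes should be joined with "/" to mirror the core_enrichment column that clusterProfiler produces.

```r
# Hypothetical results from another tool: one gene vector per pathway.
genesByPathway <- list(
  "cell cycle"   = c("CDK1", "CCNB1", "PLK1"),
  "DNA repair"   = c("BRCA1", "RAD51", "ATM", "CHEK2"),
  "cytoskeleton" = c("ACTB", "TPM1")
)
# Hypothetical enrichment scores, in the same order as the pathways.
scores <- c(2.1, 1.7, -1.4)

customEnrichment <- data.frame(
  Description = names(genesByPathway),
  # Collapse each gene vector into a single "/"-separated string
  # (assumed format, mirroring clusterProfiler's core_enrichment).
  pathwayGenes = unname(vapply(genesByPathway, paste, character(1), collapse = "/")),
  NES = scores,
  Size = unname(lengths(genesByPathway)),
  stringsAsFactors = FALSE
)

customEnrichment
```

A table shaped like this has the same columns as enrichmentData above, so the colorBy and nodeSize arguments can point at NES and Size in the same way. A real analysis would need far more than three pathways for clustering to be meaningful.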

What if I performed ORA and do not have the normalized enrichment score (NES)?

Good news: you can use the p-values to color the nodes! Just specify the colorBy column and colorType = 'pval':

set.seed(348934)
enrichmentNetwork(enrich@result, colorBy = 'pvalue', colorType = 'pval', pCutoff = -5)
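The negative value of pCutoff suggests that p-values are handled on a log10 scale when coloring nodes. That reading is an assumption, not something stated in this vignette. Under that assumption, the base R sketch below shows what clamping log10 p-values at a floor of -5 would look like, using made-up p-values.

```r
# Made-up p-values spanning several orders of magnitude.
pvalues <- c(1e-12, 3e-7, 2e-4, 0.03)

# Move to the log10 scale, where smaller (more negative) means more significant.
logP <- log10(pvalues)

# Clamp at a floor so a few extreme values do not dominate the color scale.
# The floor of -5 mirrors the pCutoff value used above (assumed interpretation).
cutoff <- -5
clamped <- pmax(logP, cutoff)

data.frame(pvalue = pvalues, log10p = round(logP, 2), clamped = round(clamped, 2))
```

With a floor like this, the two most significant pathways would receive the same color, and the remaining ones would be spread over the rest of the scale.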

Find pathway clusters with findPathClusters()

If your goal is only to obtain the clusters of redundant pathways, the function findPathClusters is the way to go. It accepts a data.frame with the enrichment results and returns a list of the pathway clusters and the similarity matrix:

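# Cluster the pathways hierarchically, with a minimum cluster size of 6
clusters <- findPathClusters(enrich@result, cluster = 'hier', minClusterSize = 6)

# Cluster assignments (first five rows)
clusters$clusters[ 1:5 ]

# Pairwise similarity between those same five pathways
pathways <- clusters$clusters[ 1:5, Pathway ]
clusters$similarity[ pathways, pathways ]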

For more information about available similarity metrics, clustering methods, cluster naming conventions, and other available parameters, see ?findPathClusters.
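As a rough illustration of what hierarchical clustering with a minimum cluster size involves, the following base R sketch clusters a toy similarity matrix with hclust and discards clusters that are too small. The matrix values, the average linkage, and the cut height are arbitrary choices for this example and are not taken from aPEAR.

```r
# Toy similarity matrix for five hypothetical pathways (1 = identical gene sets).
pathwayNames <- c("p1", "p2", "p3", "p4", "p5")
similarity <- matrix(
  c(1.0, 0.8, 0.7, 0.1, 0.0,
    0.8, 1.0, 0.6, 0.0, 0.1,
    0.7, 0.6, 1.0, 0.1, 0.0,
    0.1, 0.0, 0.1, 1.0, 0.2,
    0.0, 0.1, 0.0, 0.2, 1.0),
  nrow = 5, byrow = TRUE, dimnames = list(pathwayNames, pathwayNames)
)

# hclust works on distances, so convert similarity to 1 - similarity.
tree <- hclust(as.dist(1 - similarity), method = "average")

# Cut the tree at a distance of 0.5 (arbitrary illustrative threshold).
membership <- cutree(tree, h = 0.5)

# Keep only clusters with at least minClusterSize members.
minClusterSize <- 2
clusterSizes <- table(membership)
bigEnough <- as.integer(names(clusterSizes)[clusterSizes >= minClusterSize])
membership[membership %in% bigEnough]
```

Here p1, p2, and p3 end up in one cluster, while p4 and p5 each form a singleton and are dropped by the size filter.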

Visualize pathway clusters with plotPathClusters()

To visualize clustering results obtained with findPathClusters, use the function plotPathClusters:

set.seed(238923)
plotPathClusters(
  enrichment = enrich@result,
  sim = clusters$similarity,
  clusters = clusters$clusters,
  fontSize = 4,
  outerCutoff = 0.01, # Decrease cutoff between clusters and show some connections
  drawEllipses = TRUE
)

For more parameter options, see ?aPEAR.theme.
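The comment on outerCutoff above says that lowering it shows more connections between clusters. One way to picture this, assuming the cutoff is a similarity threshold applied to pairs of pathways from different clusters, is the base R sketch below. The data and the helper betweenEdges are hypothetical and only illustrate the idea.

```r
# Toy similarity matrix and cluster labels for four hypothetical pathways.
pathwayNames <- c("p1", "p2", "p3", "p4")
similarity <- matrix(
  c(1.00, 0.70, 0.05, 0.00,
    0.70, 1.00, 0.02, 0.00,
    0.05, 0.02, 1.00, 0.60,
    0.00, 0.00, 0.60, 1.00),
  nrow = 4, byrow = TRUE, dimnames = list(pathwayNames, pathwayNames)
)
cluster <- c(p1 = "A", p2 = "A", p3 = "B", p4 = "B")

# Enumerate all unordered pathway pairs and look up their similarity.
pairs <- t(combn(pathwayNames, 2))
edges <- data.frame(
  from = pairs[, 1],
  to = pairs[, 2],
  similarity = similarity[pairs],
  stringsAsFactors = FALSE
)
edges$betweenClusters <- unname(cluster[edges$from] != cluster[edges$to])

# Hypothetical helper: between-cluster pairs whose similarity exceeds the cutoff.
betweenEdges <- function(cutoff) {
  edges[edges$betweenClusters & edges$similarity > cutoff, c("from", "to", "similarity")]
}

nrow(betweenEdges(0.1))   # 0 between-cluster edges
nrow(betweenEdges(0.01))  # 2 between-cluster edges
```

In this toy setting, lowering the threshold from 0.1 to 0.01 lets the weak links p1-p3 and p2-p3 through, which is the kind of effect the comment in the plotting call describes.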


aPEAR documentation built on July 9, 2023, 6:16 p.m.