MAGeCKFlute - Functional enrichment analysis in MAGeCKFlute

knitr::opts_chunk$set(tidy=FALSE, cache=TRUE,
                      dev="png", message=FALSE, error=FALSE, warning=TRUE)

Citation: if you use MAGeCKFlute in published research, please cite: Binbin Wang, Mei Wang, Wubing Zhang. "Integrative analysis of pooled CRISPR genetic screens using MAGeCKFlute." Nature Protocols (2019), doi: 10.1038/s41596-018-0113-7.t

Input data - weighted gene list

library(MAGeCKFlute)
file1 = file.path(system.file("extdata", package = "MAGeCKFlute"),
                  "testdata/rra.gene_summary.txt")
gdata = ReadRRA(file1)
genelist = gdata$Score
names(genelist) = gdata$id
genelist[1:5]

Enrichment analysis methods

MAGeCKFlute incorporates three enrichment methods, including Over-Representation Test (ORT), Gene Set Enrichment Analysis (GSEA), and Hypergeometric test (HGT). Here, ORT and GSEA are borrowed from R package clusterProfiler [@Yu2012].

Hypergeometric test

# Alternative functions EnrichAnalyzer and enrich.HGT.
hgtRes1 = EnrichAnalyzer(genelist[genelist< -1], method = "HGT")
head(hgtRes1@result)
# hgtRes2 = enrich.HGT(genelist[genelist< -1])
# head(hgtRes2@result)

Over-representation test

# Alternative functions EnrichAnalyzer and enrich.ORT.
ortRes1 = EnrichAnalyzer(genelist[genelist< -1], method = "ORT")
head(ortRes1@result)
# ortRes2 = enrich.ORT(genelist[genelist< -1])
# head(ortRes2@result)

Gene set enrichment analysis

# Alternative functions EnrichAnalyzer and enrich.GSE.
gseRes1 = EnrichAnalyzer(genelist, method = "GSEA")
head(gseRes1@result)
# gseRes2 = enrich.GSE(genelist)
# head(gseRes2@result)

Visualize enrichment results.

Barplot

require(ggplot2)
df = hgtRes1@result
df$logFDR = -log10(df$p.adjust)
p = BarView(df[1:5,], "Description", 'logFDR')
p = p + labs(x = NULL) + coord_flip()
p

# Or use function barplot from enrichplot package
barplot(hgtRes1, showCategory = 5)

Dot plot

## top: up-regulated pathways; 
## bottom: down-regulated pathways
EnrichedView(hgtRes1, top = 0, bottom = 5, mode = 1)
EnrichedView(hgtRes1, top = 0, bottom = 5, mode = 2)
dotplot(hgtRes1, showCategory = 5)

Other visualization functions from enrichplot [@Yu2018].

hgtRes1@result$geneID = hgtRes1@result$geneName
cnetplot(hgtRes1, 2)
heatplot(hgtRes1, showCategory = 3, foldChange=genelist)
emapplot(hgtRes1, layout="kk")

Visulization for GSEA enriched categories

#gseaplot
gseaplot(gseRes1, geneSetID = 1, title = gseRes1$Description[1])
gseaplot(gseRes1, geneSetID = 1, by = "runningScore", title = gseRes1$Description[1])
gseaplot(gseRes1, geneSetID = 1, by = "preranked", title = gseRes1$Description[1])
#or
gseaplot2(gseRes1, geneSetID = 1:3)

Type of gene sets for enrichment analysis

For enrichment analysis, MAGeCKFlute signifies the public available gene sets, including Pathways (PID, KEGG, REACTOME, BIOCARTA, C2CP), GO terms (GOBP, GOCC, GOMF), Complexes (CORUM) and molecular signature from MsigDB (c1, c2, c3, c4, c5, c6, c7, HALLMARK).

Functional enrichment analysis on GO terms and pathways

Analysis of high-throughput data increasingly relies on pathway annotation and functional information derived from Gene Ontology, which is also useful in the analysis of CRISPR screens.

## KEGG and REACTOME pathways
enrich = EnrichAnalyzer(geneList = genelist[genelist< -1], type = "KEGG+REACTOME")
EnrichedView(enrich, bottom = 5)
## Only KEGG pathways
enrich = EnrichAnalyzer(geneList = genelist[genelist< -1], type = "KEGG")
EnrichedView(enrich, bottom = 5)
## Gene ontology
enrichGo = EnrichAnalyzer(genelist[genelist< -1], type = "GOBP+GOMF")
EnrichedView(enrichGo, bottom = 5)

Protein complex analysis

Functional annotations from the pathways and GO are powerful in the context of network dynamics. However, the approach has limitations in particular for the analysis of CRISPR screenings, in which elements within a protein complex rather than complete pathways might have a strong selection. So we incorporate protein complex resource from CORUM database, which enable identification of essential protein complexes from the CRISPR screens.

enrichPro = EnrichAnalyzer(genelist[genelist< -1], type = "CORUM")
EnrichedView(enrichPro, bottom = 5)

Enrichment analysis on the combination of the gene sets

enrichComb = EnrichAnalyzer(genelist[genelist< -1], type = "GOBP+KEGG")
EnrichedView(enrichComb, bottom = 5)

Limit the size of gene sets for testing

enrich = EnrichAnalyzer(genelist[genelist< -1], type = "GOBP", limit = c(50, 500))
EnrichedView(enrich, bottom = 5)

Remove redundant results using EnrichedFilter.

enrich1 = EnrichAnalyzer(genelist[genelist< -1], type = "GOMF+GOBP")
enrich2 = EnrichAnalyzer(genelist[genelist< -1], type = "GOMF+GOBP", filter = TRUE)
enrich3 = EnrichedFilter(enrich1)
EnrichedView(enrich1, bottom = 15)
EnrichedView(enrich2, bottom = 15)
EnrichedView(enrich3, bottom = 15)

Session info

sessionInfo()

References



Try the MAGeCKFlute package in your browser

Any scripts or data that you put into this service are public.

MAGeCKFlute documentation built on Nov. 8, 2020, 5:40 p.m.