Abstract

Enrichment analysis is a popular method for analyzing gene sets generated from transcriptomics and proteomics experiments. Here, we developed Key2Enrich, a novel all-in-one R package for gene set enrichment analysis. It combines the whole enrichment analysis pipeline in one package: it provides enrichment functions and visualization methods, and it maps and renders input sample data on the relevant pathway graphs. Key2Enrich automatically downloads the pathway graph data, parses the data file, maps the genes of the input data onto the pathway, and renders the pathway graphs. Currently, Key2Enrich supports three species: human, mouse and rat. Both the enrichment analysis and the pathway mapping functions can be integrated with other functional annotation tools for large-scale, fully automated analysis pipelines.

Get started

library(Key2Enrich)

Load the sample data shipped with the package. Input files should be in CSV format: each line of the file is a data record, and each record consists of one or more fields separated by commas. The example uses Affymetrix microarray data from mice fed a western diet (WD) vs. mice fed a low-fat diet (LF). 748 differentially expressed genes were identified at a cutoff of FDR < 0.05 using the limma R package.
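As a minimal sketch of the input shape described above, the snippet below reads a small comma-separated table into a data frame with base R. Only the `entrezgene` column name is taken from later calls in this document; the other column names and all values are made up for illustration and are not the packaged sample data.

```r
# Hypothetical CSV content: only the 'entrezgene' column name comes from this
# document; the remaining columns and all values are illustrative assumptions.
csvText <- "entrezgene,logFC,adj.P.Val
11806,1.8,0.001
20250,-1.2,0.012
14104,0.9,0.034"

# Read the comma-separated records into a data frame, one row per gene.
# colClasses keeps the gene IDs as character strings rather than numbers.
exampleInput <- read.csv(text = csvText,
                         colClasses = c("character", "numeric", "numeric"),
                         check.names = FALSE)
str(exampleInput)
```

Reading the IDs as character avoids numeric formatting surprises when they are later passed through `as.character()`.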

# Load sample data.
data(inputSample)
inputSample<-as.data.frame(inputSample)
head(inputSample)

Run the KEGG pathway enrichment analysis. This step takes some time because the data are retrieved from the KEGG database through the KEGG API.

thisKEGGEnricherMatrix<-KEGGEnricherMatrixRealTime(inputSample,"mouse","fdr","fdr",0.05)
colnames(thisKEGGEnricherMatrix)
head(thisKEGGEnricherMatrix[c(1:6,11:13)])
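The "fdr" and 0.05 arguments above suggest a multiple-testing adjustment followed by a cutoff. The sketch below shows that general idea with base R's `p.adjust`. It is an assumption that Key2Enrich applies the adjustment in this way, and the pathway names and p-values are hypothetical.

```r
# Hypothetical raw p-values for five pathways (illustrative values only).
rawP <- c(pathA = 0.001, pathB = 0.010, pathC = 0.040,
          pathD = 0.200, pathE = 0.700)

# Benjamini-Hochberg adjustment; "fdr" is an alias for "BH" in stats::p.adjust.
adjP <- p.adjust(rawP, method = "fdr")

# Keep the pathways whose adjusted p-value falls below the cutoff.
significant <- adjP[adjP < 0.05]
print(round(adjP, 4))
print(names(significant))
```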

The result can then be shown as a bar plot.

Key2EnrichBarplot(thisKEGGEnricherMatrix,"KEGG",15,20)

knitr::include_graphics('../inst/extdata/figure/KEGG_barplot.png')
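For readers who want to build a similar figure by hand, here is a rough base-graphics sketch of an enrichment bar plot. It is not how `Key2EnrichBarplot` is implemented (its internals are not shown in this document); the pathway labels and adjusted p-values are hypothetical.

```r
# Hypothetical adjusted p-values per pathway (illustrative values only).
adjP <- c("Pathway A" = 0.0004, "Pathway B" = 0.003,
          "Pathway C" = 0.012, "Pathway D" = 0.041)

# Convert to -log10 scale so that smaller p-values give longer bars,
# and sort ascending so the most significant pathway is drawn on top.
scores <- sort(-log10(adjP))

# Horizontal bars with readable labels; widen the left margin for the names.
oldPar <- par(mar = c(5, 8, 2, 1))
barplot(scores, horiz = TRUE, las = 1,
        xlab = "-log10(adjusted p-value)")
par(oldPar)
```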

These steps can be done with a single function. The figure sizes have been customised so that you can easily put two images side-by-side. The genes can be mapped to the nodes of KEGG pathways by setting the parameter ifMapOnPath=TRUE.

KEGGSigMx<-Key2KEGGEnrich(inputSample=inputSample,inputSpecies="mouse",
                          adjustMethod="fdr",filterMethod="fdr",
                          filterValue=0.05,ifbarplot=TRUE,
                          ifMapOnPath=TRUE,type="KEGG",
                          imgWidth=15,imgHeight=20)
head(KEGGSigMx[c(1:6,11:13)])

knitr::include_graphics('../inst/extdata/figure/KEGG_barplot.png')
knitr::include_graphics('../inst/extdata/figure/KEGG_mapping1.png')
knitr::include_graphics('../inst/extdata/figure/KEGG_mapping2.png')

Enrichment analysis based on Reactome pathways.

reactomePValueMatrix<-Reactome2Enrich(geneIDs=as.character(inputSample$entrezgene),"mouse","fdr",0.05)
colnames(reactomePValueMatrix)
head(reactomePValueMatrix[,1:6])

knitr::include_graphics('../inst/extdata/figure/reactome_barplot.png')

Enrichment analysis based on Gene Ontology terms.

egoDF<-GO2Enrich(geneIDs=as.character(inputSample$entrezgene),"mouse","fdr",0.05,"CC")
colnames(egoDF)
egoDF[,c(1:3,5:7)]

knitr::include_graphics('../inst/extdata/figure/CC.png')

Functions can be called by other packages

The gene list can be mapped to a specific KEGG pathway. Key2Enrich can integrate with other genome-wide data analysis tools or packages.

mappingOnSpecifiedKEGGPathway("mmu04976",inputSample,"mouse")

knitr::include_graphics('../inst/extdata/figure/KEGG_mapping3.png')

Get the KEGG pathway names and pathway IDs of a specific species.

mmuInfo<-getAllPathNameAndID("mmu")
head(mmuInfo)

Get all KEGG gene IDs and Entrez gene IDs of a specific species.

mmuGeneInfo<-KEGGID2EntrezID("mmu")
head(mmuGeneInfo)

Get all KEGG pathway names of a specific species.

allMmuPath<-getTotalPathNames("mmu")
head(allMmuPath)

Calculate the p-value and adjusted p-value of a specific KEGG pathway for the input sample.

allGeneInPathwayDF<-getAllGeneInPathwayDF("mmu")
N<-getN(allGeneInPathwayDF)
n<-get_n(inputSample,allGeneInPathwayDF)

pValueMX<-getPValue("mmu00053","mmu",inputSample,N,n)
pValueMX[1:6]
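Over-representation p-values of this kind are commonly computed with a hypergeometric test. The sketch below shows that calculation in base R. It is an assumption that `getPValue` follows the same approach, and the meanings given to `N` and `n` are guesses based on their names; `M`, `k` and all counts are hypothetical.

```r
# Hypothetical counts (illustrative only, not derived from the sample data).
N <- 8000  # background genes annotated to at least one pathway (assumed meaning)
n <- 300   # input genes annotated to at least one pathway (assumed meaning)
M <- 50    # genes in the pathway of interest (hypothetical name)
k <- 8     # input genes that fall in that pathway (hypothetical name)

# P(X >= k) when drawing n genes without replacement from N genes, M of which
# belong to the pathway. phyper() returns P(X <= q), so pass q = k - 1 and
# request the upper tail.
pValue <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)
pValue
```

Using `k - 1` with `lower.tail = FALSE` matters: passing `k` directly would give P(X > k) and understate the enrichment p-value.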

Get all KEGG pathway IDs and their corresponding gene lists in KEGG gene ID format.

pathIDandKEGG_geneID<-getAllGeneInPathwayDF("mmu")
head(pathIDandKEGG_geneID)
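A pathway-to-gene table can be regrouped into one gene vector per pathway with base R, which is often handy for downstream set operations. The sketch below assumes a long-format table with one row per pathway and gene pair; the column names and rows are hypothetical, and the actual columns returned by `getAllGeneInPathwayDF` may differ.

```r
# Hypothetical long-format table: one row per (pathway, gene) pair.
pathGene <- data.frame(
  pathID = c("mmu00010", "mmu00010", "mmu00020", "mmu00020", "mmu00020"),
  geneID = c("mmu:1", "mmu:2", "mmu:2", "mmu:3", "mmu:4"),
  stringsAsFactors = FALSE
)

# Group gene IDs by pathway ID, yielding a named list of character vectors.
genesByPath <- split(pathGene$geneID, pathGene$pathID)

# Number of genes annotated to each pathway.
lengths(genesByPath)
```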

Parse a KEGG XML (KGML) file.

# Build the path to the example KGML file shipped with the package.
path<-system.file(package = "Key2Enrich")
xmlFile<-file.path(path,"extdata","hsa04012.xml")
r<-parseKGMLFile(xmlFile)
kegg.nodes <- lapply(r[childIsEntry(r)], parseEntry)
nodeDF<-data.frame(KEGG_GeneID="NA",type=NA,link=NA,graphicName=NA,
                  graphicType=NA,graphicX=NA,graphicY=NA,
                  graphicWidth=NA,graphicHeight=NA)
nodeDF<-parseList2Dataframe(kegg.nodes,nodeDF)
head(nodeDF)

Key2Enrich supports gene set identifiers stored in GSEABase classes.

data(gs)
library(GSEABase)
reactomePValueMatrix<-Reactome2Enrich(geneIds(gs),"mouse","fdr",0.05)

egoDF<-GO2Enrich(geneIds(gs),"mouse","fdr",0.05,"CC")

SessionInfo

sessionInfo()


ppdragondw/Key2Enrich documentation built on May 29, 2019, 7:39 a.m.