analyze: Gene Set Collection Analysis or NetWork Analysis

Description Usage Arguments Value Methods (by class) References Examples

Description

This is a generic function.

Usage

1
2
3
4
5
6
7
8
## S4 method for signature 'GSCA'
analyze(object, para = list(pValueCutoff = 0.05,
  pAdjustMethod = "BH", nPermutations = 1000, minGeneSetSize = 15, exponent =
  1), verbose = TRUE, doGSOA = FALSE, doGSEA = TRUE,
  GSEA.by = "HTSanalyzeR2")

## S4 method for signature 'NWA'
analyze(object, fdr = 0.001, species, verbose = TRUE)

Arguments

object

An object. When this function is implemented as the S4 method of class 'GSCA' or 'NWA', this argument is an object of class 'GSCA' or 'NWA'.

para

A list of parameters for GSEA and hypergeometric tests. This argument is only for class GSCA.

verbose

A single logical value specifying to display detailed messages (when verbose=TRUE) or not (when verbose=FALSE).

doGSOA

A single logical value specifying to perform gene set overrepresentation analysis(hypergeometric test) (when doGSOA=TRUE) or not (when doGSOA=FALSE).

doGSEA

A single logical value specifying to perform gene set enrichment analysis (when doGSEA=TRUE) or not (when doGSEA=FALSE).

GSEA.by

A single character value to choose which algorithm to do GSEA. Valid value could either be "HTSanalyzeR2"(default) or "fgsea". If performed by "fgsea", the result explanation please refer to fgsea.

fdr

A single numeric value specifying the false discovery for the scoring of nodes (see BioNet::scoreNodes and Dittrich et al., 2008 for details)

species

A single character value specifying the species for which the data should be read.

pValueCutoff

A single numeric value specifying the cutoff for adjusted p-values considered significant.

pAdjustMethod

A single character value specifying the p-value adjustment method to be used (see 'p.adjust' for details).

nPermutations

A single integer or numeric value specifying the number of permutations for deriving p-values in GSEA.

minGeneSetSize

A single integer or numeric value specifying the minimum number of elements shared by a gene set and the input total genes. Gene sets with fewer than this number are removed from both hypergeometric analysis and GSEA.

exponent

A single integer or numeric value used in weighting phenotypes in GSEA.

Value

In the end, this function will return an updated object of class GSCA or NWA. All the analyzed results could be found in slot result.

Methods (by class)

References

Aravind Subramanian, Pablo Tamayo, Vamsi K. Mootha, Sayan Mukherjee, Benjamin L. Ebert, Michael A. Gillette, Amanda Paulovich, Scott L. Pomeroy, Todd R. Golub, Eric S. Lander, and Jill P. Mesirov Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles PNAS 2005 102 (43) 15545-15550; published ahead of print September 30, 2005, doi:10.1073/pnas.0506580102

Beisser D, Klau GW, Dandekar T, Muller T, Dittrich MT. BioNet: an R-Package for the functional analysis of biological networks. Bioinformatics. 2010 Apr 15;26(8):1129-30.

Dittrich MT, Klau GW, Rosenwald A., Dandekar T and Muller T. Identifying functional modules in protein-protein interaction networks: an integrated exact approach. Bioinformatics 2008 24(13):i223-i231.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
# ====================================================
# Gene Set Collection Analysis Part
library(org.Hs.eg.db)
library(GO.db)
library(KEGGREST)
## load data for enrichment analyse
data(d7)
phenotype <- as.vector(d7$neg.lfc)
names(phenotype) <- d7$id

## select hits if you also want to do GSOA, otherwise ignore it
hits <-  names(phenotype[which(abs(phenotype) > 2)])

## set up a list of gene set collections
GO_MF <- GOGeneSets(species="Hs", ontologies=c("MF"))
PW_KEGG <- KeggGeneSets(species="Hs")
ListGSC <- list(GO_MF=GO_MF, PW_KEGG=PW_KEGG)

## create an object of class 'GSCA'
gsca <- GSCA(listOfGeneSetCollections = ListGSC, geneList = phenotype, hits = hits)

## do preprocessing
gsca1 <- preprocess(gsca, species="Hs", initialIDs="SYMBOL", keepMultipleMappings=TRUE,
                    duplicateRemoverMethod="max", orderAbsValue=FALSE)

## support parallel calculation using doParallel package
if (requireNamespace("doParallel", quietly=TRUE)) {
doParallel::registerDoParallel(cores=2)
} else {
}

## do hypergeometric tests and GSEA
gsca2 <- analyze(gsca1, para=list(pValueCutoff=0.01, pAdjustMethod ="BH",
                                  nPermutations=100, minGeneSetSize=10, exponent=1),
                                  doGSOA = TRUE, doGSEA = TRUE)

## summarize gsca2 and get results
summarize(gsca2)
head(getResult(gsca2)$GSEA.results$GO_MF)
head(getResult(gsca2)$HyperGeo.results$PW_KEGG)

# ===========================================================
## Not run: 
# Network Analysis
library(org.Hs.eg.db)
library(GO.db)
## load data for network analyse
data(d7)
pvalues <- d7$neg.p.value
names(pvalues) <- d7$id

## input phenotypes if you want to color nodes by it
phenotypes <- as.vector(d7$neg.lfc)
names(phenotypes) <- d7$id

## create an object of class 'NWA' with phenotypes
nwa <- NWA(pvalues=pvalues, phenotypes=phenotypes)

## do preprocessing
nwa1 <- preprocess(nwa, species="Hs", initialIDs="SYMBOL", keepMultipleMappings=TRUE,
                   duplicateRemoverMethod="max")

## create an interactome for nwa1 by downloading from BioGRID database
nwa2 <- interactome(nwa1, species="Hs", reportDir="HTSanalyzerReport", genetic=FALSE)

## analyze
nwa3 <- analyze(nwa2, fdr=0.0001, species="Hs")
## summarize nwa3
summarize(nwa3)

## End(Not run)

CityUHK-CompBio/HTSanalyzeR2 documentation built on Aug. 28, 2018, 1:19 a.m.