xPierGSEA: Function to prioritise pathways based on GSEA analysis of...

Description Usage Arguments Value Note See Also Examples

View source: R/xPierGSEA.r

Description

xPierGSEA is supposed to prioritise pathways given prioritised genes and the ontology in query. It is done via gene set enrichment analysis (GSEA). It returns an object of class "eGSEA".

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
xPierGSEA(
pNode,
priority.top = NULL,
ontology = c("GOBP", "GOMF", "GOCC", "PS", "PS2", "SF", "Pfam", "DO",
"HPPA", "HPMI",
"HPCM", "HPMA", "MP", "EF", "MsigdbH", "MsigdbC1", "MsigdbC2CGP",
"MsigdbC2CPall",
"MsigdbC2CP", "MsigdbC2KEGG", "MsigdbC2REACTOME", "MsigdbC2BIOCARTA",
"MsigdbC3TFT",
"MsigdbC3MIR", "MsigdbC4CGN", "MsigdbC4CM", "MsigdbC5BP", "MsigdbC5MF",
"MsigdbC5CC",
"MsigdbC6", "MsigdbC7", "DGIdb", "GTExV4", "GTExV6p", "GTExV7",
"CreedsDisease",
"CreedsDiseaseUP", "CreedsDiseaseDN", "CreedsDrug", "CreedsDrugUP",
"CreedsDrugDN",
"CreedsGene", "CreedsGeneUP", "CreedsGeneDN", "KEGG", "KEGGmetabolism",
"KEGGgenetic", "KEGGenvironmental", "KEGGcellular", "KEGGorganismal",
"KEGGdisease"),
customised.genesets = NULL,
size.range = c(10, 500),
p.adjust.method = c("BH", "BY", "bonferroni", "holm", "hochberg",
"hommel"),
path.mode = c("all_paths", "shortest_paths", "all_shortest_paths"),
weight = 1,
seed = 825,
nperm = 2000,
fast = TRUE,
verbose = TRUE,
silent = FALSE,
RData.location = "http://galahad.well.ox.ac.uk/bigdata",
guid = NULL
)

Arguments

pNode

an object of class "pNode" (or "sTarget" or "dTarget"). Alternatively, it can be a data frame with two columns ('priority' and 'rank')

priority.top

the number of the top targets used for GSEA. By default, it is NULL meaning all targets are used

ontology

the ontology supported currently. It can be "GOBP" for Gene Ontology Biological Process, "GOMF" for Gene Ontology Molecular Function, "GOCC" for Gene Ontology Cellular Component, "PS" for phylostratific age information, "PS2" for the collapsed PS version (inferred ancestors being collapsed into one with the known taxonomy information), "SF" for SCOP domain superfamilies, "Pfam" for Pfam domain families, "DO" for Disease Ontology, "HPPA" for Human Phenotype Phenotypic Abnormality, "HPMI" for Human Phenotype Mode of Inheritance, "HPCM" for Human Phenotype Clinical Modifier, "HPMA" for Human Phenotype Mortality Aging, "MP" for Mammalian Phenotype, "EF" for Experimental Factor Ontology (used to annotate GWAS Catalog genes), Drug-Gene Interaction database ("DGIdb") for drugable categories, tissue-specific eQTL-containing genes from GTEx ("GTExV4", "GTExV6p" and "GTExV7"), crowd extracted expression of differential signatures from CREEDS ("CreedsDisease", "CreedsDiseaseUP", "CreedsDiseaseDN", "CreedsDrug", "CreedsDrugUP", "CreedsDrugDN", "CreedsGene", "CreedsGeneUP" and "CreedsGeneDN"), KEGG pathways (including 'KEGG' for all, 'KEGGmetabolism' for 'Metabolism' pathways, 'KEGGgenetic' for 'Genetic Information Processing' pathways, 'KEGGenvironmental' for 'Environmental Information Processing' pathways, 'KEGGcellular' for 'Cellular Processes' pathways, 'KEGGorganismal' for 'Organismal Systems' pathways, and 'KEGGdisease' for 'Human Diseases' pathways), and the molecular signatures database (Msigdb, including "MsigdbH", "MsigdbC1", "MsigdbC2CGP", "MsigdbC2CPall", "MsigdbC2CP", "MsigdbC2KEGG", "MsigdbC2REACTOME", "MsigdbC2BIOCARTA", "MsigdbC3TFT", "MsigdbC3MIR", "MsigdbC4CGN", "MsigdbC4CM", "MsigdbC5BP", "MsigdbC5MF", "MsigdbC5CC", "MsigdbC6", "MsigdbC7")

customised.genesets

a list each containing gene symbols. By default, it is NULL. If the list provided, it will overtake the previous parameter "ontology"

size.range

the minimum and maximum size of members of each term in consideration. By default, it sets to a minimum of 10 but no more than 500

p.adjust.method

the method used to adjust p-values. It can be one of "BH", "BY", "bonferroni", "holm", "hochberg" and "hommel". The first two methods "BH" (widely used) and "BY" control the false discovery rate (FDR: the expected proportion of false discoveries amongst the rejected hypotheses); the last four methods "bonferroni", "holm", "hochberg" and "hommel" are designed to give strong control of the family-wise error rate (FWER). Notes: FDR is a less stringent condition than FWER

path.mode

the mode of paths induced by vertices/nodes with input annotation data. It can be "all_paths" for all possible paths to the root, "shortest_paths" for only one path to the root (for each node in query), "all_shortest_paths" for all shortest paths to the root (i.e. for each node, find all shortest paths with the equal lengths)

weight

an integer specifying score weight. It can be "0" for unweighted (an equivalent to Kolmogorov-Smirnov, only considering the rank), "1" for weighted by input gene score (by default), and "2" for over-weighted, and so on

seed

an integer specifying the seed

nperm

the number of random permutations. For each permutation, gene-score associations will be permutated so that permutation of gene-term associations is realised

fast

logical to indicate whether to fast calculate GSEA resulting. By default, it sets to true, but not necessarily does so. It will depend on whether the package "fgsea" has been installed

verbose

logical to indicate whether the messages will be displayed in the screen. By default, it sets to true

silent

logical to indicate whether the messages will be silent completely. By default, it sets to false. If true, verbose will be forced to be false

RData.location

the characters to tell the location of built-in RData files. See xRDataLoader for details

guid

a valid (5-character) Global Unique IDentifier for an OSF project. See xRDataLoader for details

Value

an object of class "eGSEA", a list with following components:

Note

none

See Also

xSymbol2GeneID, xRDataLoader, xDAGanno

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
RData.location <- "http://galahad.well.ox.ac.uk/bigdata"
## Not run: 
# a) provide the seed nodes/genes with the weight info
## load ImmunoBase
ImmunoBase <- xRDataLoader(RData.customised='ImmunoBase',
RData.location=RData.location)
## get genes within 500kb away from AS GWAS lead SNPs
seeds.genes <- ImmunoBase$AS$genes_variants
## seeds weighted according to distance away from lead SNPs
data <- 1- seeds.genes/500000

# b) perform priority analysis
pNode <- xPierGenes(data=data, network="PCommonsDN_medium",restart=0.7,
RData.location=RData.location)

# c) do pathway-level priority using GSEA
eGSEA <- xPierGSEA(pNode=pNode, ontology="DGIdb", nperm=2000,
RData.location=RData.location)
bp <- xGSEAbarplot(eGSEA, top_num="auto", displayBy="nes")
gp <- xGSEAdotplot(eGSEA, top=1)

## End(Not run)

Pi documentation built on Nov. 29, 2021, 3 p.m.