SLAPE.core_components: Identification of core-component genes shared by multiple...


View source: R/SLAPenrich.R

Description

This function identifies groups of genes that are recurrently altered in the analysed dataset and shared by multiple SLAPenriched pathways, and are therefore putatively driving the enrichment scores. It also generates pdf files containing pathway-membership heatmaps, which show the pathways each gene in a core-component belongs to, together with barplots of the alteration frequencies of all the genes in the core-components. Results are also stored in individual R objects.

Usage

SLAPE.core_components(PFP, EM, PATH = "./", fdrth = Inf, exclcovth = 0,
                      PATH_COLLECTION)

Arguments

PFP

A list containing the SLAPenrich analysis results returned by the SLAPE.analyse function when analysing the genomic dataset summarised by the EM genomic event matrix.

EM

A sparse binary matrix, or a sparse matrix with integer non-null entries. The column names of this matrix are sample identifiers and the row names are official HUGO gene symbols. A non-zero entry in position i,j indicates that the j-th sample harbors a somatic mutation in the i-th gene. This must be the same matrix that was passed to the SLAPE.analyse function to produce the results in the PFP list.

PATH

String specifying the path of the directory where the pdf files should be created.

fdrth

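The false discovery rate threshold (as a percentage) used to select SLAPenriched pathways from the PFP list: pathways with an FDR below this value are selected. Per the usage signature the default is Inf, so no FDR filtering is applied unless a threshold is given (for instance, fdrth = 5 selects pathways with an FDR < 5%).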

exclcovth

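The mutual exclusivity coverage threshold (as a percentage) used to select SLAPenriched pathways from the PFP list: pathways with an exclusive coverage above this value are selected. Per the usage signature the default is 0 (for instance, exclcovth = 50 selects pathways with an exclusive coverage > 50%).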

PATH_COLLECTION

The pathway collection that has been tested against the EM dataset with the SLAPE.analyse function to produce the PFP list of results.
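As a rough illustration of the argument descriptions above, the following base R sketch builds a toy event matrix in the layout described for EM and applies the two percentage thresholds to per-pathway statistics. The gene, sample and pathway names, and the vectors fdr_percent and excl_cov_pct, are hypothetical and are not fields of the actual PFP list; a dense matrix is used here only for brevity.

```r
# Toy genomic event matrix: rows are genes, columns are samples.
# All gene, sample and pathway names below are made up for illustration.
EM <- matrix(c(1, 0, 1, 0,
               0, 0, 1, 0,
               1, 1, 0, 1),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("GENE_A", "GENE_B", "GENE_C"),
                             c("S1", "S2", "S3", "S4")))

# Percentage of samples in which each gene has a non-zero entry.
alt_freq <- 100 * rowSums(EM > 0) / ncol(EM)

# Hypothetical per-pathway statistics, both expressed as percentages.
fdr_percent  <- c(P1 = 1.2, P2 = 7.5, P3 = 4.0)
excl_cov_pct <- c(P1 = 80,  P2 = 90,  P3 = 30)

# Keep pathways with FDR below fdrth and exclusive coverage above exclcovth.
fdrth <- 5
exclcovth <- 50
selected <- names(fdr_percent)[fdr_percent < fdrth & excl_cov_pct > exclcovth]
```

With these toy values only P1 passes both filters. Setting fdrth to Inf and exclcovth to 0 would instead keep every pathway with a non-zero exclusive coverage.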

Details

To identify shared core-components across significantly enriched pathways, the set of enriched pathways and their constituent genes is modeled as a bipartite network, in which the first set of nodes contains one element per enriched pathway and the second set contains one element for each gene included in at least one of the enriched pathways.

In this network, a pathway node is connected with an edge to each of its composing gene nodes.
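The construction described above can be sketched with the igraph package as follows. This is a minimal illustration under stated assumptions, not the package's internal code: the pathways list and all gene and pathway names are hypothetical.

```r
library(igraph)

# Hypothetical enriched pathways and their member genes.
pathways <- list(P1 = c("GENE_A", "GENE_B"),
                 P2 = c("GENE_B", "GENE_C"),
                 P3 = c("GENE_D"))

# One edge per (pathway, gene) membership.
edges <- data.frame(pathway = rep(names(pathways), lengths(pathways)),
                    gene    = unlist(pathways, use.names = FALSE))

# Undirected graph whose vertices are the pathways and the genes.
g <- graph_from_data_frame(edges, directed = FALSE)

# Mark the two node sets: FALSE for pathway nodes, TRUE for gene nodes.
V(g)$type <- V(g)$name %in% edges$gene
```

Each pathway node is linked only to gene nodes, so a gene shared by several pathways (GENE_B here) ends up with a degree greater than one.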

The resulting bipartite network is then mined for communities, i.e. groups of densely interconnected nodes, using a fast community detection algorithm based on a greedy strategy (Newman, 2004).
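A minimal sketch of this step on a toy pathway-gene network is given below. It calls cluster_fast_greedy, the name under which recent igraph releases expose the fast greedy algorithm; whether the package calls it in exactly this way is an assumption, and all node names are hypothetical.

```r
library(igraph)

# Toy pathway-gene memberships forming two separate groups.
edges <- data.frame(
  pathway = c("P1", "P1", "P2", "P2", "P3", "P3", "P4", "P4"),
  gene    = c("GENE_A", "GENE_B", "GENE_A", "GENE_B",
              "GENE_C", "GENE_D", "GENE_C", "GENE_D"))
g <- graph_from_data_frame(edges, directed = FALSE)

# Greedy modularity optimisation (Newman, 2004).
comm <- cluster_fast_greedy(g)

# Community label of every node, named by node.
memb <- setNames(comm$membership, V(g)$name)

# Nodes grouped by community; each group mixes pathways and genes.
groups <- split(names(memb), memb)
```

In this toy network P1 and P2 share GENE_A and GENE_B, while P3 and P4 share GENE_C and GENE_D, so each community collects a set of pathways together with the genes they have in common.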

The resulting communities are finally saved into pdf files containing heatmaps in which nodes in the first set (pathways) are on the columns, nodes in the second set (genes) are on the rows, and a non-empty cell in position i,j indicates that the i-th gene belongs to the j-th pathway.
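The heatmap layout described above corresponds to a binary gene-by-pathway membership matrix. The base R sketch below builds such a matrix for one hypothetical community and draws it into a temporary pdf file; the names, colours and output file name are illustrative assumptions, not what the package produces.

```r
# Hypothetical community: pathways and their member genes (names are made up).
pathways <- list(P1 = c("GENE_A", "GENE_B"),
                 P2 = c("GENE_B", "GENE_C"))
genes <- sort(unique(unlist(pathways)))

# Binary matrix: entry [i, j] is 1 when gene i belongs to pathway j.
M <- sapply(pathways, function(p) as.integer(genes %in% p))
rownames(M) <- genes

# Draw the matrix as an unclustered, unscaled heatmap in a temporary pdf.
out_file <- file.path(tempdir(), "toy_core_component.pdf")
pdf(out_file)
heatmap(M, Rowv = NA, Colv = NA, scale = "none",
        col = c("white", "darkblue"))
invisible(dev.off())
```

Genes with several filled cells in a row (GENE_B here) are the ones shared across pathways in the community.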

Note

This function makes use of the fastgreedy.community function of the igraph R package (Csardi and Nepusz, 2006).

Author(s)

Francesco Iorio - iorio@ebi.ac.uk

References

Newman MEJ. Fast algorithm for detecting community structure in networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2004;69:066133.

Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal, Complex Systems. 2006;1695:38.

See Also

SLAPE.analyse

Examples

#Loading the Genomic Event data object derived from variants annotations
#identified in 188 Lung Adenocarcinoma patients (Ding et al, 2008)
data(LUAD_CaseStudy_ugs)

#Loading KEGG pathway gene-set collection data object
data(SLAPE.MSigDB_KEGG_hugoUpdated)

#Loading genome-wide total exonic block lengths
data(SLAPE.all_genes_exonic_content_block_lengths_ensemble)

#Running SLAPenrich analysis
PFPw<-SLAPE.analyse(EM = LUAD_CaseStudy_ugs,PATH_COLLECTION = KEGG_PATH,
                     show_progress = TRUE,
                     NSAMPLES = 1,
                     NGENES = 1,
                     accExLength = TRUE,
                     BACKGROUNDpopulation = rownames(LUAD_CaseStudy_ugs),
                     path_probability = 'Bernoulli',
                     GeneLenghts = GECOBLenghts)

#Generating pdf files containing heatmaps of the core-components
#of SLAPenriched pathways with an FDR < 5% and exclusive coverage > 50%.
#The pdf files are saved in the current working directory.
SLAPE.core_components(PFP=PFPw,
                      EM=LUAD_CaseStudy_ugs,
                      PATH='./LUAD_coreComponents_',
                      fdrth = 5,
                      exclcovth = 50,
                      PATH_COLLECTION = KEGG_PATH)

saezlab/SLAPenrich documentation built on May 29, 2019, 12:57 p.m.