Plot the Phylostratum or Divergence Stratum Enrichment of a given Gene Set

Description

This function computes and visualizes the significance of enriched (over or underrepresented) Phylostrata or Divergence Strata within an input test.set.

Usage

PlotEnrichment(ExpressionSet, test.set, use.only.map = FALSE,
  measure = "log-foldchange", complete.bg = TRUE, legendName = "",
  over.col = "steelblue", under.col = "midnightblue", epsilon = 1e-05,
  cex.legend = 1, cex.asterisk = 1, plot.bars = TRUE,
  p.adjust.method = NULL, ...)

Arguments

ExpressionSet

a standard PhyloExpressionSet or DivergenceExpressionSet object (in case use.only.map = FALSE).

test.set

a character vector storing the gene ids for which PS/DS enrichment analyses should be performed.

use.only.map

a logical value indicating whether, instead of a standard ExpressionSet, only a Phylostratigraphic Map or Divergence Map is passed to this function.

measure

a character string specifying the measure that should be used to quantify over and under representation of PS/DS. Measures can either be measure = "foldchange" (odds) or measure = "log-foldchange" (log-odds).

complete.bg

a logical value indicating whether the entire background set of the input ExpressionSet should be considered when performing Fisher's exact test (complete.bg = TRUE) or whether genes that are stored in test.set should be excluded from the background set before performing Fisher's exact test (complete.bg = FALSE).

legendName

a character string specifying the legend label, i.e. whether the strata shown are Phylostrata ("PS") or Divergence Strata ("DS").

over.col

color of the overrepresentation bars.

under.col

color of the underrepresentation bars.

epsilon

a small value by which values are shifted to avoid log(0) computations.

cex.legend

the cex value for the legend.

cex.asterisk

the cex value for the asterisk.

plot.bars

a logical value specifying whether or not bars should be visualized or whether only p.values and enrichment.matrix should be returned.

p.adjust.method

correction method to adjust p-values for multiple comparisons (see p.adjust for possible methods). E.g., p.adjust.method = "BH" (Benjamini & Hochberg (1995)) or p.adjust.method = "bonferroni" (Bonferroni correction).

...

default graphics parameters.

Details

This Phylostratum or Divergence Stratum enrichment analysis is motivated by Sestak and Domazet-Loso (2015) who perform Phylostratum or Divergence Stratum enrichment analyses to correlate organ evolution with the origin of organ specific genes.

In detail, this function takes the Phylostratum or Divergence Stratum distribution of all genes stored in the input ExpressionSet as the background set and the Phylostratum or Divergence Stratum distribution of the test.set, and performs a fisher.test for each Phylostratum or Divergence Stratum to quantify the statistical significance of over- or underrepresented Phylostrata or Divergence Strata within the test.set genes.
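The per-stratum test can be sketched in base R as follows. This is a minimal illustration with made-up counts, assuming one 2x2 contingency table per stratum (genes inside versus outside the stratum, test set versus background); the exact table that PlotEnrichment builds internally may differ.

```r
# Hypothetical gene counts per stratum (not taken from any real data set)
background_counts <- c(PS1 = 100, PS2 = 200, PS3 = 300)
test_counts       <- c(PS1 = 10,  PS2 = 30,  PS3 = 20)

# One Fisher's exact test per stratum on an assumed 2x2 table:
# rows = inside / outside the stratum, columns = test set / background
p_values <- sapply(names(background_counts), function(ps) {
  contingency <- matrix(
    c(test_counts[[ps]],                    background_counts[[ps]],
      sum(test_counts) - test_counts[[ps]], sum(background_counts) - background_counts[[ps]]),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("in_stratum", "not_in_stratum"), c("test", "background"))
  )
  fisher.test(contingency)$p.value
})

print(p_values)
```

Each entry of p_values corresponds to one stratum, so the vector has as many elements as there are Phylostrata or Divergence Strata.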

To visualize the odds or log-odds of over or underrepresented genes within the test.set the following procedure is performed:

  • N_ij denotes the number of genes in group j and deriving from PS i, with i = 1, .. , n and where j = 1 denotes the background set and j = 2 denotes the test.set

  • N_i. denotes the total number of genes within PS i

  • N_.j denotes the total number of genes within group j

  • N_.. is the total number of genes within all groups j and all PS i

  • f_ij = N_ij / N_.. and g_ij = f_ij / f_.j denote relative frequencies between groups

  • f_i. denotes the between group sum of f_ij

The result is the fold-change value (odds) denoted as C = g_i2 / f_i. which is visualized above and below zero.
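The definitions above can be traced with a small worked example in base R. The counts are hypothetical, and both the base-2 logarithm and the way epsilon is added are assumptions made for illustration only; they are not confirmed by this page.

```r
# Hypothetical count matrix N_ij: rows = strata i, columns = groups j
N <- matrix(c(100, 200, 300,   # j = 1: background set
              10,  30,  20),   # j = 2: test.set
            ncol = 2,
            dimnames = list(c("PS1", "PS2", "PS3"), c("background", "test")))

f   <- N / sum(N)                       # f_ij = N_ij / N_..
g   <- sweep(f, 2, colSums(f), "/")     # g_ij = f_ij / f_.j
f_i <- rowSums(f)                       # f_i. = sum over groups j of f_ij

fold_change <- g[, "test"] / f_i        # C = g_i2 / f_i.

# Assumption: base-2 logarithm, with epsilon added to guard against log(0)
epsilon <- 1e-05
log_fold_change <- log2(fold_change + epsilon)

print(round(fold_change, 3))
```

In this toy example PS2 yields C > 1 (overrepresented in the test set), PS3 yields C < 1 (underrepresented), and PS1 yields C = 1.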

If the input ExpressionSet includes a large number of Phylostrata or Divergence Strata, the p-values returned by PlotEnrichment should be adjusted for multiple comparisons, which can be done by specifying the p.adjust.method argument.
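As a rough illustration of what such an adjustment does, the sketch below applies base R's p.adjust to hypothetical per-stratum p-values. It only demonstrates the two methods named in the Arguments section and is not taken from the PlotEnrichment source.

```r
# Hypothetical raw p-values, one per stratum
p_raw <- c(PS1 = 0.01, PS2 = 0.02, PS3 = 0.03)

# Benjamini & Hochberg (1995) and Bonferroni corrections
p_bh   <- p.adjust(p_raw, method = "BH")
p_bonf <- p.adjust(p_raw, method = "bonferroni")

print(rbind(raw = p_raw, BH = p_bh, bonferroni = p_bonf))
```

The Bonferroni correction multiplies each p-value by the number of tests, whereas BH controls the false discovery rate and is typically less conservative.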

Author(s)

Hajk-Georg Drost

References

Sestak and Domazet-Loso (2015). Phylostratigraphic Profiles in Zebrafish Uncover Chordate Origins of the Vertebrate Brain. Mol. Biol. Evol. 32(2): 299-312.

See Also

EnrichmentTest, fisher.test

Examples

data(PhyloExpressionSetExample)

set.seed(123)
# draw 10,000 random gene ids (column 2 of a PhyloExpressionSet stores the gene ids)
test_set <- sample(PhyloExpressionSetExample[ , 2], 10000)

## Examples with complete.bg = TRUE
## Hence: the entire background set of the input ExpressionSet is considered 
## when performing Fisher's exact test 

# measure: log-foldchange
PlotEnrichment(ExpressionSet = PhyloExpressionSetExample,
               test.set      = test_set , 
               legendName    = "PS", 
               measure       = "log-foldchange")
               
               
# measure: foldchange
PlotEnrichment(ExpressionSet = PhyloExpressionSetExample,
               test.set      = test_set , 
               legendName    = "PS", 
               measure       = "foldchange")
   
               
## Examples with complete.bg = FALSE
## Hence: the test.set genes are excluded from the background set before
## Fisher's exact test is performed
     
                                       
# measure: log-foldchange
PlotEnrichment(ExpressionSet = PhyloExpressionSetExample,
               test.set      = test_set ,
               complete.bg   = FALSE,
               legendName    = "PS", 
               measure       = "log-foldchange")
               
               
# measure: foldchange
PlotEnrichment(ExpressionSet = PhyloExpressionSetExample,
               test.set      = test_set , 
               complete.bg   = FALSE,
               legendName    = "PS", 
               measure       = "foldchange")     
               
