# Perform the Reductive Early Conservation Test

### Description

The *Reductive Early Conservation Test* aims to statistically evaluate the
existence of a monotonically increasing phylotranscriptomic pattern based on `TAI` or `TDI` computations.
The corresponding p-value quantifies the probability that a given TAI or TDI pattern (or any phylotranscriptomic pattern)
does not follow an early-conservation-like pattern. A p-value < 0.05 indicates that the corresponding phylotranscriptomic pattern does
indeed follow an early conservation (low-high-high) shape.

### Usage


### Arguments

| Argument | Description |
|---|---|
| `ExpressionSet` | a standard PhyloExpressionSet or DivergenceExpressionSet object. |
| `modules` | a list storing three elements: early, mid, and late. Each element expects a numeric vector specifying the developmental stages or experiments that correspond to each module. For example, |
| `permutations` | a numeric value specifying the number of permutations to be performed for the |
| `lillie.test` | a boolean value specifying whether the Lilliefors Kolmogorov-Smirnov Test shall be performed to quantify the goodness of fit. |
| `plotHistogram` | a boolean value specifying whether a |
| `runs` | specify the number of runs to be performed for goodness of fit computations, in case |
| `parallel` | performing |
| `gof.warning` | a logical value indicating whether non significant goodness of fit results should be printed as warning. Default is |
| `custom.perm.matrix` | a custom |
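
The `modules` argument can be illustrated with a small base R sketch. The partition below is the one used in the Examples section; the stage count `n_stages` and the two sanity checks are assumptions made for illustration and are not part of the package API.

```r
# Partition of seven developmental stages into early, mid, and late modules
# (same partition as in the Examples section)
modules <- list(early = 1:2, mid = 3:5, late = 6:7)

# Assumed number of stage columns in the expression set (illustrative)
n_stages <- 7

# Flatten the module indices into one integer vector
stage_idx <- unlist(modules, use.names = FALSE)

# Hypothetical sanity checks: the list carries the three expected names,
# and every stage is assigned to exactly one module
all_named  <- identical(names(modules), c("early", "mid", "late"))
covers_all <- identical(sort(stage_idx), seq_len(n_stages))
```

Running such checks before calling the test makes it easier to spot a stage that was left out or assigned twice.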

### Details

The *reductive early conservation test* is a permutation test based on the following test statistic.

(1) A set of developmental stages is partitioned into three modules - early, mid, and late - based on prior biological knowledge.

(2) The mean `TAI` or `TDI` value of each of the three modules (T_early, T_mid, and T_late) is computed.

(3) The two differences D1 = T_mid - T_early and D2 = T_late - T_early are calculated.

(4) The minimum D_min of D1 and D2 is computed as the final test statistic of the reductive early conservation test.
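
Steps (2) to (4) can be sketched in base R as follows. This is a minimal illustration and not the package's internal code: the helper `ec_statistic()` and the profile values are hypothetical, and the module partition is taken from the Examples section.

```r
# Hypothetical TAI profile over seven stages (illustrative values, not real data)
tai_profile <- c(3.2, 3.3, 3.6, 3.7, 3.65, 3.7, 3.8)

# Module partition as in the Examples section
modules <- list(early = 1:2, mid = 3:5, late = 6:7)

# Hypothetical helper implementing steps (2) to (4)
ec_statistic <- function(profile, modules) {
  # (2) mean value of each module
  t_early <- mean(profile[modules$early])
  t_mid   <- mean(profile[modules$mid])
  t_late  <- mean(profile[modules$late])
  # (3) differences relative to the early module
  d1 <- t_mid - t_early
  d2 <- t_late - t_early
  # (4) the smaller of the two differences is the test statistic
  min(d1, d2)
}

d_min <- ec_statistic(tai_profile, modules)
```

A clearly positive D_min means that both the mid and the late module lie above the early module, which is the low-high-high shape the test looks for.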

In order to determine the statistical significance of an observed minimum difference D_min,
the following permutation test is performed. Based on the permutation matrix returned by `bootMatrix`,
D_min is calculated from each of the permuted `TAI` or `TDI` profiles. The resulting values are
approximated by a Gaussian distribution whose parameters are estimated by the method of moments using `fitdist`,
and the corresponding p-value is computed by `pnorm` given the estimated parameters of the Gaussian distribution.
The *goodness of fit* for the random vector *D_min* is statistically quantified by a Lilliefors (Kolmogorov-Smirnov) test
for normality.
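
The permutation procedure can be sketched in base R under a few stated assumptions. The permuted profiles are simulated here as random noise instead of being generated by `bootMatrix`, the observed statistic is a made-up value, `ec_statistic()` is a hypothetical helper, and the moment estimates are computed by hand rather than through `fitdist`. Reading the p-value from the upper tail is also an assumption of this sketch, chosen so that a large observed D_min yields a small p-value.

```r
set.seed(1)  # reproducible illustration

modules <- list(early = 1:2, mid = 3:5, late = 6:7)

# Hypothetical helper: D_min for a single profile
ec_statistic <- function(profile, modules) {
  t_early <- mean(profile[modules$early])
  min(mean(profile[modules$mid]) - t_early,
      mean(profile[modules$late]) - t_early)
}

# Stand-in for the permutation matrix: each row is one permuted profile.
# Simulated values are an assumption; in practice the rows come from bootMatrix().
perm_matrix <- matrix(rnorm(1000 * 7, mean = 3.5, sd = 0.05),
                      nrow = 1000, ncol = 7)

# D_min for every permuted profile (the null distribution)
d_min_null <- apply(perm_matrix, 1, ec_statistic, modules = modules)

# Method-of-moments estimates for a Gaussian: first moment and
# square root of the second central moment
mu    <- mean(d_min_null)
sigma <- sqrt(mean((d_min_null - mu)^2))

# Made-up observed statistic (illustrative)
d_min_obs <- 0.4

# Upper-tail probability under the fitted Gaussian
p_value <- pnorm(d_min_obs, mean = mu, sd = sigma, lower.tail = FALSE)
```

Because the p-value comes from a fitted Gaussian rather than from the empirical null distribution, checking the normality of `d_min_null` matters, which is what the Lilliefors test is used for.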

In case the parameter *plotHistogram = TRUE*, a multi-plot is generated showing:

(1) A Cullen and Frey skewness-kurtosis plot generated by `descdist`.
This plot illustrates which distributions seem plausible to fit the resulting permutation vector D_min.
In the case of the *reductive early conservation test* a normal distribution seemed plausible.

(2) A histogram of D_min combined with the density plot. D_min is then fitted by a normal distribution.
The corresponding parameters are estimated by *moment matching estimation* using the `fitdist` function.

(3) A plot showing the p-values for N independent runs to check whether a specific p-value is biased by a specific permutation order.

(4) A barplot showing the number of cases in which the underlying goodness of fit (returned by the Lilliefors (Kolmogorov-Smirnov) test
for normality) was significant (`TRUE`) or not significant (`FALSE`).
This allows quantifying the permutation bias and its implications for the goodness of fit.

### Value

a list object containing the list elements:

`p.value`: the p-value quantifying the statistical significance (low-high-high pattern) of the given phylotranscriptomic pattern.

`std.dev`: the standard deviation of the N sampled phylotranscriptomic patterns for each developmental stage S.

`lillie.test`: a boolean value specifying whether the *Lilliefors KS-Test* returned a p-value > 0.05,
which indicates that fitting the permuted scores with a normal distribution seems plausible.

### Author(s)

Hajk-Georg Drost

### References

Drost HG et al. (2015) *Evidence for Active Maintenance of Phylotranscriptomic Hourglass Patterns in Animal and Plant Embryogenesis*. Mol Biol Evol. 32 (5): 1221-1231 doi:10.1093/molbev/msv012.

Quint M et al. (2012) *A transcriptomic hourglass in plant embryogenesis*. Nature 490: 98-101.

Piasecka B, Lichocki P, Moretti S, et al. (2013) *The hourglass and the early conservation models: co-existing patterns of developmental constraints in vertebrates*. PLoS Genet. 9 (4): e1003476.

### See Also

`ecScore`, `bootMatrix`, `FlatLineTest`, `ReductiveHourglassTest`, `PlotPattern`

### Examples

```
library(myTAI)  # package providing EarlyConservationTest() and bootMatrix()
data(PhyloExpressionSetExample)

# perform the early conservation test for a PhyloExpressionSet
# here the prior biological knowledge is that stages 1-2 correspond to module 1 = early,
# stages 3-5 to module 2 = mid (phylotypic module), and stages 6-7 correspond to
# module 3 = late
EarlyConservationTest(PhyloExpressionSetExample,
                      modules = list(early = 1:2, mid = 3:5, late = 6:7),
                      permutations = 1000)

# use your own permutation matrix based on which p-values (EarlyConservationTest)
# shall be computed
custom_perm_matrix <- bootMatrix(PhyloExpressionSetExample, 100)
EarlyConservationTest(PhyloExpressionSetExample,
                      modules = list(early = 1:2, mid = 3:5, late = 6:7),
                      custom.perm.matrix = custom_perm_matrix)
```