EarlyConservationTest: Perform Reductive Early Conservation Test
In myTAI: Evolutionary Transcriptomics

Description Usage Arguments Details Value Author(s) References See Also Examples

The Reductive Early Conservation Test aims to statistically evaluate the existence of a monotonically increasing phylotranscriptomic pattern based on TAI or TDI computations. The corresponding p-value quantifies the probability that a given TAI or TDI pattern (or any phylotranscriptomics pattern) does not follow an early conservation like pattern. A p-value < 0.05 indicates that the corresponding phylotranscriptomics pattern does indeed follow an early conservation (low-high-high) shape.

EarlyConservationTest(
  ExpressionSet,
  modules = NULL,
  permutations = 1000,
  lillie.test = FALSE,
  plotHistogram = FALSE,
  runs = 10,
  parallel = FALSE,
  gof.warning = FALSE,
  custom.perm.matrix = NULL
)

`ExpressionSet`	a standard PhyloExpressionSet or DivergenceExpressionSet object.
`modules`	a list storing three elements: early, mid, and late. Each element expects a numeric vector specifying the developmental stages or experiments that correspond to each module. For example, `module` = list(early = 1:2, mid = 3:5, late = 6:7) devides a dataset storing seven developmental stages into 3 modules.
`permutations`	a numeric value specifying the number of permutations to be performed for the `ReductiveHourglassTest`.
`lillie.test`	a boolean value specifying whether the Lilliefors Kolmogorov-Smirnov Test shall be performed to quantify the goodness of fit.
`plotHistogram`	a boolean value specifying whether a Lillifor's Kolmogorov-Smirnov-Test shall be performed to test the goodness of fit of the approximated distribution, as well as additional plots quantifying the significance of the observed phylotranscriptomic pattern.
`runs`	specify the number of runs to be performed for goodness of fit computations, in case `plotHistogram` = `TRUE`. In most cases `runs` = 100 is a reasonable choice. Default is `runs` = 10 (because it takes less computation time for demonstration purposes).
`parallel`	performing `runs` in parallel (takes all cores of your multicore machine).
`gof.warning`	a logical value indicating whether non significant goodness of fit results should be printed as warning. Default is `gof.warning = FALSE`.
`custom.perm.matrix`	a custom `bootMatrix` (permutation matrix) to perform the underlying test statistic. Default is `custom.perm.matrix = NULL`.

The reductive early conservation test is a permutation test based on the following test statistic.

(1) A set of developmental stages is partitioned into three modules - early, mid, and late - based on prior biological knowledge.

(2) The mean TAI or TDI value for each of the three modules T_early, T_mid, and T_late are computed.

(3) The two differences D1 = T_mid - T_early and D2 = T_late - T_early are calculated.

(4) The minimum D_min of D1 and D2 is computed as final test statistic of the reductive hourglass test.

In order to determine the statistical significance of an observed minimum difference D_min the following permutation test was performed. Based on the bootMatrix D_min is calculated from each of the permuted TAI or TDI profiles, approximated by a Gaussian distribution with method of moments estimated parameters returned by fitdist, and the corresponding p-value is computed by pnorm given the estimated parameters of the Gaussian distribution. The goodness of fit for the random vector D_min is statistically quantified by an Lilliefors (Kolmogorov-Smirnov) test for normality.

In case the parameter plotHistogram = TRUE, a multi-plot is generated showing:

(1) A Cullen and Frey skewness-kurtosis plot generated by descdist. This plot illustrates which distributions seem plausible to fit the resulting permutation vector D_min. In the case of the reductive early conservation test a normal distribution seemed plausible.

(2) A histogram of D_min combined with the density plot is plotted. D_min is then fitted by a normal distribution. The corresponding parameters are estimated by moment matching estimation using the fitdist function.

(3) A plot showing the p-values for N independent runs to verify that a specific p-value is biased by a specific permutation order.

(4) A barplot showing the number of cases in which the underlying goodness of fit (returned by Lilliefors (Kolmogorov-Smirnov) test for normality) has shown to be significant (TRUE) or not significant (FALSE). This allows to quantify the permutation bias and their implications on the goodness of fit.

a list object containing the list elements:

p.value : the p-value quantifying the statistical significance (low-high-high pattern) of the given phylotranscriptomics pattern.

std.dev : the standard deviation of the N sampled phylotranscriptomics patterns for each developmental stage S.

lillie.test : a boolean value specifying whether the Lillifors KS-Test returned a p-value > 0.05, which indicates that fitting the permuted scores with a normal distribution seems plausible.

Hajk-Georg Drost

Drost HG et al. (2015) Mol Biol Evol. 32 (5): 1221-1231 doi:10.1093/molbev/msv012

Quint M et al. (2012). A transcriptomic hourglass in plant embryogenesis. Nature (490): 98-101.

Piasecka B, Lichocki P, Moretti S, et al. (2013) The hourglass and the early conservation models co-existing patterns of developmental constraints in vertebrates. PLoS Genet. 9(4): e1003476.

ecScore, bootMatrix, FlatLineTest,ReductiveHourglassTest, ReverseHourglassTest, PlotSignature

data(PhyloExpressionSetExample)

# perform the early conservation test for a PhyloExpressionSet
# here the prior biological knowledge is that stages 1-2 correspond to module 1 = early,
# stages 3-5 to module 2 = mid (phylotypic module), and stages 6-7 correspond to
# module 3 = late
EarlyConservationTest(PhyloExpressionSetExample,
                       modules = list(early = 1:2, mid = 3:5, late = 6:7), 
                       permutations = 1000)


# use your own permutation matrix based on which p-values (EarlyConservationTest)
# shall be computed
custom_perm_matrix <- bootMatrix(PhyloExpressionSetExample,100)

EarlyConservationTest(PhyloExpressionSetExample,
                       modules = list(early = 1:2, mid = 3:5, late = 6:7), 
                       custom.perm.matrix = custom_perm_matrix)