The *Reverse Hourglass Test* aims to statistically evaluate the
existence of a reverse hourglass pattern based on `TAI`

or `TDI`

computations.
The corresponding p-value quantifies the probability that a given TAI or TDI pattern (or any phylotranscriptomics pattern)
does follow an hourglass like shape. A p-value < 0.05 indicates that the corresponding phylotranscriptomics pattern does
rather follow a reverse hourglass (low-high-low) shape.

ReverseHourglassTest( ExpressionSet, modules = NULL, permutations = 1000, lillie.test = FALSE, plotHistogram = FALSE, runs = 10, parallel = FALSE, gof.warning = FALSE, custom.perm.matrix = NULL )

ExpressionSet | a standard PhyloExpressionSet or DivergenceExpressionSet object. |
---|---|

modules | a list storing three elements: early, mid, and late. Each element expects a numeric
vector specifying the developmental stages or experiments that correspond to each module.
For example, |

permutations | a numeric value specifying the number of permutations to be performed for the |

lillie.test | a boolean value specifying whether the Lilliefors Kolmogorov-Smirnov Test shall be performed to quantify the goodness of fit. |

plotHistogram | a boolean value specifying whether a |

runs | specify the number of runs to be performed for goodness of fit computations, in case |

parallel | performing |

gof.warning | a logical value indicating whether non significant goodness of fit results should be printed as warning. Default is |

custom.perm.matrix | a custom |

a list object containing the list elements:

`p.value`

: the p-value quantifying the statistical significance (low-high-low pattern) of the given phylotranscriptomics pattern.

`std.dev`

: the standard deviation of the N sampled phylotranscriptomics patterns for each developmental stage S.

`lillie.test`

: a boolean value specifying whether the *Lillifors KS-Test* returned a p-value > 0.05,
which indicates that fitting the permuted scores with a normal distribution seems plausible.

The reverse hourglass test is a permutation test based on the following test statistic.

(1) A set of developmental stages is partitioned into three modules - early, mid, and late - based on prior biological knowledge.

(2) The mean `TAI`

or `TDI`

value for each of the three modules T_early, T_mid, and T_late are computed.

(3) The two differences `D1 = T_mid - T_early`

and `D2 = T_mid - T_late`

are calculated.

(4) The minimum `D_min`

of `D1`

and `D2`

is computed as final test statistic of the reductive hourglass test.

In order to determine the statistical significance of an observed minimum difference D_min
the following permutation test was performed. Based on the `bootMatrix`

D_min
is calculated from each of the permuted `TAI`

or `TDI`

profiles,
approximated by a Gaussian distribution with method of moments estimated parameters returned by `fitdist`

,
and the corresponding p-value is computed by `pnorm`

given the estimated parameters of the Gaussian distribution.
The *goodness of fit* for the random vector *D_min* is statistically quantified by an Lilliefors (Kolmogorov-Smirnov) test for normality.

In case the parameter *plotHistogram = TRUE*, a multi-plot is generated showing:

(1) A Cullen and Frey skewness-kurtosis plot generated by `descdist`

.
This plot illustrates which distributions seem plausible to fit the resulting permutation vector D_min.
In the case of the *Reverse Hourglass Test* a normal distribution seemed plausible.

(2) A histogram of `D_min`

combined with the density plot is plotted. D_min is then fitted by a normal distribution.
The corresponding parameters are estimated by *moment matching estimation* using the `fitdist`

function.

(3) A plot showing the p-values for N independent runs to verify that a specific p-value is biased by a specific permutation order.

(4) A barplot showing the number of cases in which the underlying goodness of fit (returned by Lilliefors (Kolmogorov-Smirnov) test
for normality) has shown to be significant (`TRUE`

) or not significant (`FALSE`

).
This allows to quantify the permutation bias and their implications on the goodness of fit.

Drost HG et al. (2015) Mol Biol Evol. 32 (5): 1221-1231 doi:10.1093/molbev/msv012

Quint M et al. (2012). A transcriptomic hourglass in plant embryogenesis. Nature (490): 98-101.

M. L. Delignette-Muller, R. Pouillot, J.-B. Denis and C. Dutang (2014), fitdistrplus: help to fit of a parametric distribution to non-censored or censored data.

Cullen AC and Frey HC (1999) Probabilistic techniques in exposure assessment. Plenum Press, USA, pp. 81-159.

Evans M, Hastings N and Peacock B (2000) Statistical distributions. John Wiley and Sons Inc.

Sokal RR and Rohlf FJ (1995) Biometry. W.H. Freeman and Company, USA, pp. 111-115.

Juergen Gross and bug fixes by Uwe Ligges (2012). nortest: Tests for Normality. R package version 1.0-2.

http://CRAN.R-project.org/package=nortest

Dallal, G.E. and Wilkinson, L. (1986): An analytic approximation to the distribution of Lilliefors' test for normality. The American Statistician, 40, 294-296.

Stephens, M.A. (1974): EDF statistics for goodness of fit and some comparisons. Journal of the American Statistical Association, 69, 730-737.

http://stackoverflow.com/questions/4290081/fitting-data-to-distributions?rq=1

http://stats.stackexchange.com/questions/45033/can-i-use-kolmogorov-smirnov-test-and-estimate-distribution-parameters

http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf

http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf

Hajk-Georg Drost

data(PhyloExpressionSetExample) # perform the reductive hourglass test for a PhyloExpressionSet # here the prior biological knowledge is that stages 1-2 correspond to module 1 = early, # stages 3-5 to module 2 = mid (phylotypic module), and stages 6-7 correspond to # module 3 = late ReverseHourglassTest(PhyloExpressionSetExample, modules = list(early = 1:2, mid = 3:5, late = 6:7), permutations = 1000) #> $p.value #> [1] 1 #> #> $std.dev #> [1] 0.05538429 0.05429991 0.05291710 0.05192147 0.05121974 0.05251223 0.05576605 #> #> $lillie.test #> [1] NA #> # use your own permutation matrix based on which p-values (ReverseHourglassTest) # shall be computed custom_perm_matrix <- bootMatrix(PhyloExpressionSetExample,100) ReverseHourglassTest(PhyloExpressionSetExample, modules = list(early = 1:2, mid = 3:5, late = 6:7), custom.perm.matrix = custom_perm_matrix) #> $p.value #> [1] 1 #> #> $std.dev #> Zygote Quadrant Globular Heart Torpedo Bent Mature #> 0.05778972 0.05414917 0.05495788 0.05454285 0.05083340 0.05444174 0.06293317 #> #> $lillie.test #> [1] NA #>