The Pairwise Difference Test aims to statistically evaluate the pairwise
difference in the phylotranscriptomic pattern between two contrasts based on TAI
or TDI
computations.
The corresponding p-value quantifies the probability that a given TAI or TDI pattern (or any phylotranscriptomics pattern)
does not differ from the alternative hypothesis (specifying the direction of difference).
A p-value < 0.05 indicates that the corresponding phylotranscriptomics pattern does
indeed differ.
Usage
PairwiseTest(
ExpressionSet,
modules = NULL,
altHypothesis = NULL,
permutations = 1000,
lillie.test = FALSE,
plotHistogram = FALSE,
runs = 10,
parallel = FALSE,
gof.warning = FALSE,
custom.perm.matrix = NULL
)
Arguments
- ExpressionSet
a standard PhyloExpressionSet or DivergenceExpressionSet object.
- modules
a list storing two elements: contrast1 and contrast2. Each element expects a numeric vector specifying the developmental stages or experiments that correspond to each module. For example,
module
= list(contrast1 = 1:2, contrast2 = 3:7) divides a dataset storing seven developmental stages into 2 modules (contrasts).- altHypothesis
a character string defining the alternative hypothesis (i.e. direction of difference) used to quantify the statistical significance in the present phylotranscriptomics pattern. Possible values can be:
altHypothesis
="greater"
: contrast1 > contrast2altHypothesis
="less"
: contrast1 < contrast2
- permutations
a numeric value specifying the number of permutations to be performed for the
ReductiveHourglassTest
.- lillie.test
a boolean value specifying whether the Lilliefors Kolmogorov-Smirnov Test shall be performed to quantify the goodness of fit.
- plotHistogram
a boolean value specifying whether a Lillifor's Kolmogorov-Smirnov-Test shall be performed to test the goodness of fit of the approximated distribution, as well as additional plots quantifying the significance of the observed phylotranscriptomic pattern.
- runs
specify the number of runs to be performed for goodness of fit computations, in case
plotHistogram
=TRUE
. In most casesruns
= 100 is a reasonable choice. Default isruns
= 10 (because it takes less computation time for demonstration purposes).- parallel
performing
runs
in parallel (takes all cores of your multicore machine).- gof.warning
a logical value indicating whether non significant goodness of fit results should be printed as warning. Default is
gof.warning = FALSE
.- custom.perm.matrix
a custom
bootMatrix
(permutation matrix) to perform the underlying test statistic. Default iscustom.perm.matrix = NULL
.
Value
a list object containing the list elements:
p.value
: the p-value quantifying the statistical significance of the given phylotranscriptomics pattern.
std.dev
: the standard deviation of the N sampled phylotranscriptomics patterns for each developmental stage S.
lillie.test
: a boolean value specifying whether the Lillifors KS-Test returned a p-value > 0.05,
which indicates that fitting the permuted scores with a normal distribution seems plausible.
Details
The reductive pairwise difference test is a permutation test based on the following test statistic.
(1) A PhyloExpressionSet is partitioned into contrast pairs - contrast1 and contrast2 - based on prior biological knowledge. This prior knowledge could include sexual, ecological and genetic backgrounds.
(2) The mean TAI
or TDI
value for each of the two contrasts contrast1 and contrast2 are computed.
(3) The pairwise differences D_contrast = contrast1 - contrast2 is calculated as final test statistic of the pairwise test,
when altHypothesis
is specified as "greater". When altHypothesis
is specified as "less", sign of D_contrast is reversed.
In order to determine the statistical significance of an observed pairwise difference D_contrast
the following permutation test was performed. Based on the bootMatrix
D_contrast
is calculated from each of the permuted TAI
or TDI
profiles,
approximated by a Gaussian distribution with method of moments estimated parameters returned by fitdist
,
and the corresponding p-value is computed by pnorm
given the estimated parameters of the Gaussian distribution.
The goodness of fit for the random vector D_contrast is statistically quantified by an Lilliefors (Kolmogorov-Smirnov) test
for normality.
In case the parameter plotHistogram = TRUE, a multi-plot is generated showing:
(1) A Cullen and Frey skewness-kurtosis plot generated by descdist
.
This plot illustrates which distributions seem plausible to fit the resulting permutation vector D_contrast
In the case of the pairwise difference test a normal distribution seemed plausible.
(2) A histogram of D_contrast combined with the density plot is plotted. D_contrast is then fitted by a normal distribution.
The corresponding parameters are estimated by moment matching estimation using the fitdist
function.
(3) A plot showing the p-values for N independent runs to verify that a specific p-value is biased by a specific permutation order.
(4) A barplot showing the number of cases in which the underlying goodness of fit (returned by Lilliefors (Kolmogorov-Smirnov) test
for normality) has shown to be significant (TRUE
) or not significant (FALSE
).
This allows to quantify the permutation bias and their implications on the goodness of fit.
References
Drost HG et al. (2015) Mol Biol Evol. 32 (5): 1221-1231 doi:10.1093/molbev/msv012
Lotharukpong JS et al. (2024) (unpublished)
Examples
data(PhyloExpressionSetExample)
# perform the pairwise difference test for a PhyloExpressionSet
# here the prior biological knowledge is that stages 1-2 correspond to contrast1,
# stages 3-7 correspond to contrast 2.
# We test whether TAI in contrast1 is greater than contrast 2.
PairwiseTest(PhyloExpressionSetExample,
modules = list(contrast1 = 1:2, contrast2 = 3:7),
altHypothesis = "greater",
permutations = 1000)
#>
#> [ Number of Eigen threads that are employed on your machine: 12 ]
#>
#> [ Computing age assignment permutations for test statistic ... ]
#>
[=========================================] 100%
#> [ Computing variances of permuted transcriptome signatures ... ]
#>
#> $p.value
#> [1] 0.02642072
#>
#> $std.dev
#> [1] 0.05507531 0.05456170 0.05356079 0.05198272 0.05115506 0.05237187 0.05466845
#>
#> $lillie.test
#> [1] NA
#>
# We can also test whether TAI in contrast1 is less than contrast 2.
PairwiseTest(PhyloExpressionSetExample,
modules = list(contrast1 = 1:2, contrast2 = 3:7),
altHypothesis = "less",
permutations = 1000)
#>
#> [ Number of Eigen threads that are employed on your machine: 12 ]
#>
#> [ Computing age assignment permutations for test statistic ... ]
#>
[=========================================] 100%
#> [ Computing variances of permuted transcriptome signatures ... ]
#>
#> $p.value
#> [1] 0.9710124
#>
#> $std.dev
#> [1] 0.05452892 0.05329469 0.05190185 0.05093905 0.05077639 0.05335412 0.05782893
#>
#> $lillie.test
#> [1] NA
#>
# if we only want to test whether TAI in stage 1 (contrast 1) is greater than stage 3 (contrast 2).
PairwiseTest(PhyloExpressionSetExample,
modules = list(contrast1 = 1, contrast2 = 2),
altHypothesis = "greater",
permutations = 1000)
#>
#> [ Number of Eigen threads that are employed on your machine: 12 ]
#>
#> [ Computing age assignment permutations for test statistic ... ]
#>
[=========================================] 100%
#> [ Computing variances of permuted transcriptome signatures ... ]
#>
#> $p.value
#> [1] 0.3954234
#>
#> $std.dev
#> [1] 0.05507311 0.05438198 0.05413158 0.05251364 0.05146001 0.05420949 0.05798046
#>
#> $lillie.test
#> [1] NA
#>
# use your own permutation matrix based on which p-values (PairwiseTest)
# shall be computed
custom_perm_matrix <- bootMatrix(PhyloExpressionSetExample,100)
#>
#> [ Number of Eigen threads that are employed on your machine: 12 ]
#>
#> [ Computing age assignment permutations for test statistic ... ]
#>
[=========================================] 100%
#> [ Computing variances of permuted transcriptome signatures ... ]
#>
# We test whether TAI in contrast1 is greater than contrast 2.
PairwiseTest(PhyloExpressionSetExample,
modules = list(contrast1 = 1:2, contrast2 = 3:7),
altHypothesis = "greater",
custom.perm.matrix = custom_perm_matrix)
#> $p.value
#> [1] 0.02267767
#>
#> $std.dev
#> Zygote Quadrant Globular Heart Torpedo Bent Mature
#> 0.05300417 0.05170644 0.04927963 0.04600548 0.04437752 0.04877053 0.04932402
#>
#> $lillie.test
#> [1] NA
#>