Perform Pairwise Difference Test

The Pairwise Difference Test aims to statistically evaluate the pairwise difference in the phylotranscriptomic pattern between two contrasts based on TAI or TDI computations. The corresponding p-value quantifies the probability that a given TAI or TDI pattern (or any phylotranscriptomics pattern) does not differ from the alternative hypothesis (specifying the direction of difference). A p-value < 0.05 indicates that the corresponding phylotranscriptomics pattern does indeed differ.

Usage

PairwiseTest(
  ExpressionSet,
  modules = NULL,
  altHypothesis = NULL,
  permutations = 1000,
  lillie.test = FALSE,
  plotHistogram = FALSE,
  runs = 10,
  parallel = FALSE,
  gof.warning = FALSE,
  custom.perm.matrix = NULL
)

Arguments

ExpressionSet

a standard PhyloExpressionSet or DivergenceExpressionSet object.

modules

a list storing two elements: contrast1 and contrast2. Each element expects a numeric vector specifying the developmental stages or experiments that correspond to each module. For example, module = list(contrast1 = 1:2, contrast2 = 3:7) divides a dataset storing seven developmental stages into 2 modules (contrasts).

altHypothesis

a character string defining the alternative hypothesis (i.e. direction of difference) used to quantify the statistical significance in the present phylotranscriptomics pattern. Possible values can be:

altHypothesis = "greater" : contrast1 > contrast2
altHypothesis = "less" : contrast1 < contrast2

permutations

a numeric value specifying the number of permutations to be performed for the ReductiveHourglassTest.

lillie.test

a boolean value specifying whether the Lilliefors Kolmogorov-Smirnov Test shall be performed to quantify the goodness of fit.

plotHistogram

a boolean value specifying whether a Lillifor's Kolmogorov-Smirnov-Test shall be performed to test the goodness of fit of the approximated distribution, as well as additional plots quantifying the significance of the observed phylotranscriptomic pattern.

runs

specify the number of runs to be performed for goodness of fit computations, in case plotHistogram = TRUE. In most cases runs = 100 is a reasonable choice. Default is runs = 10 (because it takes less computation time for demonstration purposes).

parallel

performing runs in parallel (takes all cores of your multicore machine).

gof.warning

a logical value indicating whether non significant goodness of fit results should be printed as warning. Default is gof.warning = FALSE.

custom.perm.matrix

a custom bootMatrix (permutation matrix) to perform the underlying test statistic. Default is custom.perm.matrix = NULL.

Value

a list object containing the list elements:

p.value : the p-value quantifying the statistical significance of the given phylotranscriptomics pattern.

std.dev : the standard deviation of the N sampled phylotranscriptomics patterns for each developmental stage S.

lillie.test : a boolean value specifying whether the Lillifors KS-Test returned a p-value > 0.05, which indicates that fitting the permuted scores with a normal distribution seems plausible.

Details

The reductive pairwise difference test is a permutation test based on the following test statistic.

(1) A PhyloExpressionSet is partitioned into contrast pairs - contrast1 and contrast2 - based on prior biological knowledge. This prior knowledge could include sexual, ecological and genetic backgrounds.

(2) The mean TAI or TDI value for each of the two contrasts contrast1 and contrast2 are computed.

(3) The pairwise differences D_contrast = contrast1 - contrast2 is calculated as final test statistic of the pairwise test, when altHypothesis is specified as "greater". When altHypothesis is specified as "less", sign of D_contrast is reversed.

In order to determine the statistical significance of an observed pairwise difference D_contrast the following permutation test was performed. Based on the bootMatrix D_contrast is calculated from each of the permuted TAI or TDI profiles, approximated by a Gaussian distribution with method of moments estimated parameters returned by fitdist, and the corresponding p-value is computed by pnorm given the estimated parameters of the Gaussian distribution. The goodness of fit for the random vector D_contrast is statistically quantified by an Lilliefors (Kolmogorov-Smirnov) test for normality.

In case the parameter plotHistogram = TRUE, a multi-plot is generated showing:

(1) A Cullen and Frey skewness-kurtosis plot generated by descdist. This plot illustrates which distributions seem plausible to fit the resulting permutation vector D_contrast In the case of the pairwise difference test a normal distribution seemed plausible.

(2) A histogram of D_contrast combined with the density plot is plotted. D_contrast is then fitted by a normal distribution. The corresponding parameters are estimated by moment matching estimation using the fitdist function.

(3) A plot showing the p-values for N independent runs to verify that a specific p-value is biased by a specific permutation order.

(4) A barplot showing the number of cases in which the underlying goodness of fit (returned by Lilliefors (Kolmogorov-Smirnov) test for normality) has shown to be significant (TRUE) or not significant (FALSE). This allows to quantify the permutation bias and their implications on the goodness of fit.

References

Drost HG et al. (2015) Mol Biol Evol. 32 (5): 1221-1231 doi:10.1093/molbev/msv012

Lotharukpong JS et al. (2024) (unpublished)

Author

Hajk-Georg Drost and Jaruwatana Sodai Lotharukpong

Examples


data(PhyloExpressionSetExample)

# perform the pairwise difference test for a PhyloExpressionSet
# here the prior biological knowledge is that stages 1-2 correspond to contrast1,
# stages 3-7 correspond to contrast 2.
# We test whether TAI in contrast1 is greater than contrast 2.
PairwiseTest(PhyloExpressionSetExample,
         modules = list(contrast1 = 1:2, contrast2 = 3:7), 
         altHypothesis = "greater",
         permutations = 1000)
#> 
#> [ Number of Eigen threads that are employed on your machine: 12 ]
#> 
#> [ Computing age assignment permutations for test statistic ... ]
#> 
[=========================================] 100%   
#> [ Computing variances of permuted transcriptome signatures ... ]
#> 
#> $p.value
#> [1] 0.02642072
#> 
#> $std.dev
#> [1] 0.05507531 0.05456170 0.05356079 0.05198272 0.05115506 0.05237187 0.05466845
#> 
#> $lillie.test
#> [1] NA
#> 

# We can also test whether TAI in contrast1 is less than contrast 2.
PairwiseTest(PhyloExpressionSetExample,
         modules = list(contrast1 = 1:2, contrast2 = 3:7), 
         altHypothesis = "less",
         permutations = 1000)
#> 
#> [ Number of Eigen threads that are employed on your machine: 12 ]
#> 
#> [ Computing age assignment permutations for test statistic ... ]
#> 
[=========================================] 100%   
#> [ Computing variances of permuted transcriptome signatures ... ]
#> 
#> $p.value
#> [1] 0.9710124
#> 
#> $std.dev
#> [1] 0.05452892 0.05329469 0.05190185 0.05093905 0.05077639 0.05335412 0.05782893
#> 
#> $lillie.test
#> [1] NA
#> 

# if we only want to test whether TAI in stage 1 (contrast 1) is greater than stage 3 (contrast 2).
PairwiseTest(PhyloExpressionSetExample,
         modules = list(contrast1 = 1, contrast2 = 2), 
         altHypothesis = "greater",
         permutations = 1000)
#> 
#> [ Number of Eigen threads that are employed on your machine: 12 ]
#> 
#> [ Computing age assignment permutations for test statistic ... ]
#> 
[=========================================] 100%   
#> [ Computing variances of permuted transcriptome signatures ... ]
#> 
#> $p.value
#> [1] 0.3954234
#> 
#> $std.dev
#> [1] 0.05507311 0.05438198 0.05413158 0.05251364 0.05146001 0.05420949 0.05798046
#> 
#> $lillie.test
#> [1] NA
#> 

# use your own permutation matrix based on which p-values (PairwiseTest)
# shall be computed
custom_perm_matrix <- bootMatrix(PhyloExpressionSetExample,100)
#> 
#> [ Number of Eigen threads that are employed on your machine: 12 ]
#> 
#> [ Computing age assignment permutations for test statistic ... ]
#> 
[=========================================] 100%   
#> [ Computing variances of permuted transcriptome signatures ... ]
#> 
# We test whether TAI in contrast1 is greater than contrast 2.
PairwiseTest(PhyloExpressionSetExample,
         modules = list(contrast1 = 1:2, contrast2 = 3:7), 
         altHypothesis = "greater",
         custom.perm.matrix = custom_perm_matrix)
#> $p.value
#> [1] 0.02267767
#> 
#> $std.dev
#>     Zygote   Quadrant   Globular      Heart    Torpedo       Bent     Mature 
#> 0.05300417 0.05170644 0.04927963 0.04600548 0.04437752 0.04877053 0.04932402 
#> 
#> $lillie.test
#> [1] NA
#>