This function matches a Phylostratigraphic Map or Divergence Map that stores only unique gene ids with an ExpressionMatrix that also stores only unique gene ids.

MatchMap(Map, ExpressionMatrix, remove.duplicates = FALSE, accumulate = NULL)



Map
a standard Phylostratigraphic Map or Divergence Map object.


ExpressionMatrix
a standard ExpressionMatrix object.


remove.duplicates
a logical value indicating whether duplicate gene ids should be removed from the data set.


accumulate
an accumulation function such as mean(), median(), or min() to accumulate multiple expression levels that map to the same unique gene id present in the ExpressionMatrix.


a standard PhyloExpressionSet or DivergenceExpressionSet object.


In phylotranscriptomic analyses, two major techniques are used to obtain standard Phylostratigraphic Map or Divergence Map objects.

To obtain a Phylostratigraphic Map, Phylostratigraphy (Domazet-Loso et al., 2007) has to be performed. To obtain a Divergence Map, orthologous gene detection, Ka/Ks computations, and decilation (Quint et al., 2012; Drost et al., 2015) have to be performed.
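The last step of the Divergence Map workflow can be sketched in base R, reading "decilation" as the assignment of each gene to a decile of its Ka/Ks value. That reading, the Ka/Ks values, the gene ids, and the column names below are assumptions made for illustration; the sketch does not reproduce the pipeline of the cited papers.

```r
# Hypothetical Ka/Ks values for 20 genes (invented for illustration)
kaks <- c(0.05, 0.12, 0.20, 0.31, 0.07, 0.45, 0.60, 0.09, 0.15, 0.27,
          0.33, 0.51, 0.02, 0.18, 0.39, 0.72, 0.24, 0.11, 0.56, 0.66)
names(kaks) <- paste0("gene_", seq_along(kaks))

# Decile boundaries of the Ka/Ks distribution (0%, 10%, ..., 100%)
breaks <- quantile(kaks, probs = seq(0, 1, by = 0.1))

# Assign each gene to a decile, coded 1 (lowest Ka/Ks) to 10 (highest Ka/Ks)
stratum <- cut(kaks, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Two-column layout: stratum assignment first, gene id second
toy_divergence_map <- data.frame(
  DivergenceStratum = stratum,
  GeneID            = names(kaks),
  stringsAsFactors  = FALSE
)
head(toy_divergence_map)
```

Real Ka/Ks data can contain ties, in which case the quantile breaks may not be unique and cut() would need additional handling.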

The resulting standard Phylostratigraphic Map or Divergence Map objects consist of two columns: column one stores the phylostratum or divergence stratum assignment of a given gene, and column two stores the corresponding gene id.

A standard ExpressionMatrix is a common gene expression matrix storing the gene ids or probe ids in the first column, and all experiments/stages/replicates in the following columns.
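The two input layouts described above can be illustrated with toy data frames. This is a minimal sketch in base R; the gene ids, column names, and expression values are invented for illustration and are not taken from any package data set.

```r
# Toy Phylostratigraphic Map: stratum assignment in column one,
# gene id in column two (all values are hypothetical)
toy_map <- data.frame(
  Phylostratum     = c(1, 1, 2, 3),
  GeneID           = c("gene_a", "gene_b", "gene_c", "gene_d"),
  stringsAsFactors = FALSE
)

# Toy ExpressionMatrix: gene id in column one, one column per stage
toy_expr <- data.frame(
  GeneID           = c("gene_a", "gene_b", "gene_c", "gene_d"),
  Stage1           = c(10.5, 3.2, 7.7, 1.1),
  Stage2           = c(11.0, 2.9, 8.1, 0.9),
  stringsAsFactors = FALSE
)

# Both toy objects store each gene id only once
stopifnot(!anyDuplicated(toy_map$GeneID), !anyDuplicated(toy_expr$GeneID))

str(toy_map)
str(toy_expr)
```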

The MatchMap function takes the two standard data sets, Map and ExpressionMatrix, and merges them into a standard PhyloExpressionSet or DivergenceExpressionSet object.

This procedure is analogous to merge(), but is customized to the Phylostratigraphic Map, Divergence Map, and ExpressionMatrix standards to allow faster and more intuitive usage.
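The analogy to merge() can be made concrete with a small base R sketch. It uses invented toy data and only shows what a plain merge() on the gene id column does; how MatchMap itself treats unmatched gene ids is not stated here, so the sketch should not be taken as a description of its internals.

```r
# Toy map and expression matrix that share only some gene ids (hypothetical values)
toy_map <- data.frame(
  Phylostratum     = c(1, 2, 3),
  GeneID           = c("gene_a", "gene_b", "gene_c"),
  stringsAsFactors = FALSE
)
toy_expr <- data.frame(
  GeneID           = c("gene_c", "gene_a", "gene_x"),
  Stage1           = c(7.7, 10.5, 4.0),
  Stage2           = c(8.1, 11.0, 4.2),
  stringsAsFactors = FALSE
)

# Inner join on the shared gene id column (the primary key);
# gene_b and gene_x have no partner and are dropped by merge()
merged <- merge(toy_map, toy_expr, by = "GeneID")

# merge() places the key column first; reorder so that the stratum
# is in column one and the gene id in column two
merged <- merged[, c("Phylostratum", "GeneID", "Stage1", "Stage2")]
print(merged)
```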

If you work with an ExpressionMatrix that stores multiple expression levels for the same gene id, you can specify the accumulate argument to collapse these multiple expression levels into one expression level per unique gene id.
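The idea of accumulating several expression levels per gene id can be sketched with base R's aggregate(). The toy data and the name accumulate_fun are hypothetical, and this is only an illustration of the concept, not the code MatchMap runs.

```r
# Toy ExpressionMatrix in which gene_a appears twice (hypothetical values)
toy_expr <- data.frame(
  GeneID           = c("gene_a", "gene_a", "gene_b"),
  Stage1           = c(10, 12, 3),
  Stage2           = c(20, 22, 5),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the function passed as the accumulate argument
accumulate_fun <- mean

# Collapse rows that share a gene id by applying the function
# to each stage column separately
collapsed <- aggregate(. ~ GeneID, data = toy_expr, FUN = accumulate_fun)
print(collapsed)
```

Swapping mean for median or min changes only how the duplicate rows are summarized; the result still has one row per gene id.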


Domazet-Loso T, Brajkovic J, Tautz D (2007) A phylostratigraphy approach to uncover the genomic history of major adaptations in metazoan lineages. Trends Genet. 23: 533-9.

Domazet-Loso T, Tautz D (2010) A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature 468: 815-8.

Quint M, Drost HG, Gabel A, Ullrich KK, Boenn M, Grosse I (2012) A transcriptomic hourglass in plant embryogenesis. Nature 490: 98-101.

Drost HG et al. (2015) Mol Biol Evol. 32 (5): 1221-1231 doi:10.1093/molbev/msv012


Hajk-Georg Drost


# load a standard PhyloExpressionSet
data(PhyloExpressionSetExample)
# in a standard PhyloExpressionSet, 
# column one and column two denote a standard 
# phylostratigraphic map
PhyloMap <- PhyloExpressionSetExample[ , 1:2]
# look at the phylostratigraphic map standard
head(PhyloMap)
#>   Phylostratum      GeneID
#> 1            1 at1g01040.2
#> 2            1 at1g01050.1
#> 3            1 at1g01070.1
#> 4            1 at1g01080.2
#> 5            1 at1g01090.1
#> 6            1 at1g01120.1
# in a standard PhyloExpressionSet, column two combined 
# with columns 3 to N denotes a standard ExpressionMatrix
ExpressionMatrixExample <- PhyloExpressionSetExample[ , 2:9]
# these two data sets shall illustrate an example 
# phylostratigraphic map that is returned
# by a standard phylostratigraphy run, and an expression set 
# that is the result of expression data analysis 
# (background correction, normalization, ...)
# now we can use the MatchMap function to merge both data sets
# to obtain a standard PhyloExpressionSet
PES <- MatchMap(PhyloMap, ExpressionMatrixExample)
# note that the function returns a head() 
# of the matched gene ids to enable
# the user to find potential mis-matches
# the entire procedure is analogous to merge() 
# with two data sets sharing the same gene ids 
# as column (primary key)
PES_merge <- merge(PhyloMap, ExpressionMatrixExample)