Match a Phylostratigraphic Map or Divergence Map with an ExpressionMatrix

This function matches a Phylostratigraphic Map or Divergence Map that stores only unique gene ids with an ExpressionMatrix that also stores only unique gene ids.

Usage

MatchMap(Map, ExpressionMatrix, remove.duplicates = FALSE, accumulate = NULL)

Arguments

Map: a standard Phylostratigraphic Map or Divergence Map object.
ExpressionMatrix: a standard ExpressionMatrix object.
remove.duplicates: a logical value indicating whether duplicate gene ids should be removed from the data set.
accumulate: an accumulation function such as mean(), median(), or min() to accumulate multiple expression levels that map to the same unique gene id present in the ExpressionMatrix.

Value

a standard PhyloExpressionSet or DivergenceExpressionSet object.

Details

In phylotranscriptomic analyses, two major techniques are used to obtain standard Phylostratigraphic Map or Divergence Map objects.

To obtain a Phylostratigraphic Map, Phylostratigraphy (Domazet-Loso et al., 2007) has to be performed. To obtain a Divergence Map, orthologous gene detection, Ka/Ks computations, and decilation (Quint et al., 2012; Drost et al., 2015) have to be performed.

The resulting standard Phylostratigraphic Map or Divergence Map objects consist of two columns: column one stores the phylostratum or divergence stratum assignment of a given gene, and column two stores the corresponding gene id.

A standard ExpressionMatrix is a common gene expression matrix storing the gene ids or probe ids in the first column, and all experiments/stages/replicates in the following columns.
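
The two layouts described above can be sketched with toy data. The gene ids, stage names, and values below are hypothetical and only illustrate the column order; anyDuplicated() is base R and is used here as one possible way to check the unique-id requirement.

```r
# Hypothetical toy data; ids and values are invented for illustration
toy_map <- data.frame(
  Phylostratum = c(1, 1, 2, 3),
  GeneID = c("gene_a", "gene_b", "gene_c", "gene_d"),
  stringsAsFactors = FALSE
)
toy_expression <- data.frame(
  GeneID = c("gene_a", "gene_b", "gene_c", "gene_d"),
  Stage1 = c(10.5, 3.2, 7.7, 1.1),
  Stage2 = c(11.0, 2.9, 8.1, 0.9),
  stringsAsFactors = FALSE
)

# A map has exactly two columns: stratum first, gene id second
ncol(toy_map) == 2

# anyDuplicated() returns 0 when every gene id is unique
anyDuplicated(toy_map$GeneID)
anyDuplicated(toy_expression$GeneID)
```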

The MatchMap function takes both standard data sets, Map and ExpressionMatrix, and merges them into a standard PhyloExpressionSet or DivergenceExpressionSet object.

This procedure is analogous to merge(), but is customized to the Phylostratigraphic Map, Divergence Map, and ExpressionMatrix standards to allow faster and more intuitive usage.
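
As a rough sketch of the analogy, the base R merge() call below joins a toy map and a toy expression matrix on their shared gene id column and then reorders the columns into the stratum, gene id, stages layout. The data are hypothetical, and this shows only what merge() does; it is not MatchMap's internal implementation.

```r
# Hypothetical toy data with only partially overlapping gene ids
toy_map <- data.frame(
  Phylostratum = c(1, 2, 3),
  GeneID = c("gene_a", "gene_b", "gene_c"),
  stringsAsFactors = FALSE
)
toy_expression <- data.frame(
  GeneID = c("gene_b", "gene_c", "gene_d"),
  Stage1 = c(3.2, 7.7, 1.1),
  Stage2 = c(2.9, 8.1, 0.9),
  stringsAsFactors = FALSE
)

# merge() keeps only gene ids present in both data sets (inner join)
# and places the key column first
merged <- merge(toy_map, toy_expression, by = "GeneID")

# Reorder so that the stratum comes first and the gene id second
toy_pes <- merged[, c("Phylostratum", "GeneID", "Stage1", "Stage2")]
toy_pes
```

Only gene_b and gene_c appear in the result, because merge() drops ids that are missing from either input by default.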

If your ExpressionMatrix stores multiple expression levels for the same gene id, you can specify the accumulate argument to combine these levels into a single expression level per unique gene.
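
What accumulating duplicate gene ids means can be illustrated with base R aggregate(). This is a minimal sketch on hypothetical data, using mean as the accumulation function; it shows the idea only and does not claim to reproduce how MatchMap performs the accumulation internally.

```r
# Hypothetical expression matrix in which gene_a occurs twice
toy_expression <- data.frame(
  GeneID = c("gene_a", "gene_a", "gene_b"),
  Stage1 = c(10, 20, 5),
  Stage2 = c(30, 50, 7),
  stringsAsFactors = FALSE
)

# Collapse all rows that share a gene id into one row per stage,
# here by taking the mean of the duplicated expression levels
accumulated <- aggregate(. ~ GeneID, data = toy_expression, FUN = mean)
accumulated
```

After this step each gene id occurs exactly once, which is the form the matching step expects.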

References

Domazet-Loso T, Brajkovic J, Tautz D (2007) A phylostratigraphy approach to uncover the genomic history of major adaptations in metazoan lineages. Trends Genet. 23: 533-9.

Domazet-Loso T, Tautz D (2010) A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature 468: 815-8.

Quint M, Drost HG, Gabel A, Ullrich KK, Boenn M, Grosse I (2012) A transcriptomic hourglass in plant embryogenesis. Nature 490: 98-101.

Drost HG et al. (2015) Mol Biol Evol. 32 (5): 1221-1231 doi:10.1093/molbev/msv012

Author

Hajk-Georg Drost

Examples

        
# load a standard PhyloExpressionSet
data(PhyloExpressionSetExample)
        
# in a standard PhyloExpressionSet, 
# column one and column two denote a standard 
# phylostratigraphic map
PhyloMap <- PhyloExpressionSetExample[ , 1:2]
        
# look at the phylostratigraphic map standard
head(PhyloMap)
#>   Phylostratum      GeneID
#> 1            1 at1g01040.2
#> 2            1 at1g01050.1
#> 3            1 at1g01070.1
#> 4            1 at1g01080.2
#> 5            1 at1g01090.1
#> 6            1 at1g01120.1
        
# in a standard PhyloExpressionSet, column two (gene ids) combined 
# with columns 3 to N (expression levels) denotes a standard ExpressionMatrix
ExpressionMatrixExample <- PhyloExpressionSetExample[ , 2:9]
        
# these two data sets stand in for a phylostratigraphic map 
# as returned by a standard phylostratigraphy run, 
# and for an expression set that results from 
# expression data analysis 
# (background correction, normalization, ...)
        
# now we can use the MatchMap function to merge both data sets
# to obtain a standard PhyloExpressionSet
        
PES <- MatchMap(PhyloMap, ExpressionMatrixExample)
        
# note that the function returns a head() 
# of the matched gene ids to enable
# the user to find potential mis-matches
        
# the entire procedure is analogous to merge() 
# applied to two data sets that share the gene id 
# column as primary key; note that merge() places 
# the key column (GeneID) first in its result
PES_merge <- merge(PhyloMap, ExpressionMatrixExample, by = "GeneID")