R/MatchMap.R
MatchMap.Rd
This function matches a Phylostratigraphic Map or Divergence Map that stores only unique gene ids with an ExpressionMatrix that also stores only unique gene ids.
MatchMap(Map, ExpressionMatrix, remove.duplicates = FALSE, accumulate = NULL)
Argument | Description |
---|---|
Map | a standard Phylostratigraphic Map or Divergence Map object. |
ExpressionMatrix | a standard ExpressionMatrix object. |
remove.duplicates | a logical value indicating whether duplicate gene ids should be removed from the data set. |
accumulate | an accumulation function such as |
a standard PhyloExpressionSet or DivergenceExpressionSet object.
In phylotranscriptomics analyses, two major techniques are used to obtain standard Phylostratigraphic Map or Divergence Map objects.
To obtain a Phylostratigraphic Map, Phylostratigraphy (Domazet-Loso et al., 2007) has to be performed. To obtain a Divergence Map, orthologous gene detection, Ka/Ks computations, and decilation (Quint et al., 2012; Drost et al., 2015) have to be performed.
The resulting standard Phylostratigraphic Map or Divergence Map objects consist of two columns: column one stores the phylostratum assignment or divergence stratum assignment of a given gene, and column two stores the corresponding gene id of that gene.
A standard ExpressionMatrix is a common gene expression matrix storing the gene ids or probe ids in the first column, and all experiments/stages/replicates in the following columns.
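To make the two input standards concrete, here is a minimal sketch in base R. The gene ids, stage names, and values are made up for illustration and do not come from any real data set.

```r
# Hypothetical toy data: gene ids, stage names, and values are invented.

# Map standard: column one = stratum assignment, column two = gene id
toy_map <- data.frame(
  Phylostratum = c(1, 1, 2, 3),
  GeneID       = c("gene_a", "gene_b", "gene_c", "gene_d"),
  stringsAsFactors = FALSE
)

# ExpressionMatrix standard: column one = gene id,
# remaining columns = experiments/stages/replicates
toy_expression <- data.frame(
  GeneID = c("gene_c", "gene_a", "gene_d", "gene_b"),
  Stage1 = c(10.5, 3.2, 7.7, 1.1),
  Stage2 = c(11.0, 2.9, 8.1, 0.9),
  stringsAsFactors = FALSE
)

# both standards are expected to store unique gene ids;
# anyDuplicated() returns 0 when no id is repeated
map_ids_unique  <- anyDuplicated(toy_map$GeneID) == 0
expr_ids_unique <- anyDuplicated(toy_expression$GeneID) == 0
```

Checking for duplicated ids before matching is a cheap way to see whether the `remove.duplicates` or `accumulate` arguments are likely to be needed.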
The MatchMap function takes both standard data sets, Map and ExpressionMatrix, and merges them into a standard PhyloExpressionSet or DivergenceExpressionSet object.
This procedure is analogous to merge(), but is customized to the Phylostratigraphic Map, Divergence Map, and ExpressionMatrix standards to allow a faster and more intuitive usage.
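The analogy to merge() can be sketched with base R alone. This is an illustration on made-up toy data, not the internal implementation of MatchMap; the column reordering step is an assumption about how one would arrive at the layout described above (stratum, gene id, then expression columns).

```r
# Hypothetical toy data: gene ids and values are invented.
toy_map <- data.frame(
  Phylostratum = c(1, 2, 3),
  GeneID       = c("gene_a", "gene_b", "gene_c"),
  stringsAsFactors = FALSE
)
toy_expression <- data.frame(
  GeneID = c("gene_b", "gene_a"),
  Stage1 = c(5.0, 2.0),
  Stage2 = c(6.0, 3.0),
  stringsAsFactors = FALSE
)

# merge() joins on the shared gene id column (the primary key);
# with the default all = FALSE, ids present in only one table are dropped
merged <- merge(toy_map, toy_expression, by = "GeneID")

# merge() places the key column first, so reorder the columns to
# stratum, gene id, expression levels
pes_like <- merged[, c("Phylostratum", "GeneID", "Stage1", "Stage2")]
```

Here `gene_c` has no expression data and therefore does not appear in `pes_like`.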
If you work with an ExpressionMatrix that stores multiple expression levels for a single gene id, you can specify the accumulate argument to combine these multiple expression levels into one expression level per unique gene.
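The idea of accumulating multiple expression levels per gene id can be sketched with base R's aggregate(). This is only an illustration of the concept on made-up data: the choice of mean as the accumulation function is an assumption, and the code does not claim to reproduce what MatchMap does internally.

```r
# Hypothetical toy data: "gene_a" appears twice (e.g. two probes).
toy_expression <- data.frame(
  GeneID = c("gene_a", "gene_a", "gene_b"),
  Stage1 = c(2, 4, 10),
  Stage2 = c(6, 8, 20),
  stringsAsFactors = FALSE
)

# collapse all rows sharing a gene id into one row per gene,
# applying the accumulation function (here: mean) to every stage column
accumulated <- aggregate(. ~ GeneID, data = toy_expression, FUN = mean)
```

After this step each gene id occurs exactly once, which is the form the ExpressionMatrix standard expects.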
Domazet-Loso T, Brajkovic J, Tautz D (2007) A phylostratigraphy approach to uncover the genomic history of major adaptations in metazoan lineages. Trends Genet. 23: 533-9.
Domazet-Loso T, Tautz D (2010) A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature 468: 815-8.
Quint M, Drost HG, Gabel A, Ullrich KK, Boenn M, Grosse I (2012) A transcriptomic hourglass in plant embryogenesis. Nature 490: 98-101.
Drost HG et al. (2015) Mol Biol Evol. 32 (5): 1221-1231. doi:10.1093/molbev/msv012
Hajk-Georg Drost
# load a standard PhyloExpressionSet
data(PhyloExpressionSetExample)

# in a standard PhyloExpressionSet,
# column one and column two denote a standard
# phylostratigraphic map
PhyloMap <- PhyloExpressionSetExample[ , 1:2]

# look at the phylostratigraphic map standard
head(PhyloMap)
#>   Phylostratum      GeneID
#> 1            1 at1g01040.2
#> 2            1 at1g01050.1
#> 3            1 at1g01070.1
#> 4            1 at1g01080.2
#> 5            1 at1g01090.1
#> 6            1 at1g01120.1

# in a standard PhyloExpressionSet, column two combined
# with columns 3 - N denote a standard ExpressionMatrix
ExpressionMatrixExample <- PhyloExpressionSetExample[ , c(2,3:9)]

# these two data sets shall illustrate an example
# phylostratigraphic map that is returned
# by a standard phylostratigraphy run, and an expression set
# that is the result of expression data analysis
# (background correction, normalization, ...)

# now we can use the MatchMap function to merge both data sets
# to obtain a standard PhyloExpressionSet
PES <- MatchMap(PhyloMap, ExpressionMatrixExample)

# note that the function returns a head()
# of the matched gene ids to enable
# the user to find potential mismatches

# the entire procedure is analogous to merge()
# with two data sets sharing the same gene ids
# as column (primary key)
PES_merge <- merge(PhyloMap, ExpressionMatrixExample)