Match a Phylostratigraphic Map or Divergence Map with an ExpressionMatrix
Source: R/MatchMap.R
This function matches a Phylostratigraphic Map or Divergence Map that stores only unique gene ids with an ExpressionMatrix that also stores only unique gene ids.
Arguments
- Map
a standard Phylostratigraphic Map or Divergence Map object.
- ExpressionMatrix
a standard ExpressionMatrix object.
- remove.duplicates
a logical value indicating whether duplicate gene ids should be removed from the data set.
- accumulate
an accumulation function such as mean(), median(), or min() to accumulate multiple expression levels that map to the same unique gene id present in the ExpressionMatrix.
Details
In phylotranscriptomics analyses, two major techniques are used to obtain standard Phylostratigraphic Map or Divergence Map objects.
To obtain a Phylostratigraphic Map, Phylostratigraphy (Domazet-Loso et al., 2007) has to be performed. To obtain a Divergence Map, orthologous gene detection, Ka/Ks computations, and decilation (Quint et al., 2012; Drost et al., 2015) have to be performed.
The resulting standard Phylostratigraphic Map or Divergence Map objects consist of two columns: column one stores the phylostratum or divergence stratum assignment of a given gene, and column two stores the corresponding gene id.
A standard ExpressionMatrix is a common gene expression matrix storing the gene ids or probe ids in the first column, and all experiments/stages/replicates in the following columns.
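The two input layouts described above can be sketched with toy data frames. This is only an illustration: the gene ids, stage names, and expression values are invented, and the column names merely follow the ones shown in the Examples section.

```r
# Hypothetical toy data; gene ids and values are invented for illustration.

# A map: stratum assignment in column one, gene id in column two
toy_map <- data.frame(
  Phylostratum = c(1, 1, 2, 3),
  GeneID       = c("gene_a", "gene_b", "gene_c", "gene_d"),
  stringsAsFactors = FALSE
)

# An expression matrix: gene id in column one, stages in the following columns
toy_expr <- data.frame(
  GeneID = c("gene_a", "gene_b", "gene_c", "gene_d"),
  Stage1 = c(10.5, 3.2, 7.7, 1.1),
  Stage2 = c(11.0, 2.9, 8.1, 0.9),
  stringsAsFactors = FALSE
)

# Both inputs are expected to hold unique gene ids
stopifnot(!anyDuplicated(toy_map$GeneID), !anyDuplicated(toy_expr$GeneID))
```

Checking for duplicated gene ids before matching makes it easier to decide whether the remove.duplicates or accumulate arguments are needed.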
The MatchMap function takes both standard data sets, Map and ExpressionMatrix, and merges them into a standard PhyloExpressionSet or DivergenceExpressionSet object.
This procedure is analogous to merge(), but is customized to the Phylostratigraphic Map, Divergence Map, and ExpressionMatrix standards to allow faster and more intuitive usage.
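To make the analogy concrete, the following base R sketch joins a toy map and a toy expression matrix on their shared gene id column and then restores the stratum-first column order of a PhyloExpressionSet. It is not the implementation of MatchMap; all object names and values are hypothetical.

```r
# Hypothetical toy data; not taken from the package
toy_map <- data.frame(
  Phylostratum = c(1, 2, 3),
  GeneID       = c("gene_a", "gene_b", "gene_c"),
  stringsAsFactors = FALSE
)
toy_expr <- data.frame(
  GeneID = c("gene_c", "gene_a", "gene_b", "gene_x"),  # gene_x has no map entry
  Stage1 = c(7.7, 10.5, 3.2, 4.4),
  Stage2 = c(8.1, 11.0, 2.9, 4.0),
  stringsAsFactors = FALSE
)

# merge() performs an inner join by default: only gene ids present
# in both data sets are kept, and the key column is placed first
merged <- merge(toy_map, toy_expr, by = "GeneID")

# Reorder so that the stratum is column one and the gene id column two
stage_cols <- setdiff(names(merged), c("Phylostratum", "GeneID"))
toy_pes <- merged[, c("Phylostratum", "GeneID", stage_cols)]
```

In this sketch, gene_x is dropped because it has no stratum assignment, which is the usual behavior of an inner join on the gene id key.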
If you work with an ExpressionMatrix that stores multiple expression levels for a single gene id, you can specify the accumulate argument to combine these levels into one expression level per unique gene.
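The idea of accumulating several expression levels per gene id can be sketched in base R with aggregate(). This only illustrates the concept with mean() as the accumulation function; how MatchMap performs the accumulation internally is not shown here, and the data are invented.

```r
# Hypothetical expression matrix in which gene_a appears twice
toy_expr_dup <- data.frame(
  GeneID = c("gene_a", "gene_a", "gene_b"),
  Stage1 = c(10, 20, 5),
  Stage2 = c(30, 50, 7),
  stringsAsFactors = FALSE
)

# Collapse rows that share a gene id by applying the accumulation
# function (here mean) to every stage column
collapsed <- aggregate(. ~ GeneID, data = toy_expr_dup, FUN = mean)

# After accumulation each gene id occurs exactly once
stopifnot(!anyDuplicated(collapsed$GeneID))
```

Swapping mean for median or min changes how the duplicated rows are summarized, which mirrors the choice offered by the accumulate argument.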
References
Domazet-Loso T, Brajkovic J, Tautz D (2007) A phylostratigraphy approach to uncover the genomic history of major adaptations in metazoan lineages. Trends Genet. 23: 533-9.
Domazet-Loso T, Tautz D (2010) A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature 468: 815-8.
Quint M, Drost HG, Gabel A, Ullrich KK, Boenn M, Grosse I (2012) A transcriptomic hourglass in plant embryogenesis. Nature 490: 98-101.
Drost HG et al. (2015) Mol Biol Evol. 32 (5): 1221-1231 doi:10.1093/molbev/msv012
Examples
# load a standard PhyloExpressionSet
data(PhyloExpressionSetExample)
# in a standard PhyloExpressionSet,
# column one and column two denote a standard
# phylostratigraphic map
PhyloMap <- PhyloExpressionSetExample[ , 1:2]
# look at the phylostratigraphic map standard
head(PhyloMap)
#> Phylostratum GeneID
#> 1 1 at1g01040.2
#> 2 1 at1g01050.1
#> 3 1 at1g01070.1
#> 4 1 at1g01080.2
#> 5 1 at1g01090.1
#> 6 1 at1g01120.1
# in a standard PhyloExpressionSet, column two combined
# with columns 3 to N denotes a standard ExpressionMatrix
ExpressionMatrixExample <- PhyloExpressionSetExample[ , c(2,3:9)]
# these two data sets shall illustrate an example
# phylostratigraphic map that is returned
# by a standard phylostratigraphy run, and an expression set
# that is the result of expression data analysis
# (background correction, normalization, ...)
# now we can use the MatchMap function to merge both data sets
# to obtain a standard PhyloExpressionSet
PES <- MatchMap(PhyloMap, ExpressionMatrixExample)
# note that the function returns a head()
# of the matched gene ids to enable
# the user to find potential mis-matches
# the entire procedure is analogous to merge()
# with two data sets sharing the same gene ids
# as column (primary key)
PES_merge <- merge(PhyloMap, ExpressionMatrixExample)