Abstract S7 base class for storing and manipulating phylotranscriptomic expression data. This class provides the common interface for both bulk and single-cell phylotranscriptomic data.
Usage
PhyloExpressionSetBase(
  strata = stop("@strata is required"),
  strata_values = stop("@strata_values is required"),
  expression = stop("@expression is required"),
  groups = stop("@groups is required"),
  name = "Phylo Expression Set",
  species = character(0),
  index_type = "TXI",
  identities_label = "Identities",
  null_conservation_sample_size = 5000L,
  .null_conservation_txis = NULL
)
Arguments
- strata
Factor vector of phylostratum assignments for each gene
- strata_values
Numeric vector of phylostratum values used in TXI calculations
- expression
Matrix of expression counts with genes as rows and samples as columns
- groups
Factor vector indicating which identity each sample belongs to
- name
Character string naming the dataset (default: "Phylo Expression Set")
- species
Character string specifying the species (default: character(0))
- index_type
Character string specifying the transcriptomic index type (default: "TXI")
- identities_label
Character string labeling the identities (default: "Identities")
- null_conservation_sample_size
Integer sample size for the null conservation distribution (default: 5000L)
- .null_conservation_txis
Precomputed null conservation TXI values (default: NULL)
Details
The PhyloExpressionSetBase class serves as the foundation for phylotranscriptomic analysis, providing shared functionality for both bulk and single-cell data types.
Abstract Properties: Subclasses must implement the expression_collapsed property to define how expression data should be collapsed across replicates or cells.
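As an illustration of what collapsing can look like, the following base R sketch averages replicate columns within each identity. It does not use the package itself: the helper collapse_by_group is a hypothetical name, the toy values are made up, and the choice of the mean as the aggregation is an assumption (a subclass may collapse differently).

```r
# Toy data: 3 genes x 4 samples, two replicates per identity (values made up).
expression <- matrix(
  c(10, 12, 20, 22,
     5,  7,  1,  3,
     0,  2,  8, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3", "s4"))
)
groups <- factor(c("early", "early", "late", "late"),
                 levels = c("early", "late"))

# Hypothetical helper (not part of the package): average the columns that
# belong to each identity. Using the mean is an assumption of this sketch.
collapse_by_group <- function(expression, groups) {
  per_group <- lapply(levels(groups), function(g) {
    rowMeans(expression[, groups == g, drop = FALSE])
  })
  collapsed <- do.call(cbind, per_group)
  colnames(collapsed) <- levels(groups)
  collapsed
}

# One column per identity, one row per gene.
expression_collapsed <- collapse_by_group(expression, groups)
```

The result has genes as rows and identities as columns, which matches the description of identities as the colnames of the collapsed expression.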
Computed Properties: Several properties are computed automatically when accessed:
gene_ids - Character vector of gene identifiers (rownames of expression matrix)
identities - Character vector of identity labels (colnames of collapsed expression)
sample_names - Character vector of sample names (colnames of expression matrix)
num_identities - Integer count of unique identities
num_samples - Integer count of total samples
num_genes - Integer count of genes
num_strata - Integer count of phylostrata
index_full_name - Full name of the transcriptomic index type
group_map - List mapping identity names to sample names
TXI - Numeric vector of TXI values for each identity (computed from collapsed expression)
TXI_sample - Numeric vector of TXI values for each sample (computed from raw expression)
null_conservation_txis - Matrix of null conservation TXI values for statistical testing
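To make a few of these properties concrete, here is a base R sketch that derives equivalents of sample_names, gene_ids, group_map and TXI_sample from plain inputs. It is not the package's implementation. In particular, the function txi is a hypothetical helper, and it assumes the index is the expression-weighted mean of the strata values, as in the usual definition of the transcriptome age index; the toy values are made up.

```r
# Toy inputs (values made up): 3 genes x 4 samples.
expression <- matrix(
  c(10, 12, 20, 22,
     5,  7,  1,  3,
     0,  2,  8, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3", "s4"))
)
strata_values <- c(1, 2, 3)  # one value per gene (row)
groups <- factor(c("early", "early", "late", "late"),
                 levels = c("early", "late"))

# Simple lookups that mirror the property descriptions.
gene_ids     <- rownames(expression)
sample_names <- colnames(expression)
num_genes    <- nrow(expression)
num_samples  <- ncol(expression)

# List mapping each identity to the names of its samples.
group_map <- split(sample_names, groups)

# Hypothetical helper: expression-weighted mean of strata values per column.
# ASSUMPTION: this is the standard TAI-style formula, not confirmed against
# the package source.
txi <- function(expression, strata_values) {
  # strata_values is recycled down each column, so row i is scaled by
  # strata_values[i].
  colSums(expression * strata_values) / colSums(expression)
}

TXI_sample <- txi(expression, strata_values)
```

Under the same assumption, a per-identity index would come from applying txi to a matrix that has already been collapsed to one column per identity.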
Validation: The class ensures consistency between expression data, phylostratum assignments, and groupings. All gene-level vectors must have matching lengths, and sample groupings must be consistent.
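The kind of consistency check described here can be sketched in base R as follows. This is only an illustration: check_inputs is a hypothetical function, and the specific rules it applies (one stratum and one strata value per gene, one group entry per sample) are an assumed reading of the sentence above rather than the class's actual validator.

```r
# Hypothetical checker (not part of the package). Returns a character vector
# of problems; an empty vector means the inputs look consistent.
check_inputs <- function(strata, strata_values, expression, groups) {
  problems <- character(0)
  if (length(strata) != nrow(expression)) {
    problems <- c(problems, "strata must have one entry per gene (row)")
  }
  if (length(strata_values) != nrow(expression)) {
    problems <- c(problems, "strata_values must have one entry per gene (row)")
  }
  if (length(groups) != ncol(expression)) {
    problems <- c(problems, "groups must have one entry per sample (column)")
  }
  problems
}

# Toy inputs (values made up): 3 genes x 4 samples.
expression <- matrix(1:12, nrow = 3,
                     dimnames = list(c("g1", "g2", "g3"),
                                     c("s1", "s2", "s3", "s4")))
strata <- factor(c(1, 2, 3))
strata_values <- c(1, 2, 3)
groups <- factor(c("early", "early", "late", "late"))

ok  <- check_inputs(strata, strata_values, expression, groups)
# Dropping one group entry makes the sample grouping inconsistent.
bad <- check_inputs(strata, strata_values, expression, groups[1:3])
```

Returning a character vector of messages, with an empty vector meaning success, follows the convention S7 validators use.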