R/motif_enrichment_multi_promotor_seqs.R
motif_enrichment_multi_promotor_seqs.Rd
Compares the number of motif occurrences in a set of real (non-randomly selected) gene promoter sequences with the number found in randomly sampled gene promoter sequences across a set of subject genomes. The resulting counts are then used to statistically assess whether certain motifs are enriched in the real sequences relative to the randomly sampled ones.
motif_enrichment_multi_promotor_seqs(
blast_tbl,
subject_genomes,
annotation_files,
annotation_format = "gff",
test = "fisher",
alternative = "two.sided",
interval_width,
motifs,
max.mismatch = 0,
min.mismatch = 0,
...
)
a blast_table.
a character vector storing the file paths to the subject genomes that shall be used as subject references.
a character vector storing the file paths to the subject annotation files in .gff format that match the subject genomes.
the annotation format. Options are:
annotation_format = "gff"
test = "fisher": Fisher's Exact Test for Count Data (see stats::fisher.test for details).
indicates the alternative hypothesis and must be one of "two.sided", "greater" or "less". You can specify just the initial letter. Only used in the 2 by 2 case.
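To illustrate how the test and alternative arguments interact, the following minimal sketch runs stats::fisher.test on a 2 by 2 table of motif counts. The counts are hypothetical and chosen only for illustration, and the table layout (real versus random sequences, motif present versus absent) is an assumption about how such a comparison could be arranged, not a description of what this function builds internally.

```r
# Hypothetical counts, for illustration only (not taken from any real data set).
counts <- matrix(
  c(30, 70,    # real promoter sequences:   motif present, motif absent
    10, 90),   # random promoter sequences: motif present, motif absent
  nrow = 2,
  byrow = TRUE,
  dimnames = list(
    set   = c("real", "random"),
    motif = c("present", "absent")
  )
)

# Two-sided test: is the motif frequency different between the two sets?
res_two_sided <- fisher.test(counts, alternative = "two.sided")

# One-sided test: is the motif more frequent in the real sequences?
res_greater <- fisher.test(counts, alternative = "greater")

print(res_two_sided$p.value)
print(res_greater$p.value)
```

With alternative = "greater" the test only asks whether the odds of finding the motif are higher in the first row (here the real sequences), which is usually the question of interest when testing for enrichment.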
total number of sequences that shall be sampled per subject genome.
a character vector storing (case sensitive) motif sequences for which abundance in the sampled sequences shall be assessed.
maximum number of mismatches that are allowed between the sequence motif and the matching region in the sampled sequence.
minimum number of mismatches that are allowed between the sequence motif and the matching region in the sampled sequence.
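The effect of max.mismatch and min.mismatch can be sketched in base R as follows. The helper count_motif_hits is hypothetical and written only for illustration; it is not part of this package and is an assumption about how mismatch-tolerant, case-sensitive motif counting behaves, not the package's actual implementation.

```r
# Hypothetical helper (not part of the package): counts the windows of a
# sequence whose number of mismatches to the motif lies between
# min_mismatch and max_mismatch (inclusive). Comparison is case sensitive.
count_motif_hits <- function(motif, sequence, max_mismatch = 0, min_mismatch = 0) {
  m <- nchar(motif)
  n <- nchar(sequence)
  if (m == 0 || n < m) {
    return(0L)
  }
  motif_chars <- strsplit(motif, "")[[1]]
  seq_chars <- strsplit(sequence, "")[[1]]
  # Slide a window of motif length over the sequence and count the
  # mismatching positions in every window.
  starts <- seq_len(n - m + 1)
  mismatches <- vapply(
    starts,
    function(i) sum(seq_chars[i:(i + m - 1)] != motif_chars),
    integer(1)
  )
  sum(mismatches >= min_mismatch & mismatches <= max_mismatch)
}

example_seq <- "ACGTTTACGAACGT"

exact_hits   <- count_motif_hits("ACGT", example_seq)                    # exact matches only
relaxed_hits <- count_motif_hits("ACGT", example_seq, max_mismatch = 1)  # up to one mismatch
lower_hits   <- count_motif_hits("acgt", example_seq)                    # lower-case motif
```

In this sketch, raising max_mismatch can only keep or increase the number of hits, while a lower-case motif finds no exact match in an upper-case sequence because the comparison is case sensitive.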
additional arguments passed to motif_compare.