Compare the number of motifs found in a set of real (non-random) sequences with the number found in randomly sampled genomic sequences, across a set of subject genomes. The resulting counts are then used to statistically assess whether certain motifs are enriched in the real sequences relative to the randomly sampled genomic sequences.

motif_enrichment_multi(
  blast_tbl,
  subject_genomes,
  test = "fisher",
  alternative = "two.sided",
  size,
  interval_width,
  motifs,
  max.mismatch = 0,
  min.mismatch = 0,
  ...
)

Arguments

blast_tbl

a blast_table.

subject_genomes

a character vector of file paths to the genomes that shall be used as subject references.

test

the statistical test used to assess motif enrichment.
  • test = "fisher": Fisher's Exact Test for Count Data (see stats::fisher.test for details).

alternative

indicates the alternative hypothesis and must be one of "two.sided", "greater" or "less". You can specify just the initial letter. Only used in the 2 by 2 case.
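
To illustrate how test = "fisher" and alternative interact, the following minimal sketch runs stats::fisher.test on a 2 by 2 table directly. The counts are hypothetical and the table layout (real versus random sequences, motif present versus absent) is an assumption made for illustration; it is not necessarily how motif_enrichment_multi builds its contingency table internally.

```r
# Hypothetical counts: sequences with at least one motif hit versus none.
counts <- matrix(
  c(30, 70,   # real sequences:   motif present, motif absent
    12, 88),  # random sequences: motif present, motif absent
  nrow = 2, byrow = TRUE,
  dimnames = list(
    set   = c("real", "random"),
    motif = c("present", "absent")
  )
)

# alternative = "greater" asks whether the odds of carrying the motif
# are higher in the real set than in the random set (odds ratio > 1).
res <- stats::fisher.test(counts, alternative = "greater")

res$p.value   # one-sided p-value
res$estimate  # conditional maximum likelihood estimate of the odds ratio
```

With alternative = "two.sided" the same table would instead be tested for a difference in either direction.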

size

total number of sequences that shall be sampled per subject genome.

interval_width

length of the sequence in which motifs shall be detected.
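
The roles of size and interval_width can be pictured with a small base R sketch that draws random intervals from a single sequence. The helper sample_intervals and the genome length are hypothetical, and uniform sampling of start positions is an assumption; the sampling scheme actually used by the package is not described on this page.

```r
# Hypothetical helper: draw `size` random intervals of `interval_width`
# bases from a sequence of `genome_length` bases (1-based, inclusive).
sample_intervals <- function(genome_length, size, interval_width) {
  stopifnot(interval_width >= 1, interval_width <= genome_length)
  # Valid start positions leave room for a full-width interval.
  n_starts <- genome_length - interval_width + 1
  starts <- sample.int(n_starts, size, replace = TRUE)
  data.frame(start = starts, end = starts + interval_width - 1)
}

set.seed(1)  # for reproducibility of the illustration only
loci <- sample_intervals(genome_length = 1e6, size = 5, interval_width = 200)
loci
```

Each row describes one sampled interval; every interval has exactly interval_width bases.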

motifs

a character vector of case-sensitive motif sequences whose abundance in the sampled sequences shall be assessed.

max.mismatch

the maximum number of mismatching letters allowed (see Biostrings::matchPattern for details).

min.mismatch

the minimum number of mismatching letters allowed (see Biostrings::vcountPattern for details).
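
The effect of the two mismatch bounds can be sketched in base R. The helper count_motif below is hypothetical and written only for illustration: it slides the motif along a sequence and counts windows whose number of mismatching letters lies between min.mismatch and max.mismatch. It is not the Biostrings implementation and ignores features such as indels and ambiguity codes.

```r
# Hypothetical helper: count windows of `seq` that match `motif` with a
# number of mismatching letters in [min.mismatch, max.mismatch].
count_motif <- function(seq, motif, max.mismatch = 0, min.mismatch = 0) {
  seq_chars   <- strsplit(seq, "", fixed = TRUE)[[1]]
  motif_chars <- strsplit(motif, "", fixed = TRUE)[[1]]
  n_windows   <- length(seq_chars) - length(motif_chars) + 1
  if (n_windows < 1) return(0L)

  hits <- 0L
  for (start in seq_len(n_windows)) {
    window     <- seq_chars[start:(start + length(motif_chars) - 1)]
    mismatches <- sum(window != motif_chars)  # case-sensitive comparison
    if (mismatches >= min.mismatch && mismatches <= max.mismatch) {
      hits <- hits + 1L
    }
  }
  hits
}

count_motif("ACGTACGAACGT", "ACGT")                                      # 2
count_motif("ACGTACGAACGT", "ACGT", max.mismatch = 1)                    # 3
count_motif("ACGTACGAACGT", "ACGT", max.mismatch = 1, min.mismatch = 1)  # 1
```

In this toy sequence the motif occurs exactly twice, and the window ACGA differs from it by one letter, so allowing one mismatch raises the count to three.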

...

additional arguments passed to motif_enrichment.

Author

Hajk-Georg Drost