This function generates random sequences that share the alphabet and word length of an input sequence, using a multinomial model.
randomSeqs(seq, sample_size)
seq | a character string storing the sequence for which random sequences shall be computed. |
---|---|
sample_size | a numeric value specifying the number of random sequences that shall be returned. |
a character vector of length sample_size storing the random sequences as strings.
This function enables you to create a test statistic for sequence comparisons that are based on a multinomial model assumption. When the sample_size argument is specified, a character vector containing sample_size random strings is returned.
The random strings returned by randomSeqs have the same word length and are drawn from the same alphabet as the input sequence.
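The behaviour described above can be illustrated with a minimal sketch. It is not the implementation of randomSeqs: the helper name random_seqs_sketch is hypothetical, and the sketch assumes that letters are drawn independently with replacement, using the letter frequencies observed in the input sequence as the multinomial probabilities. The actual function may weight the letters differently.

```r
# Hypothetical helper, not part of the package: a sketch of multinomial sampling.
random_seqs_sketch <- function(seq, sample_size) {
  # Split the input string into single letters.
  letters_in_seq <- strsplit(seq, split = "")[[1]]

  # Assumption: observed letter frequencies act as the multinomial probabilities.
  freqs    <- table(letters_in_seq)
  alphabet <- names(freqs)
  probs    <- as.numeric(freqs) / length(letters_in_seq)

  # Draw sample_size strings, each with the same length as the input.
  vapply(
    seq_len(sample_size),
    function(i) {
      drawn <- sample(alphabet, size = length(letters_in_seq),
                      replace = TRUE, prob = probs)
      paste(drawn, collapse = "")
    },
    character(1)
  )
}

set.seed(1)  # only to make the sketch reproducible
res <- random_seqs_sketch("ACCTGGAATTC", sample_size = 10)
```

Under these assumptions, res is a character vector of length 10 whose elements each have 11 letters drawn from A, C, G and T.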
# a nucleotide example
seq_example <- "ACCTGGAATTC"
randomSeqs(seq = seq_example, sample_size = 10)
#> [1] "TCGCGCACTCG" "AAGGCGTAGAG" "CCCGCCCGCTA" "GAGTTGTACCT" "TTAGTGGCTAC"
#> [6] "TTAGAGCTAGT" "TTACAATTAGA" "CTCCGTTATCA" "CTTGCGATTCA" "TATCCGCTGGT"

# a protein example
seq_example <- "NPPAAM"
randomSeqs(seq = seq_example, sample_size = 10)
#> [1] "MAAANP" "AAMAPA" "PNAAAA" "MPNNPM" "PANANA" "ANNMPA" "PNANAA" "ANPAPN"
#> [9] "NMPPAP" "MPNPNP"