This function generates random sequences that share the alphabet and word length of an input sequence, assuming a multinomial model.

randomSeqs(seq, sample_size)

Arguments

seq

a character string storing the sequence for which random sequences shall be computed.

sample_size

a numeric value specifying the number of random sequences that shall be returned.

Value

a character vector of length sample_size storing the random string objects.

Details

This function enables you to build a test statistic for sequence comparisons that rest on a multinomial model assumption. The sample_size argument determines how many random strings are generated: the result is a character vector of length sample_size.

The random strings returned by randomSeqs have the same word length and are drawn from the same alphabet as the input sequence.
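
The behaviour described above can be illustrated with a minimal sketch. It is not the implementation of randomSeqs. It assumes that each position is drawn independently, with replacement, from the letters of the input sequence, with probabilities equal to the observed letter frequencies. The names random_seqs_sketch and match_count, the reference sequence, and the matching-position statistic are all hypothetical and serve only to show how a null distribution might be built.

```r
# Hypothetical helper, NOT the package's randomSeqs implementation.
# Assumption: positions are sampled independently (with replacement) from the
# input alphabet, weighted by the observed letter frequencies.
random_seqs_sketch <- function(seq, sample_size) {
  stopifnot(is.character(seq), length(seq) == 1, nchar(seq) > 0)

  # Split the input string into single characters
  chars <- strsplit(seq, "", fixed = TRUE)[[1]]

  # Alphabet and relative letter frequencies of the input sequence
  freqs    <- table(chars)
  alphabet <- names(freqs)
  probs    <- as.numeric(freqs) / length(chars)

  # Draw sample_size strings of the same length as the input
  vapply(
    seq_len(sample_size),
    function(i) {
      paste(
        sample(alphabet, size = length(chars), replace = TRUE, prob = probs),
        collapse = ""
      )
    },
    character(1)
  )
}

# Usage: an empirical null distribution for a hypothetical statistic
set.seed(42)  # only to make the illustration reproducible
query     <- "ACCTGGAATTC"
reference <- "ACCTGGTATTC"  # hypothetical comparison sequence
null_seqs <- random_seqs_sketch(query, sample_size = 1000)

# Hypothetical statistic: number of positions at which two strings agree
match_count <- function(a, b) {
  sum(strsplit(a, "", fixed = TRUE)[[1]] == strsplit(b, "", fixed = TRUE)[[1]])
}

null_stat <- vapply(null_seqs, match_count, integer(1), b = reference)
observed  <- match_count(query, reference)

# Empirical p-value with the usual +1 correction
p_value <- (sum(null_stat >= observed) + 1) / (length(null_stat) + 1)
```

Sampling with replacement is consistent with the example output further down, where random strings can contain a letter more often than the input does. Whether randomSeqs weights letters by their observed frequency or draws them uniformly is not stated on this page, so the frequency weighting here is an assumption.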

Examples

# a nucleotide example
seq_example <- "ACCTGGAATTC"
randomSeqs(seq = seq_example, sample_size = 10)
#>  [1] "TCGCGCACTCG" "AAGGCGTAGAG" "CCCGCCCGCTA" "GAGTTGTACCT" "TTAGTGGCTAC"
#>  [6] "TTAGAGCTAGT" "TTACAATTAGA" "CTCCGTTATCA" "CTTGCGATTCA" "TATCCGCTGGT"

# a protein example
seq_example <- "NPPAAM"
randomSeqs(seq = seq_example, sample_size = 10)
#>  [1] "MAAANP" "AAMAPA" "PNAAAA" "MPNNPM" "PANANA" "ANNMPA" "PNANAA" "ANPAPN"
#>  [9] "NMPPAP" "MPNPNP"