This function quantifies the statistical significance of a given pairwise alignment
between a query and a subject sequence, based on a sampled score distribution returned by randSeqDistr.
evalAlignment( seq, subject, sample_size, FUN, ..., fit_distr = "norm", gof = FALSE, comp_cores = 1 )
| Argument | Description |
|---|---|
| seq | a character vector storing a sequence as string for which random sequences shall be computed. |
| subject | a character vector storing a subject sequence as string to which seq shall be aligned. |
| sample_size | a numeric value specifying the number of random sequences that shall be returned. |
| FUN | a pairwise alignment function such as Biostrings::pairwiseAlignment. |
| ... | additional arguments that shall be passed to FUN. |
| fit_distr | a character string specifying the probability distribution that shall be fitted to the histogram of scores returned by randSeqDistr. |
| gof | a logical value specifying whether or not goodness of fit measures shall be printed to the console. |
| comp_cores | a numeric value specifying the number of cores you want to use for multicore processing. |
Returns a p-value quantifying the statistical significance of the pairwise alignment of the input sequences.
The test statistic is derived by moment matching estimation: a given probability distribution is
fitted to the alignment score vector returned by randSeqDistr. The corresponding distribution
parameters are estimated by fitdist (from the fitdistrplus package), and the resulting p-value, which
quantifies the statistical significance of the pairwise alignment of the input sequences, is returned.
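
As a rough illustration of the moment matching idea for fit_distr = "norm", the sketch below uses base R only. The score values are made up, the variable names are hypothetical, and the use of the upper tail and of the population standard deviation are assumptions. It is not the internal code of evalAlignment, which relies on fitdist.

```r
# Hypothetical inputs: alignment scores against randomized sequences
# (values invented for illustration) and the score of the real alignment.
random_scores  <- c(-12, -8, -15, -10, -9, -11, -14, -7, -13, -10)
observed_score <- -2

# Method-of-moments estimates for a normal distribution:
# the sample mean and the population (1/n) standard deviation.
mu_hat <- mean(random_scores)
sd_hat <- sqrt(mean((random_scores - mu_hat)^2))

# Assumed upper-tail p-value: probability that a random alignment
# scores at least as high as the observed alignment.
p_value <- pnorm(observed_score, mean = mu_hat, sd = sd_hat, lower.tail = FALSE)
print(p_value)
```

With only ten random scores the fitted parameters are noisy. A larger sample_size should give a more stable estimate.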
The following distributions can be fitted to the alignment score distribution:
A special case is fit_distr = "simple". Here no distribution is fitted: the p-value is simply the relative
frequency of random scores that are greater than the real alignment score.
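
The "simple" case can be sketched in base R as follows. The scores and variable names are hypothetical and only illustrate the relative-frequency calculation described above, not the function's internal code.

```r
# Hypothetical inputs: scores from randomized sequences (invented values)
# and the score of the real alignment.
random_scores  <- c(-12, -8, -15, -10, -9, -11, -14, -7, -13, -10)
observed_score <- -9

# Relative frequency of random scores strictly greater than the real score.
p_simple <- mean(random_scores > observed_score)
print(p_simple)  # 0.2 (two of the ten random scores, -8 and -7, exceed -9)
```

The smallest nonzero value this estimate can take is 1 / sample_size, so a small sample limits how small a p-value can be reported.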
seq_example <- "MEDQVGFGF"
subject_example <- "AYAIDPTPAF"

# evaluate alignment
p_val_align <- evalAlignment(seq_example, subject_example, 10,
                             Biostrings::pairwiseAlignment, scoreOnly = TRUE,
                             fit_distr = "norm", comp_cores = 1)