evalAlignment: Statistical Evaluation of a given Pairwise Alignment


View source: R/evalAlignment.R

Description

This function quantifies the statistical significance of a given pairwise alignment between a query and a subject sequence, based on a sampled score distribution returned by randSeqDistr.

Usage

evalAlignment(
  seq,
  subject,
  sample_size,
  FUN,
  ...,
  fit_distr = "norm",
  gof = FALSE,
  comp_cores = 1
)

Arguments

seq

a character vector storing the query sequence as a string, for which random sequences shall be computed.

subject

a character vector storing the subject sequence as a string, to which seq shall be pairwise aligned.

sample_size

a numeric value specifying the number of random sequences that shall be sampled to build the score distribution.

FUN

a pairwise alignment function such as pairwiseAlignment or any other function that takes sequence arguments as first and second input.

...

additional arguments that shall be passed to FUN.

fit_distr

a character string specifying the probability distribution that shall be fitted to the histogram of scores returned by randSeqDistr. See fitdist with method = "mme" for details. A special case is fit_distr = "simple": in this case the p-value is simply the relative frequency of random scores that are greater than the real alignment score.

gof

a logical value specifying whether or not goodness-of-fit measures shall be printed to the console.

comp_cores

a numeric value specifying the number of cores to use for multicore processing.

Details

The test statistic is derived by moment matching estimation: a given probability distribution is fitted to the alignment score vector returned by randSeqDistr. The corresponding distribution parameters are estimated by fitdist, and the p-value quantifying the statistical significance of the pairwise alignment of the input sequences is returned.
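To illustrate the idea, here is a minimal base R sketch of moment matching for a normal distribution. It is not the package's implementation: the simulated random_scores vector and the real_score value are hypothetical stand-ins for the output of randSeqDistr and the score of the real alignment, and the use of the upper tail for the p-value is an assumption based on the description of fit_distr = "simple".

```r
# Illustrative sketch only; all values below are hypothetical.
set.seed(42)
random_scores <- rnorm(1000, mean = -20, sd = 5)  # stand-in for random alignment scores
real_score <- -5                                  # stand-in for the real alignment score

# Moment matching for a normal distribution: equate the first two
# sample moments with the theoretical mean and variance.
mu_hat <- mean(random_scores)
sd_hat <- sqrt(mean((random_scores - mu_hat)^2))

# Assumed upper-tail p-value: probability that a random alignment
# scores higher than the real alignment under the fitted distribution.
p_value <- pnorm(real_score, mean = mu_hat, sd = sd_hat, lower.tail = FALSE)
p_value
```

A small p-value here means the real score lies far in the upper tail of the fitted distribution of random scores.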

The following distributions can be fitted to the alignment score distribution:

A special case is fit_distr = "simple": in this case the p-value is simply the relative frequency of random scores that are greater than the real alignment score.
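The relative-frequency p-value described for fit_distr = "simple" can be sketched in base R as follows. The score values are made up for illustration and do not come from randSeqDistr; the sketch only shows the arithmetic, not the package's internal code.

```r
# Illustrative sketch only; the scores below are hypothetical.
random_scores <- c(-12, -8, -15, -3, -10, -7, -20, -5, -9, -11)
real_score <- -6

# Fraction of random scores strictly greater than the real alignment score.
p_simple <- mean(random_scores > real_score)
p_simple  # 0.2, since 2 of the 10 random scores (-3 and -5) exceed -6
```

With this approach the smallest attainable non-zero p-value is 1 / sample_size, so small sample sizes give only a coarse estimate.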

Value

a p-value quantifying the statistical significance of the pairwise alignment of the input sequences.

Author(s)

Hajk-Georg Drost

See Also

randSeqDistr, randomSeqs, fitdist

Examples

seq_example <- "MEDQVGFGF"
subject_example <- "AYAIDPTPAF"
# evaluate alignment
p_val_align <- evalAlignment(seq_example, subject_example, 10,
                             Biostrings::pairwiseAlignment,
                             scoreOnly=TRUE, fit_distr = "norm",
                             comp_cores = 1)

HajkD/seqstats documentation built on April 26, 2020, 8:03 p.m.