Statistical Patterns in Genomic Sequences
Provides functions for exploring and testing statistical properties and patterns in DNA sequences.
License: GPL (>= 2)
This package provides a range of statistical tests for various properties of DNA and/or other genomic sequences. There are eight groups of functions:
- Testing for Chargaff's second parity rule in bacteria and other DNA sequences
- Testing for purine-pyrimidine parity in viruses and other DNA sequences
- Testing for Bernoulli/Markov processes
- Independence tests
- Tests for uniform distribution
- Simulation of random vectors, stochastic matrices, Bernoulli processes and Markov chains
- Functions for obtaining the complement or reverse complement of a DNA sequence
- Functions for counting words/k-mers and cylinders in symbolic sequences
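To make the first and seventh groups concrete, here is a minimal Python sketch. It is not this package's R API, and the names `reverse_complement`, `kmer_frequencies` and `strand_symmetry_gap` are hypothetical. Chargaff's second parity rule says that, within a single strand, a short word and its reverse complement occur with roughly equal frequency. The sketch only computes a descriptive gap between those frequencies; it is not one of the statistical tests the package provides.

```python
from collections import Counter

# Watson-Crick base pairing; input is upper-cased before translation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def kmer_frequencies(seq: str, k: int) -> dict:
    """Relative frequencies of the overlapping k-mers in seq."""
    if k < 1 or k > len(seq):
        raise ValueError("k must satisfy 1 <= k <= len(seq)")
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def strand_symmetry_gap(seq: str, k: int = 1) -> float:
    """Largest |freq(w) - freq(reverse complement of w)| over observed k-mers.

    Values near 0 are consistent with Chargaff's second parity rule.
    This is a descriptive summary only, not a hypothesis test.
    """
    freqs = kmer_frequencies(seq, k)
    return max(
        abs(f - freqs.get(reverse_complement(word), 0.0))
        for word, f in freqs.items()
    )
```

For example, `strand_symmetry_gap("ACGT")` is 0.0 because every base is exactly as frequent as its complement, whereas `strand_symmetry_gap("AAAA")` is 1.0 because `T` never appears.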
The word/k-mer counting functions are general and can deal with arbitrary symbolic sequences, not only DNA sequences.
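The idea of counting words in an arbitrary symbolic sequence can be sketched in Python as follows. This is an illustration only, not the package's R interface: the function name `count_kmers` and its `circular` option are assumptions of the sketch.

```python
from collections import Counter
from typing import Hashable, Sequence


def count_kmers(symbols: Sequence[Hashable], k: int, circular: bool = False) -> Counter:
    """Count overlapping words of length k in any sequence of hashable symbols.

    With circular=True the sequence is treated as a ring, so words may wrap
    around from the end back to the start (an assumption of this sketch).
    """
    n = len(symbols)
    if k < 1 or k > n:
        raise ValueError("k must satisfy 1 <= k <= len(symbols)")
    items = list(symbols)
    if circular:
        # Append the first k-1 symbols so that wrapped words are counted too.
        items = items + items[:k - 1]
    # Words are stored as tuples so that non-character symbols work as well.
    return Counter(tuple(items[i:i + k]) for i in range(len(items) - k + 1))
```

The same function handles a DNA string such as `count_kmers("ACGTAC", 2)` and a non-DNA sequence such as `count_kmers(["up", "down", "up"], 1)`, since it relies only on the symbols being hashable.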
Functions which take a DNA sequence as input are able to work with sequences stored as
SeqFastadna objects generated by the seqinr package.
Andrew Hart and Servet Martínez
Maintainer: Andrew Hart <firstname.lastname@example.org>
Hart, A.G. and Martínez, S. (2011) Statistical testing of Chargaff's second parity rule in bacterial genome sequences. Stoch. Models 27(2), 1-46.
Hart, A.G. and Martínez, S. (2014) Markovianness and Conditional Independence in Annotated Bacterial DNA. Stat. Appl. Genet. Mol. Biol. 13(6), 693-716. arXiv:1311.4411 [q-bio.QM].
Hart, A.G. and Martínez, S. (2012) A Gibbs approach to Chargaff's second parity rule. J. Stat. Phys. 146(2), 408-422. arXiv:1105.0685 [math.PR].