Test of CSPR for Dinucleotides Under Gibbs Distribution

Description

Performs a test of Chargaff's second parity rule (CSPR) for dinucleotides under a Gibbsian assumption on the DNA sequence, which was proposed in Hart and Martínez (2012).

Usage

chargaff.gibbs.test(x, maxLag=200)

Arguments

x

either a character vector representing a DNA sequence in which each element contains a single nucleotide, or a DNA sequence stored using the SeqFastadna class from the seqinr package.

maxLag

The maximum number of lags (cylinder lengths) to use in computing variances. The default value is 200.
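
If a sequence is held as a single string rather than one nucleotide per element, it would need to be split before being passed as x. The following is a minimal base R sketch; the names dna_string and x are illustrative, and converting to lowercase is an assumption made only to match the lowercase states used in the Examples section.

```r
# Hypothetical input: a DNA sequence held as one string
dna_string <- "ACGTTGCA"

# Split into a character vector with one nucleotide per element
x <- strsplit(tolower(dna_string), split = "")[[1]]

# Number of nucleotides in the resulting vector
length(x)
```

A vector built this way has the single-nucleotide-per-element form described for x; a sequence used for an actual test would be far longer than this toy string.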

Details

This function performs a test of Chargaff's second parity rule for dinucleotides assuming the DNA sequence was generated by a Gibbs distribution. Under the null hypothesis, the test statistic eta is asymptotically chi-squared on 5 degrees of freedom.
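
As an illustration of the asymptotic distribution stated above, a p-value can be obtained from a chi-squared distribution on 5 degrees of freedom. This sketch assumes the p-value is the upper-tail probability, which is the usual convention for chi-squared tests but is not spelled out on this page; the value of eta below is made up for illustration.

```r
# Hypothetical value of the test statistic eta (not from real data);
# 11.07 is close to the 5% critical value of a chi-squared on 5 df
eta <- 11.07

# Upper-tail probability under a chi-squared distribution on 5 degrees of freedom
p_value <- pchisq(eta, df = 5, lower.tail = FALSE)
p_value
```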

The test is set up as follows:

H0: the sequence complies with CSPR for dinucleotides
H1: the sequence does not comply with CSPR for dinucleotides
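
Informally, CSPR for dinucleotides says that, within a single strand, each dinucleotide occurs about as often as its reverse complement (for example, "ac" and "gt"). The base R sketch below only illustrates that property on a toy sequence; it does not reproduce the test statistic of Hart and Martínez (2012), and the helper rev_comp and all variable names are hypothetical.

```r
# Reverse complement of a dinucleotide, e.g. "ac" -> "gt" (hypothetical helper)
rev_comp <- function(dinuc) {
  comp <- c(a = "t", c = "g", g = "c", t = "a")
  bases <- strsplit(dinuc, "")[[1]]
  paste(rev(comp[bases]), collapse = "")
}

# Toy sequence, one nucleotide per element (made-up data)
x <- strsplit("acgtacgtacgt", "")[[1]]

# Overlapping dinucleotides and their relative frequencies
dinucs <- paste0(head(x, -1), tail(x, -1))
tab <- table(dinucs)
freq <- setNames(as.numeric(tab) / length(dinucs), names(tab))

# Pair each observed dinucleotide with its reverse complement
partners <- vapply(names(freq), rev_comp, character(1))
partner_freq <- unname(freq[partners])
partner_freq[is.na(partner_freq)] <- 0  # reverse complement never observed

data.frame(dinucleotide = names(freq),
           freq = unname(freq),
           rev_comp = unname(partners),
           rev_comp_freq = partner_freq)
```

Under H0 the two frequency columns would be expected to agree up to sampling variation; the formal test quantifies whether the observed discrepancies are larger than chance would allow.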

Value

A list with class "htest" containing the following components:

statistic

the value of the test statistic.

p.value

the p-value of the test.

method

a character string indicating what type of test was performed.

data.name

a character string giving the name of the data.

FHat

the 5-element vector nF^ used in calculating the test statistic.

pairs

the stochastic matrix of dinucleotide counts used to derive nF^.

v

the asymptotic covariance matrix of nF^.

n

the length of the DNA sequence.

cutoff

the actual number of lags used by the algorithm to calculate covariances.

maxCutoff

the value specified for the maxLag parameter when the test was performed.

Author(s)

Andrew Hart and Servet Martínez

References

Hart, A.G. and Martínez, S. (2012) A Gibbs approach to Chargaff's second parity rule. J. Stat. Phys. 146(2), 408-422.

See Also

chargaff0.test, chargaff1.test, chargaff2.test, agct.test, ag.test

Examples

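#Load the package providing these functions and data (assumed here to be spgs)
library(spgs)

#Demonstration on a real genomic sequence (Nanoarchaeum equitans, an archaeon)
data(nanoarchaeum)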
chargaff.gibbs.test(nanoarchaeum)

#Simulate synthetic DNA sequence that does not satisfy Chargaff's second parity rule
trans.mat <- matrix(c(.4, .1, .4, .1, .2, .1, .6, .1, .4, .1, .3, .2, .1, .2, .4, .3), 
ncol=4, byrow=TRUE)
seq <- simulateMarkovChain(500000, trans.mat, states=c("a", "c", "g", "t"))
chargaff.gibbs.test(seq)
