A Test for First-Order Markovianness
Description
Performs a test for first-order Markovianness of a data series by inferring the sequence of i.i.d. U(0,1) random noise that might have generated it.
Usage
markov.test(x, type = c("lb", "ks"), method = "holm", lag = 20, ...)

Arguments
x 
the data series as a vector. 
type 
the procedures to use to test whether or not the disturbance series is independently and identically distributed on the unit interval. See ‘Details’. 
method 
the correction method to be used for adjusting the p-values. It is identical to the

lag 
the number of lags to use when applying the Ljung-Box (portmanteau) test (
... 
parameters to pass on to functions that can be subsequently called. 
Details
This function tests a symbolic sequence for first-order Markovianness (also known as the Markov property). It does this by reverse-engineering the sequence to obtain a sample of the kind of output from a pseudorandom number generator that would have produced the observed sequence if it had been generated by simulating a Markov chain. The sample output is then tested to see if it is an independent and identically distributed sequence of uniform numbers between 0 and 1. This involves the application of at least two tests, one for independence and another for uniformity over the unit interval. One concludes that the sequence is Markovian if the sample output passes the tests (that is, all null hypotheses are accepted) and non-Markovian otherwise.
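The reverse-engineering step described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the transition matrix P, the variable names and the use of ks.test and Box.test from the stats package in place of ks.unif.test and lb.test are all assumptions made for illustration. The idea is that a simulated chain moves from state i to state j when a uniform draw lands in the j-th sub-interval of the cumulative row i, so a uniform value drawn from that sub-interval is a plausible reconstruction of the noise.

```r
set.seed(1)

# Hypothetical 4-state chain, used only to produce example data
states <- c("a", "c", "g", "t")
P <- matrix(c(0.40, 0.20, 0.20, 0.20,
              0.10, 0.50, 0.20, 0.20,
              0.25, 0.25, 0.25, 0.25,
              0.30, 0.30, 0.20, 0.20),
            nrow = 4, byrow = TRUE, dimnames = list(states, states))
n <- 2000
x <- character(n)
x[1] <- sample(states, 1)
for (k in 2:n) x[k] <- sample(states, 1, prob = P[x[k - 1], ])

# Estimate the transition matrix from the observed one-step transitions
f <- factor(x, levels = states)
counts <- table(f[-n], f[-1])
P.hat <- unclass(prop.table(counts, 1))

# Each transition i -> j corresponds to the interval
# (lower[i, j], upper[i, j]] of the cumulative row i
upper <- t(apply(P.hat, 1, cumsum))
lower <- upper - P.hat

# Infer one uniform disturbance per transition by sampling inside its interval
idx <- cbind(as.integer(f[-n]), as.integer(f[-1]))
u <- runif(n - 1, min = lower[idx], max = upper[idx])

# Test the inferred noise for uniformity and for lack of autocorrelation
p.unif <- ks.test(u, "punif")$p.value
p.indep <- Box.test(u, lag = 20, type = "Ljung-Box")$p.value
c(uniformity = p.unif, independence = p.indep)
```

If the sequence really is first-order Markov, the inferred values in u should look like an i.i.d. U(0,1) sample, so neither p-value should be systematically small.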
The test is set up as follows:
H0: the sequence is first-order Markov
H1: the sequence is not first-order Markov
To simplify the use of the test, correction for multiple testing is carried out, which yields a single adjusted p-value. If this p-value is less than the significance level established for the test procedure, the null hypothesis of Markovianness is rejected. Otherwise, the null hypothesis should be accepted.
To correctly apply the test, use the type argument to specify at least one test of independence and one test of uniformity from the options displayed in the following table.
Category      Function           Test
Uniformity    ks.unif.test       Kolmogorov-Smirnov test for uniform(0,1) data
Uniformity    chisq.unif.test    Pearson's chi-squared test for discrete uniform data
Independence  lb.test            Ljung-Box Q test for uncorrelated data
Independence  diffsign.test      signed difference test of independence
Independence  turningpoint.test  turning point test of independence
Independence  rank.test          rank test of independence
If type is not specified, lb.test and ks.unif.test are used by default.
As this procedure performs multiple tests in order to assess if the sequence has a Markovian dependence structure, it is necessary to adjust the p-values for multiple testing. By default, the Holm-Bonferroni method (holm) is used to correct for multiple testing, but this can be overridden via the method argument. The adjusted p-values are displayed when the result of the test is printed.
The smallest adjusted p-value constitutes the overall p-value for the test. If this p-value is less than the significance level fixed for the test procedure, the null hypothesis of first-order Markovianness is rejected. Otherwise, the null hypothesis should be accepted.
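As a small worked illustration of the adjustment, the following sketch applies p.adjust from the stats package to two raw p-values. The p-values and the names lb and ks attached to them are hypothetical and are not output from markov.test.

```r
# Hypothetical raw p-values: one independence test and one uniformity test
p.raw <- c(lb = 0.03, ks = 0.20)

# Holm-Bonferroni: the smallest p-value is multiplied by 2, the next by 1
p.adj <- p.adjust(p.raw, method = "holm")
p.adj                      # lb = 0.06, ks = 0.20

# The overall p-value is the smallest adjusted p-value
overall <- min(p.adj)
overall < 0.05             # FALSE: H0 is not rejected at the 5% level
```

Note that the raw p-value of 0.03 would have fallen below 0.05 on its own; after adjustment the overall p-value is 0.06, so the null hypothesis of Markovianness is retained at the 5% level.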
Value
A list with class "multiplehtest" containing the following components:
method 
the character string “Composite test for a first-order (finite state) Markov chain”.
statistics 
the values of the test statistic for all the tests. 
parameters 
parameters for all the tests. Exactly one parameter is
recorded for each test, for example, 
p.values 
p-values of all the tests.
methods 
a vector of character strings indicating what type of tests were performed. 
adjusted.p.values 
the adjusted p-values.
data.name 
a character string giving the name of the data. 
adjust.method 
indicates which correction method was used to adjust the p-values for multiple testing.
estimate 
the transition matrix estimated to fit a first-order Markov chain to the data and used to generate the inferred random disturbance.
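The components listed above can be inspected directly on the returned list. The sketch below assumes that markov.test and simulateMarkovChain are provided by a package named spgs (the package name is not stated on this page) and only runs if that package is installed; the arguments are the defaults shown under Usage.

```r
# Assumption: the functions documented here live in the 'spgs' package
if (requireNamespace("spgs", quietly = TRUE)) {
  set.seed(42)
  x <- spgs::simulateMarkovChain(5000, matrix(0.25, 4, 4),
                                 states = c("a", "c", "g", "t"))
  res <- spgs::markov.test(x, type = c("lb", "ks"), method = "holm", lag = 20)

  print(res$p.values)                 # raw p-values, one per test
  print(res$adjusted.p.values)        # p-values after the Holm correction
  print(min(res$adjusted.p.values))   # overall p-value for the composite test
  print(res$estimate)                 # estimated transition matrix
}
```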
Note
Sometimes, a warning message advising that ties should not be present for the Kolmogorov-Smirnov test can arise when analysing long sequences. If you do receive this warning, it means that the results of the Kolmogorov-Smirnov test (ks.unif.test) should not be trusted. In this case, Pearson's chi-squared test (chisq.unif.test) should be used instead of the Kolmogorov-Smirnov test.
Author(s)
Andrew Hart and Servet Martínez
References
Hart, A.G. and Martínez, S. (2011) Statistical testing of Chargaff's second parity rule in bacterial genome sequences. Stoch. Models 27(2), 1–46.
Hart, A.G. and Martínez, S. (2014) Markovianness and Conditional Independence in Annotated Bacterial DNA. Stat. Appl. Genet. Mol. Biol. 13(6), 693–716. arXiv:1311.4411 [q-bio.QM].
See Also
diid.test, ks.unif.test, chisq.unif.test, diffsign.test, turningpoint.test, rank.test, lb.test
Examples
# Generate an IID uniform DNA sequence
seq <- simulateMarkovChain(5000, matrix(0.25, 4, 4), states=c("a","c","g","t"))
markov.test(seq)
