Data Driven Smooth Test for Normality

Description

Performs the data driven smooth test for the composite hypothesis of normality.

Usage

ddst.norm.test(x, base = ddst.base.legendre, c = 100, B = 1000, compute.p = F, 
    Dmax = 5, ...)

Arguments

x

a (non-empty) numeric vector of data values.

base

a function which returns an orthogonal system: either ddst.base.legendre for Legendre polynomials or ddst.base.cos for the cosine system; see package description.

c

a parameter for model selection rule, see package description.

B

an integer specifying the number of replicates used in p-value computation.

compute.p

a logical value indicating whether to compute a p-value.

Dmax

an integer specifying the maximum number of coordinates, only for advanced users.

...

further arguments.

Details

The null density is given by $f(z;\gamma)=\frac{1}{\sqrt{2\pi}\,\gamma_2}\exp\left(-\frac{(z-\gamma_1)^2}{2\gamma_2^2}\right)$ for $z \in R$.

We model alternatives as in Kallenberg and Ledwina (1997 a,b), using Legendre polynomials or the cosine basis. The parameter $\gamma=(\gamma_1,\gamma_2)$ is estimated by $\tilde\gamma=(\tilde\gamma_1,\tilde\gamma_2)$, where $\tilde\gamma_1=\frac{1}{n}\sum_{i=1}^{n} Z_i$ and $\tilde\gamma_2=\frac{1}{n-1}\sum_{i=1}^{n-1}\frac{Z_{n:i+1}-Z_{n:i}}{H_{i+1}-H_i}$, while $Z_{n:1}\le \dots \le Z_{n:n}$ are the ordered values of $Z_1,\dots,Z_n$ and $H_i=\phi^{-1}\left(\frac{i-3/8}{n+1/4}\right)$, cf. Chen and Shapiro (1995).

The above yields the auxiliary test statistic $W_k^*(\tilde\gamma)$, described in detail in Janic and Ledwina (2008) for the case when the Legendre basis is applied. The pertaining matrix $[I^*(\tilde\gamma)]^{-1}$ does not depend on $\tilde\gamma$. For the Legendre basis it is calculated for succeeding dimensions k using recurrence relations for Legendre polynomials; for the cosine basis it is computed numerically. In the implementation of $T^*$ the default value of c is set to 100. Therefore, in practice, $T^*$ is a Schwarz-type criterion. See Inglot and Ledwina (2006) as well as Janic and Ledwina (2008) for comments. The resulting data driven test statistic for normality is $W_{T^*}=W_{T^*}(\tilde\gamma)$.

For more details see: http://www.biecek.pl/R/ddst/description.pdf.
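As an illustration of the estimators in the Details above, the following is a minimal base-R sketch and not the package's internal code. It assumes that $\tilde\gamma_2$ is the average ratio of sample spacings to spacings of the normal scores $H_i$, and that $\phi^{-1}$ denotes the standard normal quantile function (qnorm). The function name gamma_tilde is hypothetical.

```r
# Sketch of the location and scale estimators from Details (assumed reading).
# gamma_tilde is a hypothetical helper, not part of the ddst package.
gamma_tilde <- function(z) {
  stopifnot(is.numeric(z), length(z) >= 2)
  n  <- length(z)
  zs <- sort(z)                                # Z_{n:1} <= ... <= Z_{n:n}
  H  <- qnorm((seq_len(n) - 3/8) / (n + 1/4))  # normal scores H_i
  g1 <- mean(z)                                # (1/n) * sum of Z_i
  g2 <- mean(diff(zs) / diff(H))               # (1/(n-1)) * sum of spacing ratios
  c(gamma1 = g1, gamma2 = g2)
}

# Usage: for a normal sample the estimates should be close to the
# true mean and standard deviation.
set.seed(1)
gamma_tilde(rnorm(200, mean = 5, sd = 2))
```

The scale estimate relies only on ordered spacings, so it is unaffected by shifting the data by a constant.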

Value

An object of class htest with the following components:

statistic

the value of the test statistic.

parameter

the number of chosen coordinates (k).

method

a character string indicating the parameters of the performed test.

data.name

a character string giving the name(s) of the data.

p.value

the p-value for the test, computed only if compute.p=T.
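The components listed above can be read from the returned object with the usual $ accessor. The snippet below is a small sketch that assumes the ddst package is installed; it is guarded so that it does nothing otherwise, and the variable name res is only illustrative.

```r
# Sketch: reading components of the returned htest object.
# Assumes the ddst package is installed; skipped otherwise.
if (requireNamespace("ddst", quietly = TRUE)) {
  set.seed(1)
  res <- ddst::ddst.norm.test(rnorm(80), compute.p = TRUE)
  print(res$statistic)  # value of the test statistic
  print(res$parameter)  # number of chosen coordinates (k)
  print(res$p.value)    # available because compute.p = TRUE
}
```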

Author(s)

Przemyslaw Biecek and Teresa Ledwina

References

Chen, L., Shapiro, S.S. (1995). An alternative test for normality based on normalized spacings. J. Statist. Comput. Simulation 53, 269–288.

Inglot, T., Ledwina, T. (2006). Towards data driven selection of a penalty function for data driven Neyman tests. Linear Algebra and its Appl. 417, 579–590.

Janic, A., Ledwina, T. (2008). Data-driven tests for a location-scale family revisited. J. Statist. Theory Pract., special issue on Modern Goodness of Fit Methods. Accepted.

Kallenberg, W.C.M., Ledwina, T. (1997 a). Data driven smooth tests for composite hypotheses: Comparison of powers. J. Statist. Comput. Simul. 59, 101–121.

Kallenberg, W.C.M., Ledwina, T. (1997 b). Data driven smooth tests when the hypothesis is composite. J. Amer. Statist. Assoc. 92, 1094–1104.

Examples

# for given vector of 19 numbers
z = c(13.41, 6.04, 1.26, 3.67, -4.54, 2.92, 0.44, 12.93, 6.77, 10.09, 
   4.10, 4.04, -1.97, 2.17, -5.38, -7.30, 4.75, 5.63, 8.84)
ddst.norm.test(z, compute.p=TRUE)

# H0 is true
z = rnorm(80)
ddst.norm.test(z, compute.p=TRUE)

# H0 is false
z = rexp(80,4)
ddst.norm.test(z, B=5000, compute.p=TRUE)