Perform a goodness-of-fit test to determine whether a data set appears to come from a specified probability distribution or whether two data sets appear to come from the same distribution.
gofTest(y, ...)
## S3 method for class 'formula'
gofTest(y, data = NULL, subset,
na.action = na.pass, ...)
## Default S3 method:
gofTest(y, x = NULL,
test = ifelse(is.null(x), "sw", "ks"),
distribution = "norm", est.arg.list = NULL,
alternative = "two.sided", n.classes = NULL,
cut.points = NULL, param.list = NULL,
estimate.params = ifelse(is.null(param.list), TRUE, FALSE),
n.param.est = NULL, correct = NULL, digits = .Options$digits,
exact = NULL, ws.method = "normal scores", warn = TRUE,
data.name = NULL, data.name.x = NULL, parent.of.data = NULL,
subset.expression = NULL, ...)

y 
an object containing data for the goodness-of-fit test. In the default
method, the argument 
data 
specifies an optional data frame, list or environment (or object coercible
by 
subset 
specifies an optional vector specifying a subset of observations to be used. 
na.action 
specifies a function which indicates what should happen when the data contain 
x 
numeric vector of values for the first sample in the case of a two-sample
Kolmogorov-Smirnov goodness-of-fit test ( 
test 
character string defining which goodness-of-fit test to perform. Possible values are:

distribution 
a character string denoting the distribution abbreviation. See the help file for
est.arg.list 
a list of arguments to be passed to the function estimating the distribution parameters.
For example, if 
alternative 
for the case when 
n.classes 
for the case when 
cut.points 
for the case when 
param.list 
for the case when 
estimate.params 
for the case when 
n.param.est 
for the case when 
correct 
for the case when 
digits 
for the case when 
exact 
for the case when 
ws.method 
for the case when 
warn 
logical scalar indicating whether to print a warning message when
observations with 
data.name 
character string indicating the name of the data used for argument 
data.name.x 
character string indicating the name of the data used for argument 
parent.of.data 
character string indicating the source of the data used for the goodness-of-fit test. 
subset.expression 
character string indicating the expression used to subset the data. 
... 
additional arguments affecting the goodness-of-fit test. 
Shapiro-Wilk Goodness-of-Fit Test (test="sw").
The Shapiro-Wilk goodness-of-fit test (Shapiro and Wilk, 1965; Royston, 1992a)
is one of the most commonly used goodness-of-fit tests for normality.
You can use it to test the following hypothesized distributions:
Normal, Lognormal, Three-Parameter Lognormal,
Zero-Modified Normal, or
Zero-Modified Lognormal (Delta).
In addition, you can use it to test the null hypothesis of any
continuous distribution that is available (see the help file for
Distribution.df, and see the explanation below).
Shapiro-Wilk W-Statistic and P-Value for Testing Normality
Let X denote a random variable with cumulative distribution function (cdf)
F. Suppose we want to test the null hypothesis that F is the cdf of
a normal (Gaussian) distribution with some arbitrary mean
μ and standard deviation σ against the alternative hypothesis
that F is the cdf of some other distribution. The table below shows the
random variable for which F is the assumed cdf, given the value of the
argument distribution.

Value of                                                  Random Variable for
distribution   Distribution Name                          which F is the cdf
"norm"         Normal                                     X
"lnorm"        Lognormal (Log-space)                      log(X)
"lnormAlt"     Lognormal (Untransformed)                  log(X)
"lnorm3"       Three-Parameter Lognormal                  log(X - γ)
"zmnorm"       Zero-Modified Normal                       X | X > 0
"zmlnorm"      Zero-Modified Lognormal (Log-space)        log(X) | X > 0
"zmlnormAlt"   Zero-Modified Lognormal (Untransformed)    log(X) | X > 0

Note that for the three-parameter lognormal distribution, the symbol γ denotes the threshold parameter.
Let \underline{x} = (x_1, x_2, …, x_n) denote the vector of
n ordered observations assumed to come from a normal
distribution.
The Shapiro-Wilk W-Statistic
Shapiro and Wilk (1965) introduced the following statistic to test
the null hypothesis that F is the cdf of a normal distribution:
W = \frac{(∑_{i=1}^n a_i x_i)^2}{∑_{i=1}^n (x_i - \bar{x})^2} \;\;\;\;\;\; (1)
where the quantity a_i is the i'th element of the vector \underline{a} defined by:
\underline{a} = \frac{\underline{m}^T V^{-1}}{[\underline{m}^T V^{-1} V^{-1} \underline{m}]^{1/2}} \;\;\;\;\;\; (2)
where T denotes the transpose operator, and \underline{m} is the vector of expected values and V is the variance-covariance matrix of the order statistics of a random sample of size n from a standard normal distribution. That is, the values of \underline{a} are the expected values of the standard normal order statistics weighted by their variance-covariance matrix, and normalized so that
\underline{a}^T \underline{a} = 1 \;\;\;\;\;\; (3)
It can be shown that the coefficients \underline{a} are antisymmetric, that is,
a_i = -a_{n-i+1} \;\;\;\;\;\; (4)
and for odd n,
a_{(n+1)/2} = 0 \;\;\;\;\;\; (5)
Now because
\bar{a} = \frac{1}{n} ∑_{i=1}^n a_i = 0 \;\;\;\;\;\; (6)
and
∑_{i=1}^n (a_i - \bar{a})^2 = ∑_{i=1}^n a_i^2 = \underline{a}^T \underline{a} = 1 \;\;\;\;\;\; (7)
the W-statistic in Equation (1) is the same as the square of the sample product-moment correlation between the vectors \underline{a} and \underline{x}:
W = r(\underline{a}, \underline{x})^2 \;\;\;\;\;\; (8)
where
r(\underline{x}, \underline{y}) = \frac{∑_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{[∑_{i=1}^n (x_i - \bar{x})^2 ∑_{i=1}^n (y_i - \bar{y})^2]^{1/2}} \;\;\;\;\;\; (9)
(see the R help file for cor).
The Shapiro-Wilk W-statistic is also simply the ratio of two estimators of variance, and can be rewritten as
W = \frac{\hat{σ}_{BLUE}^2}{\hat{σ}_{MVUE}^2} \;\;\;\;\;\; (10)
where the numerator is the square of the best linear unbiased estimate (BLUE) of the standard deviation, and the denominator is the minimum variance unbiased estimator (MVUE) of the variance:
\hat{σ}_{BLUE} = \frac{∑_{i=1}^n a_i x_i}{√{n-1}} \;\;\;\;\;\; (11)
\hat{σ}_{MVUE}^2 = \frac{∑_{i=1}^n (x_i - \bar{x})^2}{n-1} \;\;\;\;\;\; (12)
Small values of W indicate the null hypothesis is probably not true.
Shapiro and Wilk (1965) computed the values of the coefficients \underline{a}
and the percentage points for W (based on smoothing the empirical null
distribution of W) for sample sizes up to 50. Computation of the
W-statistic for larger sample sizes can be cumbersome, since computation of
the coefficients \underline{a} requires storage of at least
n + [n(n+1)/2] reals followed by n \times n matrix inversion
(Royston, 1992a).
The Shapiro-Francia W'-Statistic
Shapiro and Francia (1972) introduced a modification of the W-test that
depends only on the expected values of the order statistics (\underline{m})
and not on the variance-covariance matrix (V):
W' = \frac{(∑_{i=1}^n b_i x_i)^2}{∑_{i=1}^n (x_i - \bar{x})^2} \;\;\;\;\;\; (13)
where the quantity b_i is the i'th element of the vector \underline{b} defined by:
\underline{b} = \frac{\underline{m}}{[\underline{m}^T \underline{m}]^{1/2}} \;\;\;\;\;\; (14)
Several authors, including Ryan and Joiner (1973), Filliben (1975), and Weisberg and Bingham (1975), note that the W'-statistic is intuitively appealing because it is the squared Pearson correlation coefficient associated with a normal probability plot. That is, it is the squared correlation between the ordered sample values \underline{x} and the expected normal order statistics \underline{m}:
W' = r(\underline{b}, \underline{x})^2 = r(\underline{m}, \underline{x})^2 \;\;\;\;\;\; (15)
Shapiro and Francia (1972) present a table of empirical percentage points for W'
based on a Monte Carlo simulation. It can be shown that the asymptotic null
distributions of W and W' are identical, but convergence is very slow
(Verrill and Johnson, 1988).
The Weisberg-Bingham Approximation to the W'-Statistic
Weisberg and Bingham (1975) introduced an approximation of the Shapiro-Francia
W'-statistic that is easier to compute. They suggested using Blom scores
(Blom, 1958, pp. 68–75) to approximate the elements of \underline{m}:
\tilde{W}' = \frac{(∑_{i=1}^n c_i x_i)^2}{∑_{i=1}^n (x_i - \bar{x})^2} \;\;\;\;\;\; (16)
where the quantity c_i is the i'th element of the vector \underline{c} defined by:
\underline{c} = \frac{\underline{\tilde{m}}}{[\underline{\tilde{m}}^T \underline{\tilde{m}}]^{1/2}} \;\;\;\;\;\; (17)
and
\tilde{m}_i = Φ^{-1}[\frac{i - (3/8)}{n + (1/4)}] \;\;\;\;\;\; (18)
and Φ denotes the standard normal cdf. That is, the values of the
elements of \underline{m} in Equation (14) are replaced with their estimates
based on the usual plotting positions for a normal distribution.
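As a rough illustration of Equations (16) through (18), the following base R sketch computes the Weisberg-Bingham statistic directly from the Blom scores. The function name weisberg_bingham is hypothetical and is not part of EnvStats; gofTest may organize the computation differently.

```r
# Hypothetical helper (not an EnvStats function): Weisberg-Bingham
# approximation to the Shapiro-Francia W'-statistic.
weisberg_bingham <- function(x) {
  x <- sort(x)                                      # ordered observations
  n <- length(x)
  m_tilde <- qnorm((seq_len(n) - 3/8) / (n + 1/4))  # Blom scores, Equation (18)
  c_vec <- m_tilde / sqrt(sum(m_tilde^2))           # Equation (17)
  sum(c_vec * x)^2 / sum((x - mean(x))^2)           # Equation (16)
}

set.seed(47)
x <- rnorm(25, mean = 10, sd = 2)
w_tilde <- weisberg_bingham(x)

# The same value obtained as a squared correlation with the Blom scores;
# ppoints(n, a = 3/8) returns (i - 3/8) / (n + 1/4).
blom_scores <- qnorm(ppoints(25, a = 3/8))
r_squared <- cor(blom_scores, sort(x))^2
```

The two quantities agree because the Blom scores are symmetric about zero, so centering the data does not change the numerator of Equation (16).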
Royston's Approximation to the Shapiro-Wilk W-Test
Royston (1992a) presents an approximation for the coefficients \underline{a}
necessary to compute the Shapiro-Wilk W-statistic, and also a transformation
of the W-statistic that has approximately a standard normal distribution
under the null hypothesis.
Noting that, up to a constant, the components of \underline{b} in Equation (14) and \underline{c} in Equation (17) differ from those of \underline{a} in Equation (2) mainly in the first and last two components, Royston (1992a) used the approximation \underline{c} as the basis for approximating \underline{a} using polynomial (quintic) regression analysis. For 4 ≤ n ≤ 1000, the approximation gave the following equations for the last two (and hence first two) components of \underline{a}:
\tilde{a}_n = c_n + 0.221157 y - 0.147981 y^2 - 2.071190 y^3 + 4.434685 y^4 - 2.706056 y^5 \;\;\;\;\;\; (19)
\tilde{a}_{n-1} = c_{n-1} + 0.042981 y - 0.293762 y^2 - 1.752461 y^3 + 5.682633 y^4 - 3.582633 y^5 \;\;\;\;\;\; (20)
where
y = 1 / √{n} \;\;\;\;\;\; (21)
The other components are computed as:
\tilde{a}_i = \frac{\tilde{m}_i}{√{η}} \;\;\;\;\;\; (22)
for i = 2, …, n-1 if n ≤ 5, or i = 3, …, n-2 if n > 5, where
η = \frac{\underline{\tilde{m}}^T \underline{\tilde{m}} - 2 \tilde{m}_n^2}{1 - 2 \tilde{a}_n^2} \;\;\;\;\;\; (23)
if n ≤ 5, and
η = \frac{\underline{\tilde{m}}^T \underline{\tilde{m}} - 2 \tilde{m}_n^2 - 2 \tilde{m}_{n-1}^2}{1 - 2 \tilde{a}_n^2 - 2 \tilde{a}_{n-1}^2} \;\;\;\;\;\; (24)
if n > 5.
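The coefficient approximation in Equations (19) through (24) can be sketched in base R as follows. This is only an illustration: the function name royston_a is hypothetical, the variable y is taken to be n^(-1/2) as in Royston (1992a), and the first components are filled in by the antisymmetry in Equation (4).

```r
# Hypothetical helper (not an EnvStats function): approximate Shapiro-Wilk
# coefficients, assuming y = 1 / sqrt(n).
royston_a <- function(n) {
  stopifnot(n >= 4)
  m <- qnorm((seq_len(n) - 3/8) / (n + 1/4))   # Blom scores, Equation (18)
  ss <- sum(m^2)
  cc <- m / sqrt(ss)                            # Equation (17)
  y <- 1 / sqrt(n)
  a <- numeric(n)
  # Equation (19): last component, then first component by antisymmetry
  a[n] <- cc[n] + 0.221157 * y - 0.147981 * y^2 - 2.071190 * y^3 +
    4.434685 * y^4 - 2.706056 * y^5
  a[1] <- -a[n]
  if (n <= 5) {
    eta <- (ss - 2 * m[n]^2) / (1 - 2 * a[n]^2)                  # Equation (23)
    idx <- 2:(n - 1)
  } else {
    # Equation (20): second-to-last component
    a[n - 1] <- cc[n - 1] + 0.042981 * y - 0.293762 * y^2 - 1.752461 * y^3 +
      5.682633 * y^4 - 3.582633 * y^5
    a[2] <- -a[n - 1]
    eta <- (ss - 2 * m[n]^2 - 2 * m[n - 1]^2) /
      (1 - 2 * a[n]^2 - 2 * a[n - 1]^2)                          # Equation (24)
    idx <- 3:(n - 2)
  }
  a[idx] <- m[idx] / sqrt(eta)                                   # Equation (22)
  a
}

set.seed(1)
x <- sort(rnorm(20))
a <- royston_a(20)
W <- sum(a * x)^2 / sum((x - mean(x))^2)   # Equation (1)
W_builtin <- unname(shapiro.test(x)$statistic)
```

Under these assumptions the coefficients satisfy Equation (3) by construction, and W should be close to the statistic reported by the built-in shapiro.test, which is based on the same approximation.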
Royston (1992a) found his approximation to \underline{a} to be accurate to
at least \pm 1 in the third decimal place over all values of i and
selected values of n, and also found that critical percentage points of
W based on his approximation agreed closely with the exact critical
percentage points calculated by Verrill and Johnson (1988).
Transformation of the Null Distribution of W to Normality
In order to compute a p-value associated with a particular value of W,
Royston (1992a) approximated the distribution of (1 - W) by a
three-parameter lognormal distribution for 4 ≤ n ≤ 11,
and the upper half of the distribution of (1 - W) by a two-parameter
lognormal distribution for 12 ≤ n ≤ 2000.
Setting
z = \frac{w - μ}{σ} \;\;\;\;\;\; (25)
the p-value associated with W is given by:
p = 1 - Φ(z) \;\;\;\;\;\; (26)
For 4 ≤ n ≤ 11, the quantities necessary to compute z are given by:
w = -log[γ - log(1 - W)] \;\;\;\;\;\; (27)
γ = -2.273 + 0.459 n \;\;\;\;\;\; (28)
μ = 0.5440 - 0.39978 n + 0.025054 n^2 - 0.000671 n^3 \;\;\;\;\;\; (29)
σ = exp(1.3822 - 0.77857 n + 0.062767 n^2 - 0.0020322 n^3) \;\;\;\;\;\; (30)
For 12 ≤ n ≤ 2000, the quantities necessary to compute z are given by:
w = log(1 - W) \;\;\;\;\;\; (31)
y = log(n) \;\;\;\;\;\; (32)
μ = -1.5861 - 0.31082 y - 0.083751 y^2 + 0.0038915 y^3 \;\;\;\;\;\; (33)
σ = exp(-0.4803 - 0.082676 y + 0.0030302 y^2) \;\;\;\;\;\; (34)
For the last approximation when 12 ≤ n ≤ 2000, Royston (1992a) claims
this approximation is actually valid for sample sizes up to n = 5000.
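The transformation in Equations (25) through (34) can be sketched as a small base R function. The name royston_p_value is hypothetical. Two things in this sketch are assumptions: y in Equations (33) and (34) is taken to be log(n), and the cubic coefficient of μ is taken as 0.0038915, the value that reproduces the p-value reported by R's shapiro.test.

```r
# Hypothetical helper (not an EnvStats function): p-value for the
# Shapiro-Wilk W-statistic via Royston's normalizing transformation.
royston_p_value <- function(W, n) {
  stopifnot(n >= 4, W > 0, W < 1)
  if (n <= 11) {
    gam <- -2.273 + 0.459 * n
    w <- -log(gam - log(1 - W))
    mu <- 0.5440 - 0.39978 * n + 0.025054 * n^2 - 0.000671 * n^3
    sigma <- exp(1.3822 - 0.77857 * n + 0.062767 * n^2 - 0.0020322 * n^3)
  } else {
    y <- log(n)                       # assumed definition of y for n >= 12
    w <- log(1 - W)
    mu <- -1.5861 - 0.31082 * y - 0.083751 * y^2 + 0.0038915 * y^3
    sigma <- exp(-0.4803 - 0.082676 * y + 0.0030302 * y^2)
  }
  z <- (w - mu) / sigma               # Equation (25)
  pnorm(z, lower.tail = FALSE)        # Equation (26)
}

set.seed(1)
x <- rexp(20)
st <- shapiro.test(x)
p_approx <- royston_p_value(unname(st$statistic), length(x))
```

Comparing p_approx with st$p.value is a quick way to check the constants for a sample size of 12 or more.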
Modification for the Three-Parameter Lognormal Distribution
When distribution="lnorm3", the function gofTest assumes the vector
\underline{x} is a random sample from a
three-parameter lognormal distribution. It estimates the
threshold parameter via the zero-skewness method (see elnorm3), and
then performs the Shapiro-Wilk goodness-of-fit test for normality on
log(x - \hat{γ}) where \hat{γ} is the estimated threshold
parameter. Because the threshold parameter has to be estimated, however, the
p-value associated with the computed z-statistic will tend to be conservative
(larger than it should be under the null hypothesis). Royston (1992b) proposed
the following transformation of the z-statistic:
z' = \frac{z - μ_z}{σ_z} \;\;\;\;\;\; (35)
where for 5 ≤ n ≤ 11,
μ_z = -3.8267 + 2.8242 u - 0.63673 u^2 - 0.020815 v \;\;\;\;\;\; (36)
σ_z = -4.9914 + 8.6724 u - 4.27905 u^2 + 0.70350 u^3 - 0.013431 v \;\;\;\;\;\; (37)
and for 12 ≤ n ≤ 2000,
μ_z = -3.7796 + 2.4038 u - 0.6675 u^2 - 0.082863 u^3 - 0.0037935 u^4 - 0.027027 v - 0.0019887 vu \;\;\;\;\;\; (38)
σ_z = 2.1924 - 1.0957 u + 0.33737 u^2 - 0.043201 u^3 + 0.0019974 u^4 - 0.0053312 vu \;\;\;\;\;\; (39)
where
u = log(n) \;\;\;\;\;\; (40)
v = u (\hat{σ} - \hat{σ}^2) \;\;\;\;\;\; (41)
\hat{σ}^2 = \frac{1}{n-1} ∑_{i=1}^n (y_i - \bar{y})^2 \;\;\;\;\;\; (42)
y_i = log(x_i - \hat{γ}) \;\;\;\;\;\; (43)
and γ denotes the threshold parameter. The p-value associated with this test is then given by:
p = 1 - Φ(z') \;\;\;\;\;\; (44)
Testing Goodness-of-Fit for Any Continuous Distribution
The function gofTest extends the Shapiro-Wilk test to test for
goodness-of-fit for any continuous distribution by using the idea of
Chen and Balakrishnan (1995), who proposed a general purpose approximate
goodness-of-fit test based on the Cramér-von Mises or Anderson-Darling
goodness-of-fit tests for normality. The function gofTest modifies the
approach of Chen and Balakrishnan (1995) by using the same first 2 steps, and then
applying the Shapiro-Wilk test:
1. Let \underline{x} = (x_1, x_2, …, x_n) denote the vector of n ordered observations. Compute cumulative probabilities for each x_i based on the cumulative distribution function for the hypothesized distribution. That is, compute p_i = F(x_i, \hat{θ}) where F(x, θ) denotes the hypothesized cumulative distribution function with parameter(s) θ, and \hat{θ} denotes the estimated parameter(s).
2. Compute standard normal deviates based on the computed cumulative
probabilities:
y_i = Φ^{-1}(p_i)
3. Perform the Shapiro-Wilk goodness-of-fit test on the y_i's.
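The three steps above can be sketched in base R for one illustrative case. The exponential distribution, the simulated data, and the use of the built-in shapiro.test in step 3 are choices made for this example only; they are not taken from the gofTest source.

```r
# Illustrative sketch: test an exponential fit by transforming to
# normal deviates and applying the Shapiro-Wilk test.
set.seed(23)
x <- sort(rexp(30, rate = 0.5))      # ordered observations

rate_hat <- 1 / mean(x)              # maximum likelihood estimate of the rate
p <- pexp(x, rate = rate_hat)        # step 1: p_i = F(x_i, theta-hat)
y <- qnorm(p)                        # step 2: y_i = Phi^{-1}(p_i)
res <- shapiro.test(y)               # step 3: Shapiro-Wilk test on the y_i's

res$statistic
res$p.value
```

If the hypothesized distribution fits, the p_i behave roughly like uniform order statistics and the y_i roughly like normal order statistics, which is what the Shapiro-Wilk test then checks.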
Shapiro-Francia Goodness-of-Fit Test (test="sf").
The Shapiro-Francia goodness-of-fit test (Shapiro and Francia, 1972;
Weisberg and Bingham, 1975; Royston, 1992c) is also one of the most commonly
used goodness-of-fit tests for normality. You can use it to test the following
hypothesized distributions:
Normal, Lognormal, Zero-Modified Normal,
or Zero-Modified Lognormal (Delta). In addition,
you can use it to test the null hypothesis of any continuous distribution
that is available (see the help file for Distribution.df). See the
section Testing Goodness-of-Fit for Any Continuous Distribution above for
an explanation of how this is done.
Royston's Transformation of the Shapiro-Francia W'-Statistic to Normality
Equation (13) above gives the formula for the Shapiro-Francia W'-statistic, and
Equation (16) above gives the formula for the Weisberg-Bingham approximation to the
W'-statistic (denoted \tilde{W}'). Royston (1992c) presents an algorithm
to transform the \tilde{W}'-statistic so that its null distribution is
approximately a standard normal. For 5 ≤ n ≤ 5000,
Royston (1992c) approximates the distribution of (1 - \tilde{W}') by a
lognormal distribution. Setting
z = \frac{w - μ}{σ} \;\;\;\;\;\; (45)
the p-value associated with \tilde{W}' is given by:
p = 1 - Φ(z) \;\;\;\;\;\; (46)
The quantities necessary to compute z are given by:
w = log(1 - \tilde{W}') \;\;\;\;\;\; (47)
ν = log(n) \;\;\;\;\;\; (48)
u = log(ν) - ν \;\;\;\;\;\; (49)
μ = -1.2725 + 1.0521 u \;\;\;\;\;\; (50)
v = log(ν) + \frac{2}{ν} \;\;\;\;\;\; (51)
σ = 1.0308 - 0.26758 v \;\;\;\;\;\; (52)
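Equations (45) through (52) translate into a few lines of base R. The function name sf_p_value is hypothetical, and the leading constant of μ is taken to be negative (-1.2725), which is an assumption of this sketch.

```r
# Hypothetical helper (not an EnvStats function): p-value for the
# Weisberg-Bingham form of the Shapiro-Francia statistic.
sf_p_value <- function(w_prime, n) {
  stopifnot(n >= 5, n <= 5000, w_prime > 0, w_prime < 1)
  nu <- log(n)                          # Equation (48)
  u <- log(nu) - nu                     # Equation (49)
  mu <- -1.2725 + 1.0521 * u            # Equation (50), assumed sign
  v <- log(nu) + 2 / nu                 # Equation (51)
  sigma <- 1.0308 - 0.26758 * v         # Equation (52)
  z <- (log(1 - w_prime) - mu) / sigma  # Equations (45) and (47)
  pnorm(z, lower.tail = FALSE)          # Equation (46)
}

p_high <- sf_p_value(0.95, n = 20)   # statistic close to 1
p_low <- sf_p_value(0.85, n = 20)    # smaller statistic, smaller p-value
```

Because small values of the statistic indicate departure from normality, the p-value decreases as \tilde{W}' decreases for a fixed n.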
Probability Plot Correlation Coefficient (PPCC) Goodness-of-Fit Test (test="ppcc").
The PPCC goodness-of-fit test (Filliben, 1975; Looney and Gulledge, 1985) can be
used to test the following hypothesized distributions:
Normal, Lognormal,
Zero-Modified Normal, or
Zero-Modified Lognormal (Delta). In addition,
you can use it to test the null hypothesis of any continuous distribution that
is available (see the help file for Distribution.df).
The function gofTest computes the PPCC test
statistic using Blom plotting positions.
Filliben (1975) proposed using the correlation coefficient r from a normal probability plot to perform a goodness-of-fit test for normality, and he provided a table of critical values for r under the null hypothesis for sample sizes between 3 and 100. Vogel (1986) provided an additional table for sample sizes between 100 and 10,000.
Looney and Gulledge (1985) investigated the characteristics of Filliben's
probability plot correlation coefficient (PPCC) test using the plotting position
formulas given in Filliben (1975), as well as three other plotting position
formulas: Hazen plotting positions, Weibull plotting positions, and Blom plotting
positions (see the help file for qqPlot for an explanation of these
plotting positions). They concluded that the PPCC test based on Blom plotting
positions performs slightly better than tests based on other plotting positions, and
they provide a table of empirical percentage points for the distribution of r
based on Blom plotting positions.
The function gofTest computes the PPCC test statistic r using Blom
plotting positions. It can be shown that the square of this statistic is
equivalent to the Weisberg-Bingham Approximation to the Shapiro-Francia
W'-Test (Weisberg and Bingham, 1975; Royston, 1993). Thus the PPCC
goodness-of-fit test is equivalent to the Shapiro-Francia goodness-of-fit test.
Zero-Skew Goodness-of-Fit Test (test="skew").
The zero-skew goodness-of-fit test (D'Agostino, 1970) can be used to test the following hypothesized distributions: Normal, Lognormal, Zero-Modified Normal, or Zero-Modified Lognormal (Delta).
When test="skew", the function gofTest tests the null hypothesis
that the skew of the distribution is 0:
H_0: √{β}_1 = 0 \;\;\;\;\;\; (53)
where
√{β}_1 = \frac{μ_3}{μ_2^{3/2}} \;\;\;\;\;\; (54)
and the quantity μ_r denotes the r'th moment about the mean (also called the r'th central moment). The quantity √{β_1} is called the coefficient of skewness, and is estimated by:
√{b}_1 = \frac{m_3}{m_2^{3/2}} \;\;\;\;\;\; (55)
where
m_r = \frac{1}{n} ∑_{i=1}^n (x_i - \bar{x})^r \;\;\;\;\;\; (56)
denotes the r'th sample central moment.
The possible alternative hypotheses are:
H_a: √{β}_1 \ne 0 \;\;\;\;\;\; (57)
H_a: √{β}_1 < 0 \;\;\;\;\;\; (58)
H_a: √{β}_1 > 0 \;\;\;\;\;\; (59)
which correspond to alternative="two.sided", alternative="less", and
alternative="greater", respectively.
To test the null hypothesis of zero skew, D'Agostino (1970) derived an approximation to the distribution of √{b_1} under the null hypothesis of zero skew, assuming the observations comprise a random sample from a normal (Gaussian) distribution. Based on D'Agostino's approximation, the statistic Z shown below is assumed to follow a standard normal distribution and is used to compute the p-value associated with the test of H_0:
Z = δ \;\; log\{ \frac{Y}{α} + [(\frac{Y}{α})^2 + 1]^{1/2} \} \;\;\;\;\;\; (60)
where
Y = √{b_1} [\frac{(n+1)(n+3)}{6(n-2)}]^{1/2} \;\;\;\;\;\; (61)
β_2 = \frac{3(n^2 + 27n - 70)(n+1)(n+3)}{(n-2)(n+5)(n+7)(n+9)} \;\;\;\;\;\; (62)
W^2 = -1 + √{2β_2 - 2} \;\;\;\;\;\; (63)
δ = 1 / √{log(W)} \;\;\;\;\;\; (64)
α = [2 / (W^2 - 1)]^{1/2} \;\;\;\;\;\; (65)
When the sample size n is at least 150, a simpler approximation may be
used in which Y in Equation (61) is assumed to follow a standard normal
distribution and is used to compute the p-value associated with the hypothesis
test.
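A minimal base R sketch of Equations (55) through (65) follows. The function name zero_skew_test and the lower bound of 8 observations are assumptions made for this example; the p-value rules simply follow the three alternatives listed above.

```r
# Hypothetical helper (not an EnvStats function): D'Agostino's
# approximation for testing zero skew.
zero_skew_test <- function(x, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  n <- length(x)
  stopifnot(n >= 8)                       # assumed minimum sample size
  d <- x - mean(x)
  m2 <- mean(d^2)                         # Equation (56), r = 2
  m3 <- mean(d^3)                         # Equation (56), r = 3
  sqrt_b1 <- m3 / m2^(3/2)                # Equation (55)
  Y <- sqrt_b1 * sqrt((n + 1) * (n + 3) / (6 * (n - 2)))        # Equation (61)
  beta2 <- 3 * (n^2 + 27 * n - 70) * (n + 1) * (n + 3) /
    ((n - 2) * (n + 5) * (n + 7) * (n + 9))                     # Equation (62)
  W2 <- -1 + sqrt(2 * beta2 - 2)          # Equation (63)
  delta <- 1 / sqrt(log(sqrt(W2)))        # Equation (64)
  alpha <- sqrt(2 / (W2 - 1))             # Equation (65)
  Z <- delta * log(Y / alpha + sqrt((Y / alpha)^2 + 1))         # Equation (60)
  p_value <- switch(alternative,
    two.sided = 2 * pnorm(-abs(Z)),
    less = pnorm(Z),
    greater = pnorm(Z, lower.tail = FALSE))
  list(skew = sqrt_b1, Z = Z, p.value = p_value)
}

symmetric <- zero_skew_test(-5:5)
right_skewed <- zero_skew_test(c(1, 1, 1, 2, 2, 3, 4, 6, 10, 25),
                               alternative = "greater")
```

A perfectly symmetric sample has a sample skew of zero, so Z is zero and the two-sided p-value is 1, while the right-skewed sample gives a positive Z.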
Kolmogorov-Smirnov Goodness-of-Fit Test (test="ks").
When test="ks", the function gofTest calls the R function
ks.test to compute the test statistic and p-value. Note that for the
one-sample case, the distribution parameters
should be prespecified and not estimated from the data. If the distribution parameters
are estimated from the data, you will receive a warning that this test is very conservative
(Type I error smaller than assumed; high Type II error) in this case.
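For reference, the underlying ks.test calls for the one-sample and two-sample cases look roughly like this. The data are simulated for illustration, and the one-sample call uses prespecified parameters rather than estimates, as recommended above.

```r
# Illustrative sketch using the base R function ks.test directly.
set.seed(10)
y <- rnorm(40, mean = 5, sd = 2)
x <- rnorm(35, mean = 5, sd = 2)

# One-sample case: hypothesized N(5, 2) with prespecified parameters
one_sample <- ks.test(y, "pnorm", mean = 5, sd = 2)

# Two-sample case: do x and y appear to come from the same distribution?
two_sample <- ks.test(x, y)

one_sample$p.value
two_sample$p.value
```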
Chi-Squared Goodness-of-Fit Test (test="chisq").
The method used by gofTest is a modification of what is used for chisq.test.
If the hypothesized distribution function is completely specified, the degrees of
freedom are m-1 where m denotes the number of classes. If any parameters
are estimated, the degrees of freedom depend on the method of estimation.
The function gofTest follows the convention of computing
degrees of freedom as m-1-k, where k is the number of parameters estimated.
It can be shown that if the parameters are estimated by maximum likelihood, the degrees of
freedom are bounded between m-1 and m-1-k. Therefore, especially when the
sample size is small, it is important to compare the test statistic to the chi-squared
distribution with both m-1 and m-1-k degrees of freedom. See
Kendall and Stuart (1991, Chapter 30) for a more complete discussion.
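The effect of the two degrees-of-freedom conventions can be seen in a small base R sketch. The data, the choice of 8 equal-probability classes, and the use of the sample mean and standard deviation as estimates are all illustrative assumptions, not the defaults used by gofTest.

```r
# Illustrative sketch: chi-squared statistic for a fitted normal
# distribution, compared against m-1 and m-1-k degrees of freedom.
set.seed(5)
x <- rnorm(100, mean = 50, sd = 8)

m <- 8                                  # number of classes (illustrative)
k <- 2                                  # parameters estimated: mean and sd
mu_hat <- mean(x)
sd_hat <- sd(x)

# Equal-probability cut points under the fitted normal (-Inf, ..., Inf)
cuts <- qnorm(seq(0, 1, length.out = m + 1), mean = mu_hat, sd = sd_hat)
observed <- as.vector(table(cut(x, breaks = cuts)))
expected <- rep(length(x) / m, m)

chi_sq <- sum((observed - expected)^2 / expected)
p_m1 <- pchisq(chi_sq, df = m - 1, lower.tail = FALSE)       # m-1 df
p_m1k <- pchisq(chi_sq, df = m - 1 - k, lower.tail = FALSE)  # m-1-k df
```

For a given statistic, the p-value based on m-1-k degrees of freedom is never larger than the one based on m-1, so the two values bracket the range discussed above.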
The distribution theory of chi-square statistics is a large sample theory.
The expected cell counts are assumed to be at least moderately large.
As a rule of thumb, each should be at least 5. Although authors have found this rule
to be conservative (especially when the class probabilities are not too different
from each other), the user should regard p-values with caution when expected cell
counts are small.
Wilk-Shapiro Goodness-of-Fit Test for Uniform [0, 1] Distribution (test="ws").
Wilk and Shapiro (1968) suggested this test in the context of jointly testing several independent samples for normality simultaneously. If p_1, p_2, …, p_n denote the p-values associated with the test for normality of n independent samples, then under the null hypothesis that all n samples come from a normal distribution, the p-values are a random sample of n observations from a Uniform [0, 1] distribution, that is, a Uniform distribution with minimum 0 and maximum 1. Wilk and Shapiro (1968) suggested two different methods for testing whether the p-values come from a Uniform [0, 1] distribution:
Test Based on Normal Scores. Under the null hypothesis, the normal scores
Φ^{-1}(p_1), Φ^{-1}(p_2), …, Φ^{-1}(p_n)
are a random sample of n observations from a standard normal distribution. Wilk and Shapiro (1968) denote the i'th normal score by
G_i = Φ^{-1}(p_i) \;\;\;\;\;\; (66)
and note that under the null hypothesis, the quantity G defined as
G = \frac{1}{√{n}} \, ∑_{i=1}^n G_i \;\;\;\;\;\; (67)
has a standard normal distribution. Wilk and Shapiro (1968) were
interested in the alternative hypothesis that some of the n
independent samples did not come from a normal distribution and hence
would be associated with smaller p-values than expected under the
null hypothesis, which translates to the alternative that the cdf for
the distribution of the p-values is greater than the cdf of a
Uniform [0, 1] distribution (alternative="greater"). In terms
of the test statistic G, this alternative hypothesis would
tend to make G smaller than expected, so the p-value is given by
Φ(G). For the one-sided lower alternative that the cdf for the
distribution of p-values is less than the cdf for a Uniform [0, 1]
distribution, the p-value is given by
p = 1 - Φ(G) \;\;\;\;\;\; (68)
Test Based on Chi-Square Scores. Under the null hypothesis, the chi-square scores
-2 \, log(p_1), -2 \, log(p_2), …, -2 \, log(p_n)
are a random sample of n observations from a chi-square distribution with 2 degrees of freedom (Fisher, 1950). Wilk and Shapiro (1968) denote the i'th chi-square score by
C_i = -2 \, log(p_i) \;\;\;\;\;\; (69)
and note that under the null hypothesis, the quantity C defined as
C = ∑_{i=1}^n C_i \;\;\;\;\;\; (70)
has a chi-square distribution with 2n degrees of freedom.
Wilk and Shapiro (1968) were
interested in the alternative hypothesis that some of the n
independent samples did not come from a normal distribution and hence
would be associated with smaller p-values than expected under the
null hypothesis, which translates to the alternative that the cdf for
the distribution of the p-values is greater than the cdf of a
Uniform [0, 1] distribution (alternative="greater"). In terms
of the test statistic C, this alternative hypothesis would
tend to make C larger than expected, so the p-value is given by
p = 1 - F_{2n}(C) \;\;\;\;\;\; (71)
where F_{2n} denotes the cumulative distribution function of the chi-square distribution with 2n degrees of freedom. For the one-sided lower alternative that the cdf for the distribution of p-values is less than the cdf for a Uniform [0, 1] distribution, the p-value is given by
p = F_{2n}(C) \;\;\;\;\;\; (72)
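Both Wilk-Shapiro statistics are easy to sketch in base R from Equations (66) through (72). The function name wilk_shapiro and the label "chi-square scores" for the second method are hypothetical; only "normal scores" appears as a value of ws.method in the usage above.

```r
# Hypothetical helper (not an EnvStats function): combine p-values
# using normal scores (G) or chi-square scores (C).
wilk_shapiro <- function(p, method = c("normal scores", "chi-square scores"),
                         alternative = c("greater", "less")) {
  method <- match.arg(method)
  alternative <- match.arg(alternative)
  n <- length(p)
  if (method == "normal scores") {
    G <- sum(qnorm(p)) / sqrt(n)                      # Equations (66), (67)
    p_value <- if (alternative == "greater") pnorm(G) else
      pnorm(G, lower.tail = FALSE)                    # Phi(G) or Equation (68)
    list(statistic = G, p.value = p_value)
  } else {
    C <- sum(-2 * log(p))                             # Equations (69), (70)
    p_value <- if (alternative == "greater")
      pchisq(C, df = 2 * n, lower.tail = FALSE) else  # Equation (71)
      pchisq(C, df = 2 * n)                           # Equation (72)
    list(statistic = C, p.value = p_value)
  }
}

p_small <- c(0.01, 0.02, 0.03, 0.20)   # illustrative p-values from 4 samples
ws_normal <- wilk_shapiro(p_small, "normal scores")
ws_chisq <- wilk_shapiro(p_small, "chi-square scores")
ws_null <- wilk_shapiro(rep(0.5, 4), "normal scores")
```

With mostly small input p-values, both methods give a small combined p-value under alternative="greater"; with all inputs equal to 0.5, G is zero and the combined p-value is 0.5.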
a list of class "gof" containing the results of the goodness-of-fit test, unless
the two-sample Kolmogorov-Smirnov test is used, in which case the value is a list of
class "gofTwoSample". Objects of class "gof" and "gofTwoSample"
have special printing and plotting methods. See the help files for gof.object
and gofTwoSample.object for details.
The Shapiro-Wilk test (Shapiro and Wilk, 1965) and the Shapiro-Francia test (Shapiro and Francia, 1972) are probably the two most commonly used hypothesis tests to test departures from normality. The Shapiro-Wilk test is most powerful at detecting short-tailed (platykurtic) and skewed distributions, and least powerful against symmetric, moderately long-tailed (leptokurtic) distributions. Conversely, the Shapiro-Francia test is more powerful against symmetric long-tailed distributions and less powerful against short-tailed distributions (Royston, 1992b; 1993).
The zero-skew goodness-of-fit test for normality is one of several tests that have
been proposed to test the assumption of a normal distribution (D'Agostino, 1986b).
This test has been included mainly because it is called by elnorm3.
Usually, the Shapiro-Wilk or Shapiro-Francia test is preferred to this test, unless
the direction of the alternative to normality (e.g., positive skew) is known
(D'Agostino, 1986b, pp. 405–406).
Kolmogorov (1933) introduced a goodness-of-fit test to test the hypothesis that a random sample of n observations x comes from a specific hypothesized distribution with cumulative distribution function H. This test is now usually called the one-sample Kolmogorov-Smirnov goodness-of-fit test. Smirnov (1939) introduced a goodness-of-fit test to test the hypothesis that a random sample of n observations x comes from the same distribution as a random sample of m observations y. This test is now usually called the two-sample Kolmogorov-Smirnov goodness-of-fit test. Both tests are based on the maximum vertical distance between two cumulative distribution functions. For the one-sample problem with a small sample size, the Kolmogorov-Smirnov test may be preferred over the chi-squared goodness-of-fit test since the KS-test is exact, while the chi-squared test is based on an asymptotic approximation.
The chi-squared test, introduced by Pearson in 1900, is the oldest and best known goodness-of-fit test. The idea is to reduce the goodness-of-fit problem to a multinomial setting by comparing the observed cell counts with their expected values under the null hypothesis. Grouping the data sacrifices information, especially if the hypothesized distribution is continuous. On the other hand, chi-squared tests can be applied to any type of variable: continuous, discrete, or a combination of these.
The Wilk-Shapiro (1968) tests for a Uniform [0, 1] distribution were introduced in the context
of testing whether several independent samples all come from normal distributions, with
possibly different means and variances. The function gofGroupTest extends
this idea to allow you to test whether several independent samples come from the same
distribution (e.g., gamma, extreme value, etc.), with possibly different parameters.
In practice, almost any goodness-of-fit test will not reject the null hypothesis
if the number of observations is relatively small. Conversely, almost any goodness-of-fit
test will reject the null hypothesis if the number of observations is very large,
since “real” data are never distributed according to any theoretical distribution
(Conover, 1980, p. 367). For most cases, however, the distribution of “real” data
is close enough to some theoretical distribution that fairly accurate results may be
provided by assuming that particular theoretical distribution. One way to assess the
goodness of the fit is to use goodness-of-fit tests. Another way is to look at
quantile-quantile (Q-Q) plots (see qqPlot).
Steven P. Millard (EnvStats@ProbStatInfo.com)
Birnbaum, Z.W., and F.H. Tingey. (1951). One-Sided Confidence Contours for Probability Distribution Functions. Annals of Mathematical Statistics 22, 592-596.
Blom, G. (1958). Statistical Estimates and Transformed Beta Variables. John Wiley and Sons, New York.
Conover, W.J. (1980). Practical Nonparametric Statistics. Second Edition. John Wiley and Sons, New York.
Dallal, G.E., and L. Wilkinson. (1986). An Analytic Approximation to the Distribution of Lilliefor's Test for Normality. The American Statistician 40, 294-296.
D'Agostino, R.B. (1970). Transformation to Normality of the Null Distribution of g1. Biometrika 57, 679-681.
D'Agostino, R.B. (1971). An Omnibus Test of Normality for Moderate and Large Size Samples. Biometrika 58, 341-348.
D'Agostino, R.B. (1986b). Tests for the Normal Distribution. In: D'Agostino, R.B., and M.A. Stephens, eds. Goodness-of-Fit Techniques. Marcel Dekker, New York.
D'Agostino, R.B., and E.S. Pearson (1973). Tests for Departures from Normality. Empirical Results for the Distributions of b2 and √{b1}. Biometrika 60(3), 613-622.
D'Agostino, R.B., and G.L. Tietjen (1973). Approaches to the Null Distribution of √{b1}. Biometrika 60(1), 169-173.
Fisher, R.A. (1950). Statistical Methods for Research Workers. 11'th Edition. Hafner Publishing Company, New York, pp. 99-100.
Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.
Kendall, M.G., and A. Stuart. (1991). The Advanced Theory of Statistics, Volume 2: Inference and Relationship. Fifth Edition. Oxford University Press, New York.
Kim, P.J., and R.I. Jennrich. (1973). Tables of the Exact Sampling Distribution of the Two Sample Kolmogorov-Smirnov Criterion. In Harter, H.L., and D.B. Owen, eds. Selected Tables in Mathematical Statistics, Vol. 1. American Mathematical Society, Providence, Rhode Island, pp. 79-170.
Kolmogorov, A.N. (1933). Sulla determinazione empirica di una legge di distribuzione. Giornale dell' Istituto Italiano degli Attuari 4, 83-91.
Marsaglia, G., W.W. Tsang, and J. Wang. (2003). Evaluating Kolmogorov's distribution. Journal of Statistical Software, 8(18). http://www.jstatsoft.org/v08/i18/.
Moore, D.S. (1986). Tests of ChiSquared Type. In D'Agostino, R.B., and M.A. Stephens, eds. Goodnessof Fit Techniques. Marcel Dekker, New York, pp.6395.
Pomeranz, J. (1973). Exact Cumulative Distribution of the KolmogorovSmirnov Statistic for Small Samples (Algorithm 487). Collected Algorithms from ACM ??, ??????.
Royston, J.P. (1992a). Approximating the ShapiroWilk WTest for NonNormality. Statistics and Computing 2, 117119.
Royston, J.P. (1992b). Estimation, Reference Ranges and Goodness of Fit for the ThreeParameter LogNormal Distribution. Statistics in Medicine 11, 897912.
Royston, J.P. (1992c). A PocketCalculator Algorithm for the ShapiroFrancia Test of NonNormality: An Application to Medicine. Statistics in Medicine 12, 181184.
Royston, P. (1993). A Toolkit for Testing for NonNormality in Complete and Censored Samples. The Statistician 42, 3743.
Ryan, T., and B. Joiner. (1973). Normal Probability Plots and Tests for Normality. Technical Report, Pennsylvannia State University, Department of Statistics.
Shapiro, S.S., and R.S. Francia. (1972). An Approximate Analysis of Variance Test for Normality. Journal of the American Statistical Association 67(337), 215219.
Shapiro, S.S., and M.B. Wilk. (1965). An Analysis of Variance Test for Normality (Complete Samples). Biometrika 52, 591611.
Smirnov, N.V. (1939). Estimate of Deviation Between Empirical Distribution Functions in Two Independent Samples. Bulletin Moscow University 2(2), 3-16.
Smirnov, N.V. (1948). Table for Estimating the Goodness of Fit of Empirical Distributions. Annals of Mathematical Statistics 19, 279-281.
Stephens, M.A. (1970). Use of the Kolmogorov-Smirnov, Cramer-von Mises and Related Statistics Without Extensive Tables. Journal of the Royal Statistical Society, Series B, 32, 115-122.
Stephens, M.A. (1986a). Tests Based on EDF Statistics. In D'Agostino, R.B., and M.A. Stephens, eds. Goodness-of-Fit Techniques. Marcel Dekker, New York.
Verrill, S., and R.A. Johnson. (1987). The Asymptotic Equivalence of Some Modified Shapiro-Wilk Statistics – Complete and Censored Sample Cases. The Annals of Statistics 15(1), 413-419.
Verrill, S., and R.A. Johnson. (1988). Tables and Large-Sample Distribution Theory for Censored-Data Correlation Statistics for Testing Normality. Journal of the American Statistical Association 83, 1192-1197.
Weisberg, S., and C. Bingham. (1975). An Approximate Analysis of Variance Test for Non-Normality Suitable for Machine Calculation. Technometrics 17, 133-134.
Wilk, M.B., and S.S. Shapiro. (1968). The Joint Assessment of Normality of Several Independent Samples. Technometrics, 10(4), 825-839.
Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ.
rosnerTest, gof.object, print.gof, plot.gof, shapiro.test, ks.test, chisq.test, Normal, Lognormal, Lognormal3, Zero-Modified Normal, Zero-Modified Lognormal (Delta), enorm, elnorm, elnormAlt, elnorm3, ezmnorm, ezmlnorm, ezmlnormAlt, qqPlot.
# Generate 20 observations from a gamma distribution with
# parameters shape = 2 and scale = 3 then run various
# goodness-of-fit tests.
# (Note: the call to set.seed lets you reproduce this example.)
set.seed(47)
dat <- rgamma(20, shape = 2, scale = 3)
# Shapiro-Wilk generalized goodness-of-fit test
#
gof.list <- gofTest(dat, distribution = "gamma")
gof.list
#Results of Goodness-of-Fit Test
#
#
#Test Method: Shapiro-Wilk GOF Based on
# Chen & Balakrisnan (1995)
#
#Hypothesized Distribution: Gamma
#
#Estimated Parameter(s): shape = 1.909462
# scale = 4.056819
#
#Estimation Method: mle
#
#Data: dat
#
#Sample Size: 20
#
#Test Statistic: W = 0.9834958
#
#Test Statistic Parameter: n = 20
#
#P-value: 0.970903
#
#Alternative Hypothesis: True cdf does not equal the
# Gamma Distribution.
dev.new()
plot(gof.list)
#
# Redo the example above, but use the bias-corrected mle
gofTest(dat, distribution = "gamma",
  est.arg.list = list(method = "bcmle"))
#Results of Goodness-of-Fit Test
#
#
#Test Method: Shapiro-Wilk GOF Based on
# Chen & Balakrisnan (1995)
#
#Hypothesized Distribution: Gamma
#
#Estimated Parameter(s): shape = 1.656376
# scale = 4.676680
#
#Estimation Method: bcmle
#
#Data: dat
#
#Sample Size: 20
#
#Test Statistic: W = 0.9834346
#
#Test Statistic Parameter: n = 20
#
#P-value: 0.9704046
#
#Alternative Hypothesis: True cdf does not equal the
# Gamma Distribution.
#
# Kolmogorov-Smirnov goodness-of-fit test (pre-specified parameters)
#
gofTest(dat, test = "ks", distribution = "gamma",
  param.list = list(shape = 2, scale = 3))
#Results of Goodness-of-Fit Test
#
#
#Test Method: Kolmogorov-Smirnov GOF
#
#Hypothesized Distribution: Gamma(shape = 2, scale = 3)
#
#Data: dat
#
#Sample Size: 20
#
#Test Statistic: ks = 0.2313878
#
#Test Statistic Parameter: n = 20
#
#P-value: 0.2005083
#
#Alternative Hypothesis: True cdf does not equal the
# Gamma(shape = 2, scale = 3)
# Distribution.
#
# Chi-squared goodness-of-fit test (estimated parameters)
#
gofTest(dat, test = "chisq", distribution = "gamma", n.classes = 4)
#Results of Goodness-of-Fit Test
#
#
#Test Method: Chi-square GOF
#
#Hypothesized Distribution: Gamma
#
#Estimated Parameter(s): shape = 1.909462
# scale = 4.056819
#
#Estimation Method: mle
#
#Data: dat
#
#Sample Size: 20
#
#Test Statistic: Chi-square = 1.2
#
#Test Statistic Parameter: df = 1
#
#P-value: 0.2733217
#
#Alternative Hypothesis: True cdf does not equal the
# Gamma Distribution.
#
# Clean up
rm(dat, gof.list)
graphics.off()
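# As a rough cross-check of the Kolmogorov-Smirnov example above, the
# one-sample statistic can be computed by hand and compared with
# stats::ks.test. This is only a sketch using base R, not the code that
# gofTest runs internally; the names d.plus, d.minus, d.manual and ks.out
# are made up for this illustration. The result is expected to be close to
# the value reported above only if the random number generator settings
# match those used to produce that output.

```r
set.seed(47)
dat <- rgamma(20, shape = 2, scale = 3)
n <- length(dat)

# Hypothesized cdf evaluated at the ordered observations
Fx <- pgamma(sort(dat), shape = 2, scale = 3)

# Largest distance of the empirical cdf above and below the hypothesized cdf
d.plus <- max(seq_len(n) / n - Fx)
d.minus <- max(Fx - (seq_len(n) - 1) / n)
d.manual <- max(d.plus, d.minus)

# Same test via base R
ks.out <- ks.test(dat, "pgamma", shape = 2, scale = 3)

c(manual = d.manual, ks.test = unname(ks.out$statistic))

rm(dat, n, Fx, d.plus, d.minus)
```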
#
# Example 10-2 of USEPA (2009, page 10-14) gives an example of
# using the Shapiro-Wilk test to test the assumption of normality
# for nickel concentrations (ppb) in groundwater collected over
# 4 years. The data for this example are stored in
# EPA.09.Ex.10.1.nickel.df.
EPA.09.Ex.10.1.nickel.df
# Month Well Nickel.ppb
#1 1 Well.1 58.8
#2 3 Well.1 1.0
#3 6 Well.1 262.0
#4 8 Well.1 56.0
#5 10 Well.1 8.7
#6 1 Well.2 19.0
#7 3 Well.2 81.5
#8 6 Well.2 331.0
#9 8 Well.2 14.0
#10 10 Well.2 64.4
#11 1 Well.3 39.0
#12 3 Well.3 151.0
#13 6 Well.3 27.0
#14 8 Well.3 21.4
#15 10 Well.3 578.0
#16 1 Well.4 3.1
#17 3 Well.4 942.0
#18 6 Well.4 85.6
#19 8 Well.4 10.0
#20 10 Well.4 637.0
# Test for a normal distribution:
#
gof.list <- gofTest(Nickel.ppb ~ 1,
  data = EPA.09.Ex.10.1.nickel.df)
gof.list
#Results of Goodness-of-Fit Test
#
#
#Test Method: Shapiro-Wilk GOF
#
#Hypothesized Distribution: Normal
#
#Estimated Parameter(s): mean = 169.5250
# sd = 259.7175
#
#Estimation Method: mvue
#
#Data: Nickel.ppb
#
#Data Source: EPA.09.Ex.10.1.nickel.df
#
#Sample Size: 20
#
#Test Statistic: W = 0.6788888
#
#Test Statistic Parameter: n = 20
#
#P-value: 2.17927e-05
#
#Alternative Hypothesis: True cdf does not equal the
# Normal Distribution.
dev.new()
plot(gof.list)
#
# Test for a lognormal distribution:
#
gofTest(Nickel.ppb ~ 1,
data = EPA.09.Ex.10.1.nickel.df,
dist = "lnorm")
#Results of Goodness-of-Fit Test
#
#
#Test Method: Shapiro-Wilk GOF
#
#Hypothesized Distribution: Lognormal
#
#Estimated Parameter(s): meanlog = 3.918529
# sdlog = 1.801404
#
#Estimation Method: mvue
#
#Data: Nickel.ppb
#
#Data Source: EPA.09.Ex.10.1.nickel.df
#
#Sample Size: 20
#
#Test Statistic: W = 0.978946
#
#Test Statistic Parameter: n = 20
#
#P-value: 0.9197735
#
#Alternative Hypothesis: True cdf does not equal the
# Lognormal Distribution.
#
# Test for a lognormal distribution, but use the
# Mean and CV parameterization:
#
gofTest(Nickel.ppb ~ 1,
data = EPA.09.Ex.10.1.nickel.df,
dist = "lnormAlt")
#Results of Goodness-of-Fit Test
#
#
#Test Method: Shapiro-Wilk GOF
#
#Hypothesized Distribution: Lognormal
#
#Estimated Parameter(s): mean = 213.415628
# cv = 2.809377
#
#Estimation Method: mvue
#
#Data: Nickel.ppb
#
#Data Source: EPA.09.Ex.10.1.nickel.df
#
#Sample Size: 20
#
#Test Statistic: W = 0.978946
#
#Test Statistic Parameter: n = 20
#
#P-value: 0.9197735
#
#Alternative Hypothesis: True cdf does not equal the
# Lognormal Distribution.
#
# Clean up
rm(gof.list)
graphics.off()
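# The normal and lognormal comparisons above can be approximated with base R
# alone: testing a lognormal fit with the Shapiro-Wilk test amounts to
# testing the log-transformed data for normality. The sketch below assumes
# the nickel values are typed in from the data listing above (the vector
# name nickel is made up here) and uses stats::shapiro.test rather than
# gofTest, so small numerical differences from the output above are possible.

```r
# Nickel concentrations (ppb), copied from the listing above
nickel <- c(58.8, 1.0, 262.0, 56.0, 8.7, 19.0, 81.5, 331.0, 14.0, 64.4,
            39.0, 151.0, 27.0, 21.4, 578.0, 3.1, 942.0, 85.6, 10.0, 637.0)

# Shapiro-Wilk test on the original scale (normal hypothesis)
sw.raw <- shapiro.test(nickel)

# Shapiro-Wilk test on the log scale (lognormal hypothesis)
sw.log <- shapiro.test(log(nickel))

data.frame(
  scale   = c("raw", "log"),
  W       = c(unname(sw.raw$statistic), unname(sw.log$statistic)),
  p.value = c(sw.raw$p.value, sw.log$p.value)
)
```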
#
# Generate 20 observations from a normal distribution with mean=3 and sd=2, and
# generate 10 observations from a normal distribution with mean=1 and sd=2, then
# test whether these sets of observations come from the same distribution.
# (Note: the call to set.seed simply allows you to reproduce this example.)
set.seed(300)
dat1 <- rnorm(20, mean = 3, sd = 2)
dat2 <- rnorm(10, mean = 1, sd = 2)
gofTest(x = dat1, y = dat2, test = "ks")
#Results of Goodness-of-Fit Test
#
#
#Test Method: 2-Sample K-S GOF
#
#Hypothesized Distribution: Equal
#
#Data: x = dat1
# y = dat2
#
#Sample Sizes: n.x = 20
# n.y = 10
#
#Test Statistic: ks = 0.7
#
#Test Statistic Parameters: n = 20
# m = 10
#
#P-value: 0.001669561
#
#Alternative Hypothesis: The cdf of 'dat1' does not equal
# the cdf of 'dat2'.
#
# Clean up
rm(dat1, dat2)
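# For intuition about the two-sample statistic above: it is the largest
# absolute difference between the two empirical cdfs. The sketch below
# computes that difference directly with stats::ecdf and compares it with
# stats::ks.test. It is a base R illustration only, not the gofTest
# implementation; pooled, d.manual and ks.out are names made up for this
# example.

```r
set.seed(300)
dat1 <- rnorm(20, mean = 3, sd = 2)
dat2 <- rnorm(10, mean = 1, sd = 2)

# Evaluate both empirical cdfs at every observed value
pooled <- sort(c(dat1, dat2))
d.manual <- max(abs(ecdf(dat1)(pooled) - ecdf(dat2)(pooled)))

# Same statistic via base R
ks.out <- ks.test(dat1, dat2)

c(manual = d.manual, ks.test = unname(ks.out$statistic))

rm(dat1, dat2, pooled)
```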
