Table 4.2 of @d1986goodness, and the accompanying function `p.val.tab`, can be used for any "Case 0" test of goodness of fit, that is, whenever there are no parameters to be estimated. This includes, for example, the uniform distribution on a known interval (typically $(0,1)$), or the normal with known mean and SD. Here are a couple of examples:
```r
library(edfr)
set.seed(457299)
w=rbeta(20,2,1)
p.val.tab(w,calc=punif)
x=rnorm(20,10,3)
p.val.tab(x,calc=pnorm,10,3)
```
Vector `w` should fail uniformity, since it is actually drawn from an asymmetric beta distribution, but vector `x` should pass normality with mean 10 and SD 3. Both of these turn out to be the case. Note that the null hypothesis for `x` is not just "normal", but "normal with the specified mean and SD", so that if we misspecify the mean and SD, normality should fail:
```r
p.val.tab(x,calc=pnorm,20,5)
```
As indeed it does. Below, we compare the results from Table 4.2 of @d1986goodness with simulated P-values.
Testing for distributions with estimated mean and SD, referred to in @d1986goodness as "Case 3", requires different tables. We also investigate this issue below.
Function `sim.stat` obtains a simulated distribution of one of the EDF test statistics. As input, it needs: the name of an EDF test statistic, the sample size, the number of simulations (default 10,000), the distribution to simulate from (an `r`-function), and the distribution to calculate the test statistic for (a `p`-function). The last two are different partly for annoying technical reasons, and partly as a way of simulating power (by allowing the tested and simulated data to have different distributions).
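As a rough sketch of what such a function might look like internally (the actual `edfr` implementation may well differ; `calc.stat`, which appears later in this vignette, is assumed here to compute the named EDF statistic for a sample):

```r
# A sketch only: simulate nsim samples of size n from sim_dist, and
# compute the chosen EDF statistic against calc_dist for each one.
# Extra arguments (...) are passed to both distribution functions.
sim.stat.sketch=function(teststat,n,nsim=10000,sim_dist,calc_dist,...) {
  replicate(nsim, {
    x=sim_dist(n,...)
    calc.stat(teststat,x,calc_dist,...)
  })
}
```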
Let's begin by simulating data that are uniform on $(0,1)$, using samples of size 20. We have to handle each test statistic in turn, starting with $D$:
```r
n=20
D=sim.stat("d",n=n,sim_dist=runif,calc_dist=punif)
```
To compare with Table 4.2, we need the modified form of the statistic, which for $D$ is $D(\sqrt{n}+0.12+0.11/\sqrt{n})$:
```r
Dmod=D*(sqrt(n)+0.12+0.11/sqrt(n))
```
The cumulative 75\%, 85\%, 90\%, 95\%, 97.5\%, 99\%, 99.5\% and 99.9\% points of the distribution of modified $D$ are given in Table 4.2 as 1.019, 1.138, 1.224, 1.358, 1.490, 1.628, 1.731 and 1.950 respectively, so we compare as follows, noting that `ecdf` will give us the proportion of `Dmod` values less than or equal to these critical values:
```r
cv=c(1.019, 1.138, 1.224, 1.358, 1.490, 1.628, 1.731, 1.950)
fn.Dmod=ecdf(Dmod)
fn.Dmod(cv)
```
These should be close to 0.75, 0.85, 0.90, 0.95, 0.975, 0.99, 0.995 and 0.999 respectively, and they are. Another way to check is to compute the corresponding percentiles of the simulated distribution and verify that they are close to the values in Table 4.2:
```r
lev=c(0.75,0.85,0.90,0.95,0.975,0.99,0.995,0.999)
q=quantile(Dmod,probs=lev)
q
```
These seem to be acceptably close to the values given in Table 4.2, apart perhaps from the values at the upper end. The percent relative errors are:
```r
(q-cv)/cv*100
```
Next, let's try statistic $W^2$:
```r
W2=sim.stat("w2",n=n,sim_dist=runif,calc_dist=punif)
```
```r
W2.mod=(W2-0.4/n+0.6/n^2)/(1+1/n)
cv=c(0.209,0.284,0.347,0.461,0.581,0.743,0.869,1.167)
fn.W2=ecdf(W2.mod)
fn.W2(cv)
```
These seem not very good: they are all too high, except in the upper tail. (One possible explanation: Table 4.2 defines the modified statistic as $(W^2-0.4/n+0.6/n^2)(1+1/n)$, multiplying by $1+1/n$ rather than dividing, so the values computed above are somewhat too small.)
```r
q=quantile(W2,probs=lev)
q
```
```r
(q-cv)/cv*100
```
However, the simulated quantiles of the unmodified statistic are, except in the far tail, within about 1\% of the tabulated values.
Finally, let's repeat the process for $A^2$, which has the advantage that no modification is needed:
```r
A2=sim.stat("a2",n=n,sim_dist=runif,calc_dist=punif)
cv=c(1.248,1.610,1.933,2.492,3.070,3.880,4.5,6)
fn.A2=ecdf(A2)
fn.A2(cv)
q=quantile(A2,probs=lev)
q
(q-cv)/cv*100
```
This shows good agreement with Table 4.2, except in the extreme upper tail, which will be poorly estimated in any case. (Small proportions are hard to estimate with good relative accuracy, unless many simulations are used, more than here.)
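To see why, consider the Monte Carlo standard error of an estimated tail proportion; the numbers below are purely illustrative, not taken from the simulations above:

```r
# Illustrative: the standard error of a simulated proportion p,
# estimated from nsim independent simulations
p=0.001
nsim=10000
se=sqrt(p*(1-p)/nsim)
se        # about 0.0003
se/p*100  # about 32 percent relative error
```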
The Case 0 tables can be used for any distribution with known parameters, not just the uniform: for example, a normal distribution with mean 10 and SD 3. We'll investigate $A^2$ again:
```r
A2.norm=sim.stat("a2",n=n,nsim=10000,sim_dist=rnorm,calc_dist=pnorm,10,3)
fn.A2.norm=ecdf(A2.norm)
fn.A2.norm(cv)
q=quantile(A2.norm,probs=lev)
q
(q-cv)/cv*100
```
The agreement is decent, except in the upper tail, where the errors are of similar size to those from the uniform simulation, but in the opposite direction.
What happens if we use Case 0 tables for a situation in which we have estimated parameters? Let's illustrate with testing for normality with mean and SD unknown, so that the null hypothesis is "the underlying distribution is normal with unspecified mean and SD", against an alternative of "it is not normal".
The procedure we follow is this:

- draw a random sample from a normal distribution;
- estimate the mean and SD from the sample;
- calculate the EDF test statistic for the sample, using the estimated mean and SD in place of the true ones;

and then we repeat the above many times, and compare the simulated distribution of our statistic with Table 4.2.
We'll write a function to mimic the functionality of `sim.stat` for normal distributions, but with the sample mean and SD used instead of the unknown population mean and SD:
```r
sim.stat.est.1=function(teststat,n,true.mean,true.sd) {
  x=rnorm(n,true.mean,true.sd)
  calc.stat(teststat,x,pnorm,mean(x),sd(x))
}
```
To check that it gives something reasonable:
sim.stat.est.1("w2",20,true.mean=10,true.sd=3)
This seems plausible.
Now we run a proper simulation:
```r
n=20
W2=replicate(10000,sim.stat.est.1("w2",n,10,3))
```
Then we calculate the modified $W^2$ and compare with Table 4.2:
```r
W2.mod=(W2-0.4/n+0.6/n^2)/(1+1/n)
cv=c(0.209,0.284,0.347,0.461,0.581,0.743,0.869,1.167)
fn.W2=ecdf(W2.mod)
fn.W2(cv)
```
and
```r
q=quantile(W2.mod,probs=lev)
q
```
These are nowhere near the values in Table 4.2: the quantiles of the simulated distribution are all much smaller. This makes sense, on reflection: any data will be closer to some normal distribution than to the one with the correct mean and SD, in the same way that the variability of data around the sample mean is less than the variability around the true mean.
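To illustrate that analogy numerically, with some hypothetical data (the sum of squares about the sample mean can never exceed that about any other value):

```r
set.seed(1)
x=rnorm(20,10,3)
sum((x-mean(x))^2)  # variability about the sample mean
sum((x-10)^2)       # variability about the true mean: never smaller
```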
This demonstrates that we must not use Table 4.2 to obtain a P-value for a Case 3 test.
@d1986goodness indicate that the correct table to use depends on the true distribution (in contrast to Case 0, where the same table applied to any completely-specified distribution). Their Table 4.7 provides correct P-values for this test of normality, against which we can assess our simulation:
```r
lev=c(0.50,0.75,0.85,0.90,0.95,0.975,0.99,0.995)
cv=c(0.051,0.074,0.091,0.104,0.126,0.148,0.179,0.201)
```
A different modification to $W^2$ is needed this time:
```r
W2.mod=W2*(1+0.5/n)
fn.W2.mod=ecdf(W2.mod)
fn=fn.W2.mod(cv)
rbind(lev,fn)
```
The simulated distribution produces P-values very close to those of Table 4.7.
```r
q=quantile(W2.mod,probs=lev)
rbind(cv,q)
(q-cv)/cv*100
```
The simulated quantiles are very close to those from Table 4.7. Indeed, none has a relative error larger than 2\%.
It does not take long to obtain a P-value by simulation, so this is not unreasonable as a routine approach. It avoids the need for tables, and for modifications: the distribution of the test statistic itself can be simulated, and the P-value is accurate to the precision of the simulation.
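As a sketch of the calculation, re-using `calc.stat` and the simulated Case 3 null distribution `W2` from above (the data `x` here are hypothetical):

```r
# Hypothetical data to be tested for normality, Case 3
x=rnorm(20,10,3)
# Observed W^2, with the mean and SD estimated from the sample
obs=calc.stat("w2",x,pnorm,mean(x),sd(x))
# Simulated P-value: proportion of null statistics at least as large
mean(W2>=obs)
```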
The data set `leghorn` contains weights of 20 leghorn chicks, to be tested for normality. First, a Case 0 test, for normality with mean 200 and SD 35. We begin by reproducing the results based on Table 4.2:
```r
p.val.tab(leghorn$x,calc=pnorm,200,35)
```
These are consistent with @d1986goodness p.\ 104, and the P-values are all greater than 0.25.
Then we obtain P-values by simulation (the default 10,000 simulations in each case):
```r
test0(leghorn$x,nsim=1e4,calc=pnorm,sim=rnorm,200,35)
```
The P-values are consistent with Table 4.2.
Now, we test these data for normality with unknown mean and SD (Case 3, in other words). First, we need to write a function that returns the sample mean and SD, since these are the two parameters that `pnorm` and `rnorm` expect:
```r
est=function(x) {
  c(mean(x),sd(x))
}
```
Then we can do all the tests with simulated P-values:
```r
test3(leghorn$x,nsim=1e4,calc=pnorm,sim=rnorm,estim=est)
```
There is not the slightest indication of problems with normality. The test statistic values and conclusions are identical to those in @d1986goodness, p.\ 125.
The data set `emea` contains heights of 530 maize plants, to be tested for normality:
```r
data(emea)
test3(emea,nsim=1e4,calc=pnorm,sim=rnorm,estim=est)
```
Why are those P-values missing? Looking at the parameter estimates shows what went wrong:
```r
est(emea)
```
The data `chen` are lethal doses of the drug Cinobufagin:
```r
chen
```
We aim to test the doses for log-normality, thus:
```r
estln=function(x) {
  y=log(x)
  c(mean(y),sd(y))
}
test3(chen,nsim=1e4,calc=plnorm,sim=rlnorm,estim=estln)
```
Log-normality shows no problems, but what about normality?
```r
test3(chen,nsim=1e4,calc=pnorm,sim=rnorm,estim=est)
```
Normality is actually all right as well, though the P-values are smaller.
The data `baen` are the differences in flood stages for two stations on the Fox River, Wisconsin, to be tested for a Laplace distribution:
```r
baen
```
The maximum likelihood estimates of the two parameters of the Laplace distribution (location and scale, as parameterized in the `VGAM` package) are the median and the mean absolute deviation about the median, so:
```r
estlp=function(x) {
  m=median(x)
  dev=abs(x-m)
  c(m,mean(dev),1)
}
```
and then
```r
library(VGAM)
test3(baen,nsim=1e4,calc=plaplace,sim=rlaplace,estim=estlp)
```
There are problems with `lower.tail` here, interacting with the passing of three arguments. Should `estlp` be pruned back to returning two values?