Description Usage Arguments Details Note Author(s) References See Also Examples
Model comparison of nested models using parametric bootstrap methods. Implemented for some commonly applied model types.
1 2 3 4 5 6 7 8 9 10 11 12  seqPBmodcomp(largeModel, smallModel, h = 20, nsim = 1000, cl = 1)
PBmodcomp(largeModel, smallModel, nsim = 1000, ref = NULL, seed = NULL,
cl = NULL, details = 0)
## S3 method for class 'merMod'
PBmodcomp(largeModel, smallModel, nsim = 1000, ref = NULL,
seed = NULL, cl = NULL, details = 0)
## S3 method for class 'lm'
PBmodcomp(largeModel, smallModel, nsim = 1000, ref = NULL,
seed = NULL, cl = NULL, details = 0)

largeModel 
A model object. Can be a linear mixed effects
model or generalized linear mixed effects model (as fitted with

smallModel 
A model of the same type as 
h 
For sequential computing for bootstrap pvalues: The number of extreme cases needed to generate before the sampling proces stops. 
nsim 
The number of simulations to form the reference distribution. 
cl 
A vector identifying a cluster; used for calculating the reference distribution using several cores. See examples below. 
ref 
Vector containing samples from the reference distribution. If NULL, this vector will be generated using PBrefdist(). 
seed 
A seed that will be passed to the simulation of new datasets. 
details 
The amount of output produced. Mainly relevant for debugging purposes. 
The model object
must be fitted with maximum likelihood
(i.e. with REML=FALSE
). If the object is fitted with restricted
maximum likelihood (i.e. with REML=TRUE
) then the model is
refitted with REML=FALSE
before the pvalues are calculated. Put
differently, the user needs not worry about this issue.
Under the fitted hypothesis (i.e. under the fitted small model) nsim
samples of the likelihood ratio test statistic (LRT) are generetated.
Then pvalues are calculated as follows:
LRT: Assuming that LRT has a chisquare distribution.
PBtest: The fraction of simulated LRTvalues that are larger or equal to the observed LRT value.
Bartlett: A Bartlett correction is of LRT is calculated from the mean of the simulated LRTvalues
Gamma: The reference distribution of LRT is assumed to be a gamma distribution with mean and variance determined as the sample mean and sample variance of the simulated LRTvalues.
F: The LRT divided by the number of degrees of freedom is assumed to be Fdistributed, where the denominator degrees of freedom are determined by matching the first moment of the reference distribution.
It can happen that some values of the LRT statistic in the reference distribution are negative. When this happens one will see that the number of used samples (those where the LRT is positive) are reported (this number is smaller than the requested number of samples).
In theory one can not have a negative value of the LRT statistic but in practice on can: We speculate that the reason is as follows: We simulate data under the small model and fit both the small and the large model to the simulated data. Therefore the large model represents  by definition  an overfit; the model has superfluous parameters in it. Therefore the fit of the two models will for some simulated datasets be very similar resulting in similar values of the loglikelihood. There is no guarantee that the the loglikelihood for the large model in practice always will be larger than for the small (convergence problems and other numerical issues can play a role here).
To look further into the problem, one can use the PBrefdist()
function
for simulating the reference distribution (this reference distribution can be
provided as input to PBmodcomp()
). Inspection sometimes reveals that
while many values are negative, they are numerically very small. In this case
one may try to replace the negative values by a small positive value and then
invoke PBmodcomp()
to get some idea about how strong influence there
is on the resulting pvalues. (The pvalues get smaller this way compared to
the case when only the originally positive values are used).
Søren Højsgaard [email protected]
Ulrich Halekoh, Søren Højsgaard (2014)., A KenwardRoger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models  The R Package pbkrtest., Journal of Statistical Software, 58(10), 130., http://www.jstatsoft.org/v59/i09/
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84  data(beets, package="pbkrtest")
head(beets)
## Linear mixed effects model:
sug < lmer(sugpct ~ block + sow + harvest + (1block:harvest), data=beets, REML=FALSE)
sug.h < update(sug, .~. harvest)
sug.s < update(sug, .~. sow)
anova(sug, sug.h)
PBmodcomp(sug, sug.h, nsim=50)
anova(sug, sug.h)
PBmodcomp(sug, sug.s, nsim=50)
## Linear normal model:
sug < lm(sugpct ~ block + sow + harvest, data=beets)
sug.h < update(sug, .~. harvest)
sug.s < update(sug, .~. sow)
anova(sug, sug.h)
PBmodcomp(sug, sug.h, nsim=50)
anova(sug, sug.s)
PBmodcomp(sug, sug.s, nsim=50)
## Generalized linear model
counts < c(18,17,15,20,10,20,25,13,12)
outcome < gl(3,1,9)
treatment < gl(3,3)
d.AD < data.frame(treatment, outcome, counts)
head(d.AD)
glm.D93 < glm(counts ~ outcome + treatment, family = poisson())
glm.D93.o < update(glm.D93, .~. outcome)
glm.D93.t < update(glm.D93, .~. treatment)
anova(glm.D93, glm.D93.o, test="Chisq")
PBmodcomp(glm.D93, glm.D93.o, nsim=50)
anova(glm.D93, glm.D93.t, test="Chisq")
PBmodcomp(glm.D93, glm.D93.t, nsim=50)
## Generalized linear mixed model (it takes a while to fit these)
## Not run:
(gm1 < glmer(cbind(incidence, size  incidence) ~ period + (1  herd),
data = cbpp, family = binomial))
(gm2 < update(gm1, .~.period))
anova(gm1, gm2)
PBmodcomp(gm1, gm2)
## End(Not run)
## Not run:
(fmLarge < lmer(Reaction ~ Days + (DaysSubject), sleepstudy))
## removing Days
(fmSmall < lmer(Reaction ~ 1 + (DaysSubject), sleepstudy))
anova(fmLarge, fmSmall)
PBmodcomp(fmLarge, fmSmall)
## The same test using a restriction matrix
L < cbind(0,1)
PBmodcomp(fmLarge, L)
## Vanilla
PBmodcomp(beet0, beet_no.harv, nsim=1000)
## Simulate reference distribution separately:
refdist < PBrefdist(beet0, beet_no.harv, nsim=1000)
PBmodcomp(beet0, beet_no.harv, ref=refdist)
## Do computations with multiple processors:
## Number of cores:
(nc < detectCores())
## Create clusters
cl < makeCluster(rep("localhost", nc))
## Then do:
PBmodcomp(beet0, beet_no.harv, cl=cl)
## Or in two steps:
refdist < PBrefdist(beet0, beet_no.harv, nsim=1000, cl=cl)
PBmodcomp(beet0, beet_no.harv, ref=refdist)
## It is recommended to stop the clusters before quitting R:
stopCluster(cl)
## End(Not run)

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.