glmmTMB: Fit Models with TMB

Description Usage Arguments Details Note References Examples

View source: R/glmmTMB.R

Description

Fit a generalized linear mixed model (GLMM) using Template Model Builder (TMB).

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
glmmTMB(
  formula,
  data = NULL,
  family = gaussian(),
  ziformula = ~0,
  dispformula = ~1,
  weights = NULL,
  offset = NULL,
  contrasts = NULL,
  na.action = na.fail,
  se = TRUE,
  verbose = FALSE,
  doFit = TRUE,
  control = glmmTMBControl(),
  REML = FALSE,
  start = NULL,
  map = NULL,
  sparseX = NULL
)

Arguments

formula

combined fixed and random effects formula, following lme4 syntax.

data

optional data frame containing model variables.

family

a family function, a character string naming a family function, or the result of a call to a family function (variance/link function) information. See family for a generic discussion of families or family_glmmTMB for details of glmmTMB-specific families.

ziformula

a one-sided (i.e., no response variable) formula for zero-inflation combining fixed and random effects: the default ~0 specifies no zero-inflation. Specifying ~. sets the zero-inflation formula identical to the right-hand side of formula (i.e., the conditional effects formula); terms can also be added or subtracted. When using ~. as the zero-inflation formula in models where the conditional effects formula contains an offset term, the offset term will automatically be dropped. The zero-inflation model uses a logit link.

dispformula

a one-sided formula for dispersion containing only fixed effects: the default ~1 specifies the standard dispersion given any family. The argument is ignored for families that do not have a dispersion parameter. For an explanation of the dispersion parameter for each family, see sigma. The dispersion model uses a log link. In Gaussian mixed models, dispformula=~0 fixes the residual variance to be 0 (actually a small non-zero value: at present it is set to sqrt(.Machine$double.eps)), forcing variance into the random effects.

weights

weights, as in glm. Not automatically scaled to have sum 1.

offset

offset for conditional model (only).

contrasts

an optional list, e.g., list(fac1="contr.sum"). See the contrasts.arg of model.matrix.default.

na.action

how to handle missing values, see na.action and model.frame. From lm: “The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit.”

se

whether to return standard errors.

verbose

whether progress indication should be printed to the console.

doFit

whether to fit the full model, or (if FALSE) return the preprocessed data and parameter objects, without fitting the model.

control

control parameters, see glmmTMBControl.

REML

whether to use REML estimation rather than maximum likelihood.

start

starting values, expressed as a list with possible components beta, betazi, betad (fixed-effect parameters for conditional, zero-inflation, dispersion models); b, bzi (conditional modes for conditional and zero-inflation models); theta, thetazi (random-effect parameters, on the standard deviation/Cholesky scale, for conditional and z-i models); thetaf (extra family parameters, e.g., shape for Tweedie models).

map

a list specifying which parameter values should be fixed to a constant value rather than estimated. map should be a named list containing factors corresponding to a subset of the internal parameter names (see start parameter). Distinct factor values are fitted as separate parameter values, NA values are held fixed: e.g., map=list(beta=factor(c(1,2,3,NA))) would fit the first three fixed-effect parameters of the conditional model and fix the fourth parameter to its starting value. In general, users will probably want to use start to specify non-default starting values for fixed parameters. See MakeADFun for more details.

sparseX

a named logical vector containing (possibly) elements named "cond", "zi", "disp" to indicate whether fixed-effect model matrices for particular model components should be generated as sparse matrices, e.g. c(cond=TRUE). Default is all FALSE

Details

Binomial models with more than one trial (i.e., not binary/Bernoulli) can either be specified in the form prob ~ ..., weights = N, or in the more typical two-column matrix cbind(successes,failures)~... form.

Behavior of REML=TRUE for Gaussian responses matches lme4::lmer. It may also be useful in some cases with non-Gaussian responses (Millar 2011). Simulations should be done first to verify.

Because the df.residual method for glmmTMB currently counts the dispersion parameter, one would need to multiply by sqrt(nobs(fit) / (1+df.residual(fit))) when comparing with lm.

By default, vector-valued random effects are fitted with unstructured (general positive definite) variance-covariance matrices. Structured variance-covariance matrices can be specified in the form struc(terms|group), where struc is one of

Structures marked with * are experimental/untested.

For backward compatibility, the family argument can also be specified as a list comprising the name of the distribution and the link function (e.g. list(family="binomial", link="logit")). However, this alternative is now deprecated; it produces a warning and will be removed at some point in the future. Furthermore, certain capabilities such as Pearson residuals or predictions on the data scale will only be possible if components such as variance and linkfun are present, see family.

Note

For more information about the glmmTMB package, see Brooks et al. (2017) and the vignette(package="glmmTMB") collection. For the underlying TMB package that performs the model estimation, see Kristensen et al. (2016).

References

Brooks, M. E., Kristensen, K., van Benthem, K. J., Magnusson, A., Berg, C. W., Nielsen, A., Skaug, H. J., Mächler, M. and Bolker, B. M. (2017). glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. The R Journal, 9(2), 378–400.

Kristensen, K., Nielsen, A., Berg, C. W., Skaug, H. and Bell, B. (2016). TMB: Automatic differentiation and Laplace approximation. Journal of Statistical Software, 70, 1–21.

Millar, R. B. (2011). Maximum Likelihood Estimation and Inference: With Examples in R, SAS and ADMB. Wiley, New York.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
(m1 <- glmmTMB(count ~ mined + (1|site),
  zi=~mined,
  family=poisson, data=Salamanders))
summary(m1)

## Zero-inflated negative binomial model
(m2 <- glmmTMB(count ~ spp + mined + (1|site),
  zi=~spp + mined,
  family=nbinom2, data=Salamanders))

## Hurdle Poisson model
(m3 <- glmmTMB(count ~ spp + mined + (1|site),
  zi=~spp + mined,
  family=truncated_poisson, data=Salamanders))

## Binomial model
data(cbpp, package="lme4")
(bovine <- glmmTMB(cbind(incidence, size-incidence) ~ period + (1|herd),
  family=binomial, data=cbpp))

## Dispersion model
sim1 <- function(nfac=40, nt=100, facsd=0.1, tsd=0.15, mu=0, residsd=1)
{
  dat <- expand.grid(fac=factor(letters[1:nfac]), t=1:nt)
  n <- nrow(dat)
  dat$REfac <- rnorm(nfac, sd=facsd)[dat$fac]
  dat$REt <- rnorm(nt, sd=tsd)[dat$t]
  dat$x <- rnorm(n, mean=mu, sd=residsd) + dat$REfac + dat$REt
  dat
}
set.seed(101)
d1 <- sim1(mu=100, residsd=10)
d2 <- sim1(mu=200, residsd=5)
d1$sd <- "ten"
d2$sd <- "five"
dat <- rbind(d1, d2)
m0 <- glmmTMB(x ~ sd + (1|t), dispformula=~sd, data=dat)
fixef(m0)$disp
c(log(5^2), log(10^2)-log(5^2)) # expected dispersion model coefficients


## Using 'map' to fix random-effects SD to 10
m1_map <- update(m1, map=list(theta=factor(NA)),
                 start=list(theta=log(10)))
VarCorr(m1_map)

Example output

sh: 1: cannot create /dev/null: Permission denied
Formula:          count ~ mined + (1 | site)
Zero inflation:         ~mined
Data: Salamanders
      AIC       BIC    logLik  df.resid 
1908.4695 1930.8080 -949.2348       639 
Random-effects (co)variances:

Conditional model:
 Groups Name        Std.Dev.
 site   (Intercept) 0.28    

Number of obs: 644 / Conditional model: site, 23

Fixed Effects:

Conditional model:
(Intercept)      minedno  
     0.0879       1.1419  

Zero-inflation model:
(Intercept)      minedno  
      1.139       -1.736  
 Family: poisson  ( log )
Formula:          count ~ mined + (1 | site)
Zero inflation:         ~mined
Data: Salamanders

     AIC      BIC   logLik deviance df.resid 
  1908.5   1930.8   -949.2   1898.5      639 

Random effects:

Conditional model:
 Groups Name        Variance Std.Dev.
 site   (Intercept) 0.07843  0.28    
Number of obs: 644, groups:  site, 23

Conditional model:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   0.0879     0.2329   0.377    0.706    
minedno       1.1419     0.2462   4.639  3.5e-06 ***
---
Signif. codes:  0***0.001**0.01*0.05.’ 0.1 ‘ ’ 1

Zero-inflation model:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   1.1393     0.2351   4.846 1.26e-06 ***
minedno      -1.7361     0.2620  -6.626 3.44e-11 ***
---
Signif. codes:  0***0.001**0.01*0.05.’ 0.1 ‘ ’ 1
Formula:          count ~ spp + mined + (1 | site)
Zero inflation:         ~spp + mined
Data: Salamanders
      AIC       BIC    logLik  df.resid 
1670.2990 1750.7176 -817.1495       626 
Random-effects (co)variances:

Conditional model:
 Groups Name        Std.Dev.
 site   (Intercept) 0.3799  

Number of obs: 644 / Conditional model: site, 23

Overdispersion parameter for nbinom2 family (): 1.51 

Fixed Effects:

Conditional model:
(Intercept)        sppPR        sppDM      sppEC-A      sppEC-L     sppDES-L  
    -0.6104      -0.9637       0.1707      -0.3871       0.4879       0.5895  
      sppDF      minedno  
    -0.1133       1.4294  

Zero-inflation model:
(Intercept)        sppPR        sppDM      sppEC-A      sppEC-L     sppDES-L  
     0.9100       1.1614      -0.9393       1.0424      -0.5623      -0.8930  
      sppDF      minedno  
    -2.5398      -2.5630  
Formula:          count ~ spp + mined + (1 | site)
Zero inflation:         ~spp + mined
Data: Salamanders
      AIC       BIC    logLik  df.resid 
1790.1669 1866.1177 -878.0834       627 
Random-effects (co)variances:

Conditional model:
 Groups Name        Std.Dev.
 site   (Intercept) 0.2306  

Number of obs: 644 / Conditional model: site, 23

Fixed Effects:

Conditional model:
(Intercept)        sppPR        sppDM      sppEC-A      sppEC-L     sppDES-L  
   -0.06702     -0.52093      0.22458     -0.19548      0.64672      0.60514  
      sppDF      minedno  
    0.04602      1.01447  

Zero-inflation model:
(Intercept)        sppPR        sppDM      sppEC-A      sppEC-L     sppDES-L  
     1.7556       1.6785      -0.4269       1.1046      -0.4269      -0.6716  
      sppDF      minedno  
    -0.4269      -2.4038  
Formula:          cbind(incidence, size - incidence) ~ period + (1 | herd)
Data: cbpp
      AIC       BIC    logLik  df.resid 
194.05256 204.17932 -92.02628        51 
Random-effects (co)variances:

Conditional model:
 Groups Name        Std.Dev.
 herd   (Intercept) 0.6423  

Number of obs: 56 / Conditional model: herd, 15

Fixed Effects:

Conditional model:
(Intercept)      period2      period3      period4  
    -1.3985      -0.9923      -1.1287      -1.5803  
(Intercept)       sdten 
   3.202870    1.386476 
[1] 3.218876 1.386294

Conditional model:
 Groups Name        Std.Dev.
 site   (Intercept) 10      

glmmTMB documentation built on July 8, 2020, 5:06 p.m.