# gamm: Generalized Additive Mixed Models In mgcv: Mixed GAM Computation Vehicle with Automatic Smoothness Estimation

## Description

Fits the specified generalized additive mixed model (GAMM) to data, by a call to `lme` in the normal errors identity link case, or by a call to `gammPQL` (a modification of `glmmPQL` from the `MASS` library) otherwise. In the latter case estimates are only approximately MLEs. The routine is typically slower than `gam`, and not quite as numerically robust.

To use `lme4` in place of `nlme` as the underlying fitting engine, see `gamm4` from package `gamm4`.

Smooths are specified as in a call to `gam` as part of the fixed effects model formula, but the wiggly components of the smooth are treated as random effects. The random effects structures and correlation structures available for `lme` are used to specify other random effects and correlations.

It is assumed that the random effects and correlation structures are employed primarily to model residual correlation in the data and that the prime interest is in inference about the terms in the fixed effects model formula including the smooths. For this reason the routine calculates a posterior covariance matrix for the coefficients of all the terms in the fixed effects formula, including the smooths.

To use this function effectively it helps to be quite familiar with the use of `gam` and `lme`.

## Usage

 ```1 2 3 4 5``` ```gamm(formula,random=NULL,correlation=NULL,family=gaussian(), data=list(),weights=NULL,subset=NULL,na.action,knots=NULL, control=list(niterEM=0,optimMethod="L-BFGS-B",returnObject=TRUE), niterPQL=20,verbosePQL=TRUE,method="ML",drop.unused.levels=TRUE, mustart=NULL, etastart=NULL,...) ```

## Arguments

 `formula` A GAM formula (see also `formula.gam` and `gam.models`). This is like the formula for a `glm` except that smooth terms (`s` and `te`) can be added to the right hand side of the formula. Note that `id`s for smooths and fixed smoothing parameters are not supported. `random` The (optional) random effects structure as specified in a call to `lme`: only the `list` form is allowed, to facilitate manipulation of the random effects structure within `gamm` in order to deal with smooth terms. See example below. `correlation` An optional `corStruct` object (see `corClasses`) as used to define correlation structures in `lme`. Any grouping factors in the formula for this object are assumed to be nested within any random effect grouping factors, without the need to make this explicit in the formula (this is slightly different to the behaviour of `lme`). This is a GEE approach to correlation in the generalized case. See examples below. `family` A `family` as used in a call to `glm` or `gam`. The default `gaussian` with identity link causes `gamm` to fit by a direct call to `lme` provided there is no offset term, otherwise `gammPQL` is used. `data` A data frame or list containing the model response variable and covariates required by the formula. By default the variables are taken from `environment(formula)`, typically the environment from which `gamm` is called. `weights` In the generalized case, weights with the same meaning as `glm` weights. An `lme` type weights argument may only be used in the identity link gaussian case, with no offset (see documentation for `lme` for details of how to use such an argument). `subset` an optional vector specifying a subset of observations to be used in the fitting process. `na.action` a function which indicates what should happen when the data contain ‘NA’s. The default is set by the ‘na.action’ setting of ‘options’, and is ‘na.fail’ if that is unset. The “factory-fresh” default is ‘na.omit’. `knots` this is an optional list containing user specified knot values to be used for basis construction. Different terms can use different numbers of knots, unless they share a covariate. `control` A list of fit control parameters for `lme` to replace the defaults returned by `lmeControl`. Note the setting for the number of EM iterations used by `lme`: smooths are set up using custom `pdMat` classes, which are currently not supported by the EM iteration code. If you supply a list of control values, it is advisable to include `niterEM=0`, as well, and only increase from 0 if you want to perturb the starting values used in model fitting (usually to worse values!). The `optimMethod` option is only used if your version of R does not have the `nlminb` optimizer function. `niterPQL` Maximum number of PQL iterations (if any). `verbosePQL` Should PQL report its progress as it goes along? `method` Which of `"ML"` or `"REML"` to use in the Gaussian additive mixed model case when `lme` is called directly. Ignored in the generalized case (or if the model has an offset), in which case `gammPQL` is used. `drop.unused.levels` by default unused levels are dropped from factors before fitting. For some smooths involving factor variables you might want to turn this off. Only do so if you know what you are doing. `mustart` starting values for mean if PQL used. `etastart` starting values for linear predictor if PQL used (over-rides `mustart` if supplied). `...` further arguments for passing on e.g. to `lme`

## Details

The Bayesian model of spline smoothing introduced by Wahba (1983) and Silverman (1985) opens up the possibility of estimating the degree of smoothness of terms in a generalized additive model as variances of the wiggly components of the smooth terms treated as random effects. Several authors have recognised this (see Wang 1998; Ruppert, Wand and Carroll, 2003) and in the normal errors, identity link case estimation can be performed using general linear mixed effects modelling software such as `lme`. In the generalized case only approximate inference is so far available, for example using the Penalized Quasi-Likelihood approach of Breslow and Clayton (1993) as implemented in `glmmPQL` by Venables and Ripley (2002). One advantage of this approach is that it allows correlated errors to be dealt with via random effects or the correlation structures available in the `nlme` library (using correlation structures beyond the strictly additive case amounts to using a GEE approach to fitting).

Some details of how GAMs are represented as mixed models and estimated using `lme` or `gammPQL` in `gamm` can be found in Wood (2004 ,2006a,b). In addition `gamm` obtains a posterior covariance matrix for the parameters of all the fixed effects and the smooth terms. The approach is similar to that described in Lin & Zhang (1999) - the covariance matrix of the data (or pseudodata in the generalized case) implied by the weights, correlation and random effects structure is obtained, based on the estimates of the parameters of these terms and this is used to obtain the posterior covariance matrix of the fixed and smooth effects.

The bases used to represent smooth terms are the same as those used in `gam`, although adaptive smoothing bases are not available. Prediction from the returned `gam` object is straightforward using `predict.gam`, but this will set the random effects to zero. If you want to predict with random effects set to their predicted values then you can adapt the prediction code given in the examples below.

In the event of `lme` convergence failures, consider modifying `options(mgcv.vc.logrange)`: reducing it helps to remove indefiniteness in the likelihood, if that is the problem, but too large a reduction can force over or undersmoothing. See `notExp2` for more information on this option. Failing that, you can try increasing the `niterEM` option in `control`: this will perturb the starting values used in fitting, but usually to values with lower likelihood! Note that this version of `gamm` works best with R 2.2.0 or above and `nlme`, 3.1-62 and above, since these use an improved optimizer.

## Value

Returns a list with two items:

 `gam` an object of class `gam`, less information relating to GCV/UBRE model selection. At present this contains enough information to use `predict`, `summary` and `print` methods and `vis.gam`, but not to use e.g. the `anova` method function to compare models. This is based on the working model when using `gammPQL`. `lme` the fitted model object returned by `lme` or `gammPQL`. Note that the model formulae and grouping structures may appear to be rather bizarre, because of the manner in which the GAMM is split up and the calls to `lme` and `gammPQL` are constructed.

## WARNINGS

`gamm` has a somewhat different argument list to `gam`, `gam` arguments such as `gamma` supplied to `gamm` will just be ignored.

`gamm` performs poorly with binary data, since it uses PQL. It is better to use `gam` with `s(...,bs="re")` terms, or `gamm4`.

`gamm` assumes that you know what you are doing! For example, unlike `glmmPQL` from `MASS` it will return the complete `lme` object from the working model at convergence of the PQL iteration, including the 'log likelihood', even though this is not the likelihood of the fitted GAMM.

The routine will be very slow and memory intensive if correlation structures are used for the very large groups of data. e.g. attempting to run the spatial example in the examples section with many 1000's of data is definitely not recommended: often the correlations should only apply within clusters that can be defined by a grouping factor, and provided these clusters do not get too huge then fitting is usually possible.

Models must contain at least one random effect: either a smooth with non-zero smoothing parameter, or a random effect specified in argument `random`.

`gamm` is not as numerically stable as `gam`: an `lme` call will occasionally fail. See details section for suggestions, or try the ‘gamm4’ package.

`gamm` is usually much slower than `gam`, and on some platforms you may need to increase the memory available to R in order to use it with large data sets (see `memory.limit`).

Note that the weights returned in the fitted GAM object are dummy, and not those used by the PQL iteration: this makes partial residual plots look odd.

Note that the `gam` object part of the returned object is not complete in the sense of having all the elements defined in `gamObject` and does not inherit from `glm`: hence e.g. multi-model `anova` calls will not work. It is also based on the working model when PQL is used.

The parameterization used for the smoothing parameters in `gamm`, bounds them above and below by an effective infinity and effective zero. See `notExp2` for details of how to change this.

## Author(s)

Simon N. Wood simon.wood@r-project.org

## References

Breslow, N. E. and Clayton, D. G. (1993) Approximate inference in generalized linear mixed models. Journal of the American Statistical Association 88, 9-25.

Lin, X and Zhang, D. (1999) Inference in generalized additive mixed models by using smoothing splines. JRSSB. 55(2):381-400

Pinheiro J.C. and Bates, D.M. (2000) Mixed effects Models in S and S-PLUS. Springer

Ruppert, D., Wand, M.P. and Carroll, R.J. (2003) Semiparametric Regression. Cambridge

Silverman, B.W. (1985) Some aspects of the spline smoothing approach to nonparametric regression. JRSSB 47:1-52

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

Wahba, G. (1983) Bayesian confidence intervals for the cross validated smoothing spline. JRSSB 45:133-150

Wood, S.N. (2004) Stable and efficient multiple smoothing parameter estimation for generalized additive models. Journal of the American Statistical Association. 99:673-686

Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B 65(1):95-114

Wood, S.N. (2006a) Low rank scale invariant tensor product smooths for generalized additive mixed models. Biometrics 62(4):1025-1036

Wood S.N. (2006b) Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC Press.

Wang, Y. (1998) Mixed effects smoothing spline analysis of variance. J.R. Statist. Soc. B 60, 159-174

`magic` for an alternative for correlated data, `te`, `s`, `predict.gam`, `plot.gam`, `summary.gam`, `negbin`, `vis.gam`,`pdTens`, `gamm4` ( https://cran.r-project.org/package=gamm4)
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139``` ```library(mgcv) ## simple examples using gamm as alternative to gam set.seed(0) dat <- gamSim(1,n=200,scale=2) b <- gamm(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat) plot(b\$gam,pages=1) summary(b\$lme) # details of underlying lme fit summary(b\$gam) # gam style summary of fitted model anova(b\$gam) gam.check(b\$gam) # simple checking plots b <- gamm(y~te(x0,x1)+s(x2)+s(x3),data=dat) op <- par(mfrow=c(2,2)) plot(b\$gam) par(op) rm(dat) ## Add a factor to the linear predictor, to be modelled as random dat <- gamSim(6,n=200,scale=.2,dist="poisson") b2 <- gamm(y~s(x0)+s(x1)+s(x2),family=poisson, data=dat,random=list(fac=~1)) plot(b2\$gam,pages=1) fac <- dat\$fac rm(dat) vis.gam(b2\$gam) ## In the generalized case the 'gam' object is based on the working ## model used in the PQL fitting. Residuals for this are not ## that useful on their own as the following illustrates... gam.check(b2\$gam) ## But more useful residuals are easy to produce on a model ## by model basis. For example... fv <- exp(fitted(b2\$lme)) ## predicted values (including re) rsd <- (b2\$gam\$y - fv)/sqrt(fv) ## Pearson residuals (Poisson case) op <- par(mfrow=c(1,2)) qqnorm(rsd);plot(fv^.5,rsd) par(op) ## now an example with autocorrelated errors.... n <- 200;sig <- 2 x <- 0:(n-1)/(n-1) f <- 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10 e <- rnorm(n,0,sig) for (i in 2:n) e[i] <- 0.6*e[i-1] + e[i] y <- f + e op <- par(mfrow=c(2,2)) ## Fit model with AR1 residuals b <- gamm(y~s(x,k=20),correlation=corAR1()) plot(b\$gam);lines(x,f-mean(f),col=2) ## Raw residuals still show correlation, of course... acf(residuals(b\$gam),main="raw residual ACF") ## But standardized are now fine... acf(residuals(b\$lme,type="normalized"),main="standardized residual ACF") ## compare with model without AR component... b <- gam(y~s(x,k=20)) plot(b);lines(x,f-mean(f),col=2) ## more complicated autocorrelation example - AR errors ## only within groups defined by `fac' e <- rnorm(n,0,sig) for (i in 2:n) e[i] <- 0.6*e[i-1]*(fac[i-1]==fac[i]) + e[i] y <- f + e b <- gamm(y~s(x,k=20),correlation=corAR1(form=~1|fac)) plot(b\$gam);lines(x,f-mean(f),col=2) par(op) ## more complex situation with nested random effects and within ## group correlation set.seed(0) n.g <- 10 n<-n.g*10*4 ## simulate smooth part... dat <- gamSim(1,n=n,scale=2) f <- dat\$f ## simulate nested random effects.... fa <- as.factor(rep(1:10,rep(4*n.g,10))) ra <- rep(rnorm(10),rep(4*n.g,10)) fb <- as.factor(rep(rep(1:4,rep(n.g,4)),10)) rb <- rep(rnorm(4),rep(n.g,4)) for (i in 1:9) rb <- c(rb,rep(rnorm(4),rep(n.g,4))) ## simulate auto-correlated errors within groups e<-array(0,0) for (i in 1:40) { eg <- rnorm(n.g, 0, sig) for (j in 2:n.g) eg[j] <- eg[j-1]*0.6+ eg[j] e<-c(e,eg) } dat\$y <- f + ra + rb + e dat\$fa <- fa;dat\$fb <- fb ## fit model .... b <- gamm(y~s(x0,bs="cr")+s(x1,bs="cr")+s(x2,bs="cr")+ s(x3,bs="cr"),data=dat,random=list(fa=~1,fb=~1), correlation=corAR1()) plot(b\$gam,pages=1) summary(b\$gam) vis.gam(b\$gam) ## Prediction from gam object, optionally adding ## in random effects. ## Extract random effects and make names more convenient... refa <- ranef(b\$lme,level=5) rownames(refa) <- substr(rownames(refa),start=9,stop=20) refb <- ranef(b\$lme,level=6) rownames(refb) <- substr(rownames(refb),start=9,stop=20) ## make a prediction, with random effects zero... p0 <- predict(b\$gam,data.frame(x0=.3,x1=.6,x2=.98,x3=.77)) ## add in effect for fa = "2" and fb="2/4"... p <- p0 + refa["2",1] + refb["2/4",1] ## and a "spatial" example... library(nlme);set.seed(1);n <- 100 dat <- gamSim(2,n=n,scale=0) ## standard example attach(dat) old.par<-par(mfrow=c(2,2)) contour(truth\$x,truth\$z,truth\$f) ## true function f <- data\$f ## true expected response ## Now simulate correlated errors... cstr <- corGaus(.1,form = ~x+z) cstr <- Initialize(cstr,data.frame(x=data\$x,z=data\$z)) V <- corMatrix(cstr) ## correlation matrix for data Cv <- chol(V) e <- t(Cv) %*% rnorm(n)*0.05 # correlated errors ## next add correlated simulated errors to expected values data\$y <- f + e ## ... to produce response b<- gamm(y~s(x,z,k=50),correlation=corGaus(.1,form=~x+z), data=data) plot(b\$gam) # gamm fit accounting for correlation # overfits when correlation ignored..... b1 <- gamm(y~s(x,z,k=50),data=data);plot(b1\$gam) b2 <- gam(y~s(x,z,k=50),data=data);plot(b2) par(old.par) ```