# Methods for MixMod Objects" In GLMMadaptive: Generalized Linear Mixed Models using Adaptive Gaussian Quadrature

```knitr::opts_chunk\$set(
collapse = TRUE,
comment = "#>"
)
library("GLMMadaptive")
```

# MixMod Objects

In this vignette we illustrate the use of a number of basic generic functions for models fitted by the `mixed_model()` function of package GLMMadaptive.

We start by simulating some data for a binary longitudinal outcome:

```set.seed(1234)
n <- 100 # number of subjects
K <- 8 # number of measurements per subject
t_max <- 15 # maximum follow-up time

# we constuct a data frame with the design:
# everyone has a baseline measurment, and then measurements at random follow-up times
DF <- data.frame(id = rep(seq_len(n), each = K),
time = c(replicate(n, c(0, sort(runif(K - 1, 0, t_max))))),
sex = rep(gl(2, n/2, labels = c("male", "female")), each = K))

# design matrices for the fixed and random effects
X <- model.matrix(~ sex * time, data = DF)
Z <- model.matrix(~ time, data = DF)

betas <- c(-2.13, -0.25, 0.24, -0.05) # fixed effects coefficients
D11 <- 0.48 # variance of random intercepts
D22 <- 0.1 # variance of random slopes

# we simulate random effects
b <- cbind(rnorm(n, sd = sqrt(D11)), rnorm(n, sd = sqrt(D22)))
# linear predictor
eta_y <- as.vector(X %*% betas + rowSums(Z * b[DF\$id, ]))
# we simulate binary longitudinal data
DF\$y <- rbinom(n * K, 1, plogis(eta_y))
```

We continue by fitting the mixed effects logistic regression for `y` assuming random intercepts and random slopes for the random-effects part.

```fm <- mixed_model(fixed = y ~ sex * time, random = ~ time | id, data = DF,
family = binomial())
```

## Detailed Output, Confidence Intervals, Covariance Matrix of the MLEs and Robust Standard Errors

As in the majority of model-fitting functions in R, the `print()` and `summary()` methods display a short and a detailed output of the fitted model, respectively. For `'MixMod'` objects we obtain

```fm
```

and

```summary(fm)
```

The output is rather self-explanatory. However, just note that the fixed-effects coefficients are on the linear predictor scale, and hence are the corresponding log-odds for the intercept and log-odds ratios for the rest of the parameters. The `summary()` only shows the estimated coefficients, standard errors and p-values, but no confidence intervals. These can be separately obtained using the `confint()` method, i.e.,

```exp(confint(fm))
```

By default the confidence intervals are produced for the fixed effects. Hence, taking the exp we obtain the confidence intervals for the corresponding odds-ratios. In addition, by default, the level of the confidence intervals is 95%. The following piece of code produces 90% confidence intervals for the variances of the random intercepts and slopes, and for their covariance:

```confint(fm, parm = "var-cov", level = 0.90)
```

The estimated variance-covariance matrix of the maximum likelihood estimates of all parameters is returned using the `vcov()` method, e.g.,

```vcov(fm)
```

The elements of this covariance matrix that correspond to the elements of the covariance matrix of the random effects (i.e., the elements `D_xx`) are on the log-Cholesky scale.

Robust standard errors using the sandwich estimator can be obtained by setting the logical argument `sandwich` to `TRUE`, i.e.,

```vcov(fm, sandwich = TRUE)
```

The use of robust standard errors via the `sandwich` argument is also available in the `summary()`, `confint()`, `anova()`, `marginal_coefs()`, `effectPlotData()`, `predict()`, and `simulate()` methods.

## Fixed and Random Effects Estimates

To extract the estimated fixed effects coefficients from a fitted mixed model, we can use the `fixef()` method. Similarly, the empirical Bayes estimates of the random effects are extracted using the `ranef()` method, and finally the `coef()` method returns the subject-specific coefficients, i.e., the sum of the fixed and random effects coefficients:

```fixef(fm)
```
```head(ranef(fm))
```
```head(coef(fm))
```

## Marginalized Coefficients

The fixed effects estimates in mixed models with nonlinear link functions have an interpretation conditional on the random effects. However, often we wish to obtain parameters with a marginal / population averaged interpretation, which leads many researchers to use generalized estimating equations, and dealing with potential issues with missing data. Nonetheless, recently Hedeker et al. have proposed a nice solution to this problem. Their approach is implemented in function `marginal_coefs()`. For example, for model `fm` we obtain the marginalized coefficients using:

```marginal_coefs(fm)
```

The function calculates the marginal log odds ratios in our case (because we have a binary outcome) using a Monte Carlo procedure with number of samples determined by the `M` argument.

Standard errors for the marginalized coefficients are obtained by setting `std_errors = TRUE` in the call to `marginal_coefs()`, and require a double Monte Carlo procedure for which argument `K` comes also into play. To speed up computations, the outer Monte Carlo procedure is performed in parallel using package parallel and number of cores specified in the `cores` argument (results not shown):

```marginal_coefs(fm, std_errors = TRUE, cores = 5)
```

## Fitted Values and Residuals

The `fitted()` method extracts fitted values from the fitted mixed model. These are always on the scale of the response variable. The `type` argument of `fitted()` specifies the type of fitted values computed. The default is `type = "mean_subject"` which corresponds to the fitted values calculated using only the fixed-effects part of the linear predictor; hence, for the subject who has random effects values equal to 0, i.e., the "mean subject":

```head(fitted(fm))
```

Setting `type = "subject_specific"` will calculate the fitted values using both the fixed and random effects parts, where for the latter the empirical Bayes estimates of the random effects are used:

```head(fitted(fm, type = "subject_specific"))
```

Finally, setting `type = "marginal"` will calculate the fitted values based on the multiplication of the fixed-effects design matrix with the marginalized coefficients described above (due to the required computing time, these fitted values are not displayed):

```head(fitted(fm, type = "marginal"))
```

The `residuals()` method simply calculates the residuals by subtracting the fitted values from the observed repeated measurements outcome. Hence, this method also has a `type` argument with exactly the same options as the `fitted()` method.

## Effect Plots

To display the estimated longitudinal evolutions of the binary outcome we can use an effect plot. This is simply predictions from the models with the corresponding 95% pointwise confidence intervals.

As a first step we create a data frame the provides the setting based on which the plot is to be produced; function `expand.grid()` is helpful in this regard:

```nDF <- with(DF, expand.grid(time = seq(min(time), max(time), length.out = 15),
sex = levels(sex)))
```

Next we use the `effectPlotData()` function that does the heavy lifting, i.e., calculates the predictions and confidence intervals from a fitted mixed model for the data frame provided above, i.e.,

```plot_data <- effectPlotData(fm, nDF)
```

Then we can produce the plot using for example the `xyplot()` function from package lattice, e.g.,

```library("lattice")
xyplot(pred + low + upp ~ time | sex, data = plot_data,
type = "l", lty = c(1, 2, 2), col = c(2, 1, 1), lwd = 2,
xlab = "Follow-up time", ylab = "log odds")

expit <- function (x) exp(x) / (1 + exp(x))
xyplot(expit(pred) + expit(low) + expit(upp) ~ time | sex, data = plot_data,
type = "l", lty = c(1, 2, 2), col = c(2, 1, 1), lwd = 2,
xlab = "Follow-up time", ylab = "Subject-Specific Probabilities")
```

The `effectPlotData()` function also allows to compute marginal predictions using the marginalized coefficients described above. This is achieved by setting `marginal = TRUE` in the respective call (results not shown):

```plot_data_m <- effectPlotData(fm, nDF, marginal = TRUE)

# we put the two groups in the same panel
my.panel.bands <- function(x, y, upper, lower, fill, col, subscripts, ..., font,
fontface) {
upper <- upper[subscripts]
lower <- lower[subscripts]
panel.polygon(c(x, rev(x)), c(upper, rev(lower)), col = fill, border = FALSE, ...)
}

xyplot(expit(pred) ~ time, group = sex, data = plot_data_m,
upper = expit(plot_data_m\$upp), low = expit(plot_data_m\$low),
type = "l", col = c("blue", "red"),
fill = c("#0000FF80", "#FF000080"),
panel = function (x, y, ...) {
panel.superpose(x, y, panel.groups = my.panel.bands, ...)
panel.xyplot(x, y, lwd = 2,  ...)
}, xlab = "Follow-up time", ylab = "Marginal Probabilities")
```

## Effect Plots using the effects package

In addition to using the `effectPlotData()` function, the same type of effect plots can be produced by the effects package. For example, based on the `fm` model we produce the effect plot for `time` and for the two groups using the call to `predictorEffect()` and its `plot()` method:

```library("effects")
plot(predictorEffect("time", fm), type = "link")
```

The `type` argument controls the scale of the y-axis, namely, `type = "link"` corresponds to the log-odds scale. If we would like to obtain the figure in the probability scale, we should set `type = "response"`, i.e.,

```plot(predictorEffect("time", fm), type = "response")
```

For the additional functionality provided by the effects package, check the vignette: `vignette("predictor-effects-gallery", package = "effects")`.

Note: Effects plots via the effects package are currently only supported for the `binomial()` and `poisson()` families.

## Comparing Two Models

The `anova()` method can be used to compare two fitted mixed models using a likelihood ratio test. For example, we test if we can test the null hypothesis that the covariance between the random intercepts and slopes is equal to zero using:

```gm <- mixed_model(fixed = y ~ sex * time, random = ~ time || id, data = DF,
family = binomial())

anova(gm, fm)
#>
#>       AIC    BIC log.Lik  LRT df p.value
#> gm 730.94 746.57 -359.47
#> fm 731.66 731.66 -358.83 1.29  1  0.2564
```

## Predictions

Using the `predict()` method we can calculate predictions for new subjects. As an example, we treat subject 1 from the `DF` dataset as a new patient (in the code below we change his `id` variable):

```pred_DF <- DF[DF\$id == 1, ][1:4, ]
pred_DF\$id <- paste0("N", as.character(pred_DF\$id))
pred_DF
```

We start by computing predictions based only on the fixed-effects part of the model; because of the nonlinear link function, these predictions are of subjects with random effects value equal to zero, which is not to the average predictions:

```predict(fm, newdata = pred_DF, type_pred = "response",
type = "mean_subject", se.fit = TRUE)
```

Population averaged predictions can be obtained by first calculating the marginalized coefficients (using `marginal_coefs()`) and multiplying them with the fixed effects design matrix; this is achieved using the option `type = "marginal"` (due to the required computing time, predictions not shown):

```predict(fm, newdata = pred_DF, type_pred = "response",
type = "marginal", se.fit = FALSE)
```

Finally, we calculate subject-specific predictions; the standard errors are calculated with a Monte Carlo scheme (for details check the online help file):

```predict(fm, newdata = pred_DF, type_pred = "response",
type = "subject_specific", se.fit = TRUE)
```

Suppose now that we want predictions at time points at which no responses `y` have been recorded, e.g.,

```future_Times <- pred_DF[1:3, c("id", "time", "sex")]
future_Times\$time <- c(3, 4, 10)
future_Times
```

Predictions at these time points can be calculated by provide this data frame in the `newdata2` argument of `predict()`:

```predict(fm, newdata = pred_DF, newdata2 = future_Times, type_pred = "response",
type = "subject_specific", se.fit = TRUE)
```

## Simulate

The `simulate()` method can be used to simulate response outcome data from a fitted mixed model. For example, we simulate two realization of our dichotomous outcome:

```head(simulate(fm, nsim = 2, seed = 123), 10)
```

By setting `acount_MLEs_var = TRUE` in the call to `simulate()` we also account for the variability in the maximum likelihood estimates in the simulation of new responses. This is achieved by simulating each time a new response vector using a realization for the parameters from a multivariate normal distribution with mean the MLEs and covariance matrix the covariance matrix of the MLEs:

```head(simulate(fm, nsim = 2, seed = 123, acount_MLEs_var = TRUE), 10)
```

## Try the GLMMadaptive package in your browser

Any scripts or data that you put into this service are public.

GLMMadaptive documentation built on May 2, 2019, 2:51 p.m.