mice.impute.ml.lmer: Multilevel Imputation Using 'lme4'

View source: R/mice.impute.ml.lmer.R

mice.impute.ml.lmerR Documentation

Multilevel Imputation Using lme4

Description

This function is a general imputation function based on the linear mixed effects model as implemented in lme4::lmer. The imputation model can be hierarchical or non-hierarchical and can be written in a general form \bold{y}=\bold{X} \bold{\beta} + \sum_{v=1}^V \bold{Z}_v \bold{u}_v for V multivariate random effects. While predictors can be selected by specifying the rows in the predictor matrix in mice::mice (i.e., modification of type), the level of random effects can be specified with levels_id and random slopes can be selected with random_slopes.

The function mice.impute.ml.lmer allows the imputation of variables at arbitrary levels. The corresponding level can be specified with levels_id. All predictor variables are aggregated to the corresponding level of the variable to be imputed.

Several strategies for the specification of the design matrix \bold{X} are accommodated. By default, predictors at a lower level are automatically aggregated to the higher level and included as further predictors to maintain the multilevel structure in the data (Grund, Luedtke & Robitzsch, 2018; Enders, Mistler & Keller, 2016; argument aggregate_automatically=TRUE). Further, interactions and quadratic effects can be defined by respective arguments interactions and quadratics. The dimension of the matrix of predictors can be reduced by applying partial least squares regression, see mice.impute.pls.

The function now only allows continuous data (model="continuous"), ordinal data (model="pmm") or binary data (model="pmm" or model="binary"). Nominal variables with missing values cannot (yet) be handled.

Usage

mice.impute.ml.lmer(y, ry, x, type, levels_id, variables_levels=NULL,
    random_slopes=NULL, aggregate_automatically=TRUE, intercept=TRUE,
    groupcenter.slope=FALSE, draw.fixed=TRUE, random.effects.shrinkage=1e-06,
    glmer.warnings=TRUE, model="continuous", donors=3, match_sampled_pars=FALSE,
    blme_use=FALSE, blme_args=NULL, pls.facs=0, interactions=NULL,
    quadratics=NULL, min.int.cor=0, min.all.cor=0, pls.print.progress=FALSE,
    group_index=NULL, iter_re=0, ...)

Arguments

y

Incomplete data vector of length n

ry

Vector of missing data pattern (FALSE – missing, TRUE – observed)

x

Matrix (n \times p) of complete predictors.

type

Predictor variables associated with fixed effects.

levels_id

Specification of the level identifiers (see Examples)

variables_levels

Specification of the level of variables (see Examples)

random_slopes

Specification of random slopes (see Examples)

aggregate_automatically

Logical indicating whether aggregated effects at higher levels are automatically included.

intercept

Optional logical indicating whether the intercept should be included.

groupcenter.slope

Optional logical indicating whether covariates should be centered around group means

draw.fixed

Optional logical indicating whether fixed effects parameter should be randomly drawn

random.effects.shrinkage

Shrinkage parameter for stabilizing the covariance matrix of random effects

glmer.warnings

Optional logical indicating whether warnings from glmer should be displayed

model

Type of model. Can be "continuous" for normally distributed data, "binary" for dichotomous data specifying a logistic mixed effects model and "pmm" for predictive mean matching.

donors

Number of donors used for predictive mean matching

match_sampled_pars

Logical indicating whether values of nearest neighbors should also be sampled in pmm imputation.

blme_use

Logical indicating whether the blme package should be used.

blme_args

(Prior) Arguments for blme, see blme::blmer and blme::bmerDist-class.

pls.facs

Number of factors used in PLS dimension reduction

interactions

Specification of predictors with interaction effects

quadratics

Specification of predictors with quadratic effects

min.int.cor

Minimum absolute value of correlation with outcome for interaction effects to be retained

min.all.cor

Minimum absolute value of correlation with outcome for predictors to be retained

pls.print.progress

Logical indicating whether progress of algorithm should be displayed

group_index

Optional vector for group identifiers (internally used in mice.impute.bygroup

iter_re

Number of iterations for sampling random effects in random intercept models for continuous outcomes. Using iter_re>0 is necessary for cross-classified models with not fully balanced designs.

...

Further arguments to be passed

Value

Vector of imputed values

References

Enders, C. K., Mistler, S. A., & Keller, B. T. (2016). Multilevel multiple imputation: A review and evaluation of joint modeling and chained equations imputation. Psychological Methods, 21(2), 222-240. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1037/met0000063")}

Grund, S., Luedtke, O., & Robitzsch, A. (2018). Multiple imputation of multilevel data in organizational research. Organizational Research Methods, 21(1), 111-149. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1177/1094428117703686")}

See Also

See mice.impute.2l.continuous for two-level imputation in mice and for several links to other packages which enable multilevel imputation.

Examples

## Not run: 
#############################################################################
# EXAMPLE 1: Imputation of three-level data with normally distributed residuals
#############################################################################

data(data.ma07, package="miceadds")
dat <- data.ma07

# variables at level 1 (identifier id1): x1 (some missings), x2 (complete)
# variables at level 2 (identifier id2): y1 (some missings), y2 (complete)
# variables at level 3 (identifier id3): z1 (some missings), z2 (complete)

#****************************************************************************
# Imputation model 1

#----- specify levels of variables (only relevent for variables
#      with missing values)
variables_levels <- miceadds:::mice_imputation_create_type_vector( colnames(dat), value="")
 # leave variables at lowest level blank (i.e., "")
variables_levels[ c("y1","y2") ] <- "id2"
variables_levels[ c("z1","z2") ] <- "id3"

#----- specify predictor matrix
predmat <- mice::make.predictorMatrix(data=dat)
predmat[, c("id2", "id3") ] <- 0
# set -2 for cluster identifier for level 3 variable z1
# because "2lonly" function is used
predmat[ "z1", "id3" ] <- -2

#----- specify imputation methods
method <- mice::make.method(data=dat)
method[c("x1","y1")] <- "ml.lmer"
method[c("z1")] <- "2lonly.norm"

#----- specify hierarchical structure of imputation models
levels_id <- list()
#** hierarchical structure for variable x1
levels_id[["x1"]] <- c("id2", "id3")
#** hierarchical structure for variable y1
levels_id[["y1"]] <- c("id3")

#----- specify random slopes
random_slopes <- list()
#** random slopes for variable x1
random_slopes[["x1"]] <- list( "id2"=c("x2"), "id3"=c("y1") )
# if no random slopes should be specified, the corresponding entry can be left empty
# and only a random intercept is used in the imputation model

#----- imputation in mice
imp1 <- mice::mice( dat, maxit=10, m=5, method=method,
            predictorMatrix=predmat, levels_id=levels_id,  random_slopes=random_slopes,
            variables_levels=variables_levels )
summary(imp1)

#****************************************************************************
# Imputation model 2

#----- impute x1 with predictive mean matching and y1 with normally distributed residuals
model <- list(x1="pmm", y1="continuous")

#----- assume only random intercepts
random_slopes <- NULL

#---- create interactions with z2 for all predictors in imputation models for x1 and y1
interactions <- list("x1"="z2", "y1"="z2")

#----- imputation in mice
imp2 <- mice::mice( dat, method=method, predictorMatrix=predmat,
                levels_id=levels_id, random_slopes=random_slopes,
                variables_levels=variables_levels, model=model, interactions=interactions)
summary(imp2)

## End(Not run)

miceadds documentation built on May 29, 2024, 11:05 a.m.