Nothing
### Terrence D. Jorgensen & Yves Rosseel
### Last updated: 30 May 2024
### Pooled Wald test for multiple imputations
### Borrowed source code from lavaan/R/lav_test_Wald.R
##' Wald Test for Multiple Imputations
##'
##' Wald test for testing a linear hypothesis about the parameters of lavaan
##' models fitted to multiple imputed data sets. Statistics for constraining
##' one or more free parameters in a model can be calculated from the pooled
##' point estimates and asymptotic covariance matrix of model parameters
##' using Rubin's (1987) rules, or by pooling the Wald test statistics
##' across imputed data sets (Li, Meng, Raghunathan, & Rubin, 1991).
##'
##' The constraints are specified using the `"=="` operator.
##' Both the left-hand side and the right-hand side of the equality can contain
##' a linear combination of model parameters, or a constant (like zero).
##' The model parameters must be specified by their user-specified labels from
##' the \code{link[lavaan]{model.syntax}}. Names of defined parameters
##' (using the ":=" operator) can be included too.
##'
##' @aliases lavTestWald.mi
##' @importFrom lavaan parTable lavListInspect
##' @importFrom stats pchisq pf
##'
##' @param object An object of class [lavaan.mi-class].
##' @param constraints A `character` string (typically between single
##' quotes) containing one or more equality constraints.
##' See examples for more details
##' @param pool.method `character` indicating which pooling method to use.
##' `"D1"` or `"Rubin"` (default) indicates Rubin's (1987) rules
##' will be applied to the point estimates and the asymptotic covariance
##' matrix of model parameters, and those pooled values will be used to
##' calculate the Wald test in the usual manner. `"D2"`, `"LMRR"`,
##' or `"Li.et.al"` indicate that the complete-data Wald test statistic
##' should be calculated using each imputed data set, which will then be
##' pooled across imputations, as described in Li, Meng, Raghunathan, & Rubin
##' (1991) and Enders (2010, chapter 8).
##' @param asymptotic `logical`. If `FALSE` (default), the pooled test
##' will be returned as an *F*-distributed statistic with numerator
##' (`df1`) and denominator (`df2`) degrees of freedom.
##' If `TRUE`, the pooled *F* statistic will be multiplied by its
##' `df1` on the assumption that its `df2` is sufficiently large
##' enough that the statistic will be asymptotically \eqn{\chi^2} distributed
##' with `df1`.
##' @param scale.W `logical`. If `FALSE`, the pooled
##' asymptotic covariance matrix of model parameters is calculated as the
##' weighted sum of the within-imputation and between-imputation components.
##' Otherwise, the pooled asymptotic covariance matrix of model parameters is
##' calculated by scaling the within-imputation component by the
##' average relative increase in variance (ARIV; see Enders, 2010, p. 235),
##' which is *only* consistent when requesting the *F* test (i.e.,
##' `asymptotic = FALSE`. Ignored (irrelevant) if `pool.method = "D2"`.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. Specific imputation numbers can also be included in this
##' argument, in case users want to apply their own custom omission criteria
##' (or simulations can use different numbers of imputations without
##' redundantly refitting the model).
##' @param verbose `logical`. If `TRUE`, print the restriction
##' matrix and the estimated restricted values.
##' @param warn `logical`. If `TRUE`, print warnings if they occur.
##'
##' @return
##' A vector containing the Wald test statistic (either an `F` or
##' \eqn{\chi^2} statistic, depending on the `asymptotic` argument),
##' the degrees of freedom (numerator and denominator, if
##' `asymptotic = FALSE`), and a *p* value. If
##' `asymptotic = FALSE`, the relative invrease in variance (RIV, or
##' average for multiparameter tests: ARIV) used to calculate the denominator
##' *df* is also returned as a missing-data diagnostic, along with the
##' fraction missing information (FMI = ARIV / (1 + ARIV)).
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' Based on source code for [lavaan::lavTestWald()] by Yves Rosseel
##'
##' @references
##' Enders, C. K. (2010). *Applied missing data analysis*.
##' New York, NY: Guilford.
##'
##' Li, K.-H., Meng, X.-L., Raghunathan, T. E., & Rubin, D. B. (1991).
##' Significance levels from repeated *p*-values with multiply-imputed
##' data. *Statistica Sinica, 1*(1), 65--92. Retrieved from
##' <https://www.jstor.org/stable/24303994>
##'
##' Rubin, D. B. (1987). *Multiple imputation for nonresponse in surveys*.
##' New York, NY: Wiley. \doi{10.1002/9780470316696}
##'
##' @seealso [lavaan::lavTestWald()]
##'
##' @examples
##' data(HS20imps) # import a list of 20 imputed data sets
##'
##' ## specify CFA model from lavaan's ?cfa help page
##' HS.model <- '
##' visual =~ x1 + b1*x2 + x3
##' textual =~ x4 + b2*x5 + x6
##' speed =~ x7 + b3*x8 + x9
##' '
##'
##' fit <- cfa.mi(HS.model, data = HS20imps)
##'
##' ## Testing whether a single parameter equals zero yields the 'chi-square'
##' ## version of the Wald z statistic from the summary() output, or the
##' ## 'F' version of the t statistic from the summary() output, depending
##' ## whether asymptotic = TRUE or FALSE
##' lavTestWald.mi(fit, constraints = "b1 == 0") # default D1 statistic
##' lavTestWald.mi(fit, constraints = "b1 == 0", pool.method = "D2") # D2 statistic
##'
##' ## The real advantage is simultaneously testing several equality
##' ## constraints, or testing more complex constraints:
##' con <- '
##' 2*b1 == b3
##' b2 - b3 == 0
##' '
##' lavTestWald.mi(fit, constraints = con) # default F statistic
##' lavTestWald.mi(fit, constraints = con, asymptotic = TRUE) # chi-squared
##'
##'
##' @export
lavTestWald.mi <- function(object, constraints = NULL, pool.method = c("D1","D2"),
asymptotic = FALSE, scale.W = !asymptotic,
omit.imps = c("no.conv","no.se"),
verbose = FALSE, warn = TRUE) {
## this also checks the class:
useImps <- imps2use(object = object, omit.imps = omit.imps)
m <- length(useImps)
pool.method <- tolower(pool.method[1])
if (pool.method %in% c("d2", "lmrr", "li.et.al")) pool.method <- "D2"
if (pool.method %in% c("d1", "rubin")) pool.method <- "D1"
if (!pool.method %in% c("D1","D2")) stop('Invalid choice of "pool.method" argument.')
message('\nWald test calculated using se = "',
lavListInspect(object, "options")$se, '"\n')
if (pool.method == "D2") {
oldCall <- object@lavListCall
if (!is.null(oldCall$parallel)) {
if (oldCall$parallel == "snow") {
oldCall$parallel <- "no"
oldCall$ncpus <- 1L
if (warn) warning("Unable to pass lavaan::lavTestWald() arguments ",
"when parallel='snow'. Switching to parallel='no'.",
" Unless using Windows, parallel='multicore' works.")
}
}
## call lavaanList() again to run lavTestWald() on each imputation
oldCall$FUN <- function(obj) {
out <- try(lavaan::lavTestWald(object = obj, constraints = constraints,
verbose = FALSE), silent = TRUE)
if (inherits(out, "try-error")) return(NULL)
do.call(c, out[1:2])
}
FIT <- eval(as.call(oldCall))
## check if there are any results
noStats <- sapply(FIT@funList, is.null)
if (all(noStats)) stop("No success using lavTestWald() on any imputations.",
call. = FALSE)
## template to fill in pooled values
## at a minimum, pool the total Wald pool.method
chiList <- sapply(FIT@funList[ intersect(useImps, which(!noStats)) ],
"[[", i = 1)
DF <- FIT@funList[[ intersect(useImps, which(!noStats))[1] ]][[2]]
out <- calculate.D2(chiList, DF = DF, asymptotic)
class(out) <- c("lavaan.vector","numeric")
## add header
attr(out, "header") <- paste("Wald statistic pooled using the",
pool.method, "pooling method")
return(out)
} # else pool.method == "D1", making 'scale.W=' relevant
## "borrowed" lavTestWald()
if (is.null(constraints) || nchar(constraints) == 0L) stop("constraints are empty")
# remove == constraints from parTable, save as list
PT <- parTable(object)
partable <- as.list(PT[PT$op != "==", ])
# parse constraints
FLAT <- lavaan::lavParseModelString( constraints )
CON <- attr(FLAT, "constraints")
LIST <- list()
if (length(CON) > 0L) {
lhs <- unlist(lapply(CON, "[[", i = "lhs"))
op <- unlist(lapply(CON, "[[", i = "op"))
rhs <- unlist(lapply(CON, "[[", i = "rhs"))
LIST$lhs <- c(LIST$lhs, lhs) # FIXME: why concatenate with NULL?
LIST$op <- c(LIST$op, op)
LIST$rhs <- c(LIST$rhs, rhs)
} else stop("no equality constraints found in constraints argument")
# theta = free parameters only (equality-constrained allowed)
theta <- coef_lavaan_mi(object, omit.imps = omit.imps) #object@optim$x
# build constraint function
ceq.function <- lavaan::lav_partable_constraints_ceq(partable = partable,
con = LIST, debug = FALSE)
# compute jacobian restrictions
JAC <- try(lavaan::lav_func_jacobian_complex(func = ceq.function, x = theta),
silent = TRUE)
if (inherits(JAC, "try-error")) { # eg. pnorm()
JAC <- lavaan::lav_func_jacobian_simple(func = ceq.function, x = theta)
}
if (verbose) {cat("Restriction matrix (jacobian):\n"); print(JAC); cat("\n")}
# linear restriction
theta.r <- ceq.function( theta )
if (verbose) {cat("Restricted theta values:\n"); print(theta.r); cat("\n")}
# get VCOV
VCOV <- vcov_lavaan_mi(object, scale.W = scale.W, omit.imps = omit.imps)
# restricted vcov
info.r <- JAC %*% VCOV %*% t(JAC)
# Wald test statistic
test.stat <- as.numeric(t(theta.r) %*% solve( info.r ) %*% theta.r)
# number of constraints (k in Enders (2010, p. 235) eqs. 8.23-25)
DF <- nrow(JAC)
if (asymptotic) {
out <- c("chisq" = test.stat, df = DF,
pvalue = pchisq(test.stat, df = DF, lower.tail = FALSE))
} else {
W <- vcov_lavaan_mi(object, type = "within" , omit.imps = omit.imps)
B <- vcov_lavaan_mi(object, type = "between", omit.imps = omit.imps)
#FIXME: only valid for linear constraints?
## restricted B & W components of VCOV
W.r <- JAC %*% W %*% t(JAC)
B.r <- JAC %*% B %*% t(JAC)
## relative increase in variance due to missing data
W.inv <- MASS::ginv(W.r)
ariv <- (1 + 1/m) * sum(diag(B.r %*% W.inv)) / DF
## calculate denominator DF for F statistic
a <- DF*(m - 1)
if (a > 4) {
v2 <- 4 + (a - 4) * (1 + (1 - 2/a)*(1 / ariv))^2 # Enders (eq. 8.24)
} else {
v2 <- a*(1 + 1/DF) * (1 + 1/ariv)^2 / 2 # Enders (eq. 8.25)
}
out <- c("F" = test.stat / DF, df1 = DF, df2 = v2,
pvalue = pf(test.stat / DF, df1 = DF, df2 = v2, lower.tail = FALSE),
ariv = ariv, fmi = ariv / (1 + ariv))
}
class(out) <- c("lavaan.vector","numeric")
## add header
attr(out, "header") <- paste("Wald statistic pooled using the",
pool.method, "pooling method")
out
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.