In semTools: Useful Tools for Structural Equation Modeling

plausibleValues

R Documentation

Plausible-Values Imputation of Factor Scores Estimated from a lavaan Model

Description

Draw plausible values of factor scores estimated from a fitted lavaan::lavaan() model, then treat them as multiple imputations of missing data using lavaan.mi::lavaan.mi().

Usage

plausibleValues(object, nDraws = 20L, seed = 12345,
  omit.imps = c("no.conv", "no.se"), ...)

Arguments

`object`	A fitted model of class lavaan::lavaan, blavaan::blavaan, or lavaan.mi::lavaan.mi
`nDraws`	`integer` specifying the number of draws, analogous to the number of imputed data sets. If `object` is of class lavaan.mi::lavaan.mi, this is the number of draws taken per imputation. If `object` is of class blavaan::blavaan, `nDraws` cannot exceed `blavInspect(object, "niter") * blavInspect(object, "n.chains")` (the number of MCMC samples from the posterior). The drawn samples will be evenly spaced (after permutation for `target="stan"`), using `ceiling()` to resolve decimals.
`seed`	`integer` passed to `set.seed()`.
`omit.imps`	`character` vector specifying criteria for omitting imputations when `object` is of class lavaan.mi::lavaan.mi. Can include any of `c("no.conv", "no.se", "no.npd")`.
`...`	Optional arguments to pass to `lavaan::lavPredict()`. `assemble` will be ignored because multiple groups are always assembled into a single `data.frame` per draw. `type` will be ignored because it is set internally to `type="lv"`.
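
The evenly spaced selection described for `nDraws` with blavaan objects can be pictured with a small base-R sketch. The helper name `evenly_spaced_idx` and the exact formula are assumptions made for illustration; they are not taken from the semTools source, and the permutation step for `target="stan"` is not shown.

```r
## Hypothetical helper (not part of semTools): choose nDraws evenly spaced
## positions among nSamples saved posterior samples.
evenly_spaced_idx <- function(nDraws, nSamples) {
  ## nDraws cannot exceed the number of saved posterior samples
  stopifnot(nDraws >= 1, nDraws <= nSamples)
  ## ceiling() resolves positions that are not whole numbers
  ceiling(seq_len(nDraws) * nSamples / nDraws)
}

## 4 draws from 10 saved samples: raw positions are 2.5, 5, 7.5, 10
evenly_spaced_idx(nDraws = 4, nSamples = 10)
# → 3 5 8 10
```

With, say, 3 chains of 1000 saved iterations each, `nSamples` would be 3000 and the default `nDraws = 20` would pick every 150th sample under this assumed scheme.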

Details

Because latent variables are unobserved, they can be treated as missing data and imputed using Monte Carlo methods. This may be of interest to researchers whose samples are too small to fit their complex structural models in a single step. After fitting a factor model as a first step (Step 1), lavaan::lavPredict() provides factor-score estimates, which can be treated as observed values in a path analysis (Step 2). However, the resulting standard errors and test statistics cannot be trusted, because the Step-2 analysis does not take into account the uncertainty about the estimated factor scores.

Using the asymptotic sampling covariance matrix of the factor scores provided by lavaan::lavPredict(), plausibleValues draws a set of nDraws imputations from the sampling distribution of each factor score, returning a list of data sets that can be treated like multiple imputations of incomplete data. If the data were already imputed to handle missing data, plausibleValues also accepts an object of class lavaan.mi::lavaan.mi, and will draw nDraws plausible values from each imputation. Step 2 would then take into account uncertainty about both missing values and factor scores.

Bayesian methods can also be used to generate factor scores, as available with the blavaan package, in which case plausible values are simply saved parameters from the posterior distribution. See Asparouhov and Muthén (2010) for further technical details and references.
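
For a lavaan object, drawing from the sampling distribution of a case's factor scores can be pictured with the base-R sketch below. It assumes the sampling distribution is multivariate normal, centered on the factor-score estimates, with the asymptotic covariance matrix as its covariance; this page does not state that explicitly. The numbers in `fs_hat` and `fs_acov` and the helper `draw_pvs` are hypothetical, and this is not the semTools implementation.

```r
## Hypothetical inputs for ONE case (illustrative numbers, not model output)
fs_hat  <- c(visual = 0.30, textual = -0.50)      # factor-score estimates
fs_acov <- matrix(c(0.20, 0.05,
                    0.05, 0.10), nrow = 2,        # assumed sampling covariance
                  dimnames = list(names(fs_hat), names(fs_hat)))

## Hypothetical helper: draw nDraws rows from N(mu, Sigma) via Cholesky
draw_pvs <- function(mu, Sigma, nDraws, seed = 12345) {
  set.seed(seed)                                  # reproducible draws
  p <- length(mu)
  Z <- matrix(rnorm(nDraws * p), nrow = nDraws)   # independent N(0, 1) values
  U <- chol(Sigma)                                # t(U) %*% U equals Sigma
  out <- sweep(Z %*% U, 2, mu, "+")               # rescale, then shift by mu
  colnames(out) <- names(mu)
  out
}

pvs <- draw_pvs(fs_hat, fs_acov, nDraws = 5)
pvs   # each row is one plausible pair of factor scores for this case
```

Each row plays the role of this case's entry in one of the returned data sets. The spread across rows reflects the uncertainty that a single factor-score estimate would hide from the Step-2 analysis.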

Each returned data.frame includes a case.idx column that indicates the corresponding rows in the data set to which the model was originally fitted (unless the user requests only Level-2 variables). This can be used to merge the plausible values with the original observed data. Note, however, that a Step-2 model that includes new variables might not accurately capture their relationship(s) with the factor scores, because those variables were not part of the Step-1 model from which the factor scores were estimated.
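
The role of case.idx in such a merge can be seen in a toy base-R sketch. Both data frames are hypothetical stand-ins: `orig` for the original data and `one_draw` for a single element of the returned list. The sketch assumes that row 3 was dropped when the Step-1 model was fitted, so it has no plausible values.

```r
## Hypothetical original data; assume row 3 was excluded from the Step-1 fit
orig <- data.frame(id  = c("a", "b", "c", "d"),
                   x1  = c(1.2, 0.7, NA, 2.1),
                   age = c(12, 13, 12, 14))
orig$case.idx <- seq_len(nrow(orig))        # row index of each original case

## Hypothetical single draw, shaped like one element of the returned list
one_draw <- data.frame(case.idx = c(1L, 2L, 4L),
                       visual   = c(0.31, -0.12, 0.85))

## Matching on case.idx (not on row position) keeps the rows aligned
merged <- merge(one_draw, orig, by = "case.idx")
merged
```

Because `merge()` keeps only matching keys by default, the case without plausible values is absent from `merged`; binding the columns by position would instead misalign every row after the omitted case.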

If object is a multilevel lavaan model, users can request plausible values for latent variables at particular levels of analysis by setting the lavaan::lavPredict() argument level=1 or level=2. If the level argument is not passed via ..., then both levels are returned in a single merged data set per draw. For multilevel models, each returned data.frame also includes a column indicating to which cluster each row belongs (unless the user requests only Level-2 variables).

Value

A list of length nDraws, each element of which is a data.frame containing plausible values. The list can be treated as a list of imputed data sets and passed to lavaan.mi::lavaan.mi() or one of its wrappers (see Examples). If object is of class lavaan.mi::lavaan.mi, the list will be of length nDraws*m, where m is the number of imputations.
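
When object is of class lavaan.mi::lavaan.mi, it helps to know which list element holds which draw. The sketch below assumes the ordering implied by the indexing used in the Examples on this page, where the draws for one imputation are stored consecutively; `pv_position` is a hypothetical helper, not a semTools function.

```r
nDraws <- 5    # plausible values drawn per imputation
m      <- 20   # number of imputations

## Hypothetical helper: list position of draw `draw` within imputation `imp`
pv_position <- function(imp, draw, nDraws) nDraws * (imp - 1) + draw

## Enumerate every (imputation, draw) pair with its list position
layout <- expand.grid(draw = seq_len(nDraws), imp = seq_len(m))
layout$position <- pv_position(layout$imp, layout$draw, nDraws)

head(layout, 7)   # draws 1-5 of imputation 1, then draws 1-2 of imputation 2
nrow(layout)      # nDraws * m elements in total
# → 100
```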

Author(s)

Terrence D. Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)

References

Asparouhov, T. & Muthén, B. O. (2010). Plausible values for latent variables using Mplus. Technical Report. Retrieved from www.statmodel.com/download/Plausible.pdf

Examples


## example from ?cfa and ?lavPredict help pages
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit1 <- cfa(HS.model, data = HolzingerSwineford1939)
fs1 <- plausibleValues(fit1, nDraws = 3,
                       ## lavPredict() can add only the modeled data
                       append.data = TRUE)
lapply(fs1, head)


## To merge factor scores to original data.frame (not just modeled data)
fs1 <- plausibleValues(fit1, nDraws = 3)
idx <- lavInspect(fit1, "case.idx")      # row index for each case
if (is.list(idx)) idx <- do.call(c, idx) # for multigroup models
data(HolzingerSwineford1939)             # copy data to workspace
HolzingerSwineford1939$case.idx <- idx   # add row index as variable
## loop over draws to merge original data with factor scores
for (i in seq_along(fs1)) {
  fs1[[i]] <- merge(fs1[[i]], HolzingerSwineford1939, by = "case.idx")
}
lapply(fs1, head)


## multiple-group analysis, in 2 steps
step1 <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
            group.equal = c("loadings","intercepts"))
PV.list <- plausibleValues(step1)

## subsequent path analysis
path.model <- ' visual ~ c(t1, t2)*textual + c(s1, s2)*speed '
if(requireNamespace("lavaan.mi")){
  library(lavaan.mi)
  step2 <- sem.mi(path.model, data = PV.list, group = "school")
  ## test equivalence of both slopes across groups
  lavTestWald.mi(step2, constraints = 't1 == t2 ; s1 == s2')
}


## multilevel example from ?Demo.twolevel help page
model <- '
  level: 1
    fw =~ y1 + y2 + y3
    fw ~ x1 + x2 + x3
  level: 2
    fb =~ y1 + y2 + y3
    fb ~ w1 + w2
'
msem <- sem(model, data = Demo.twolevel, cluster = "cluster")
mlPVs <- plausibleValues(msem, nDraws = 3) # both levels by default
lapply(mlPVs, head, n = 10)
## only Level 1
mlPV1 <- plausibleValues(msem, nDraws = 3, level = 1)
lapply(mlPV1, head)
## only Level 2
mlPV2 <- plausibleValues(msem, nDraws = 3, level = 2)
lapply(mlPV2, head)



## example with 20 multiple imputations of missing data:
nPVs <- 5
nImps <- 20

if(requireNamespace("lavaan.mi")){
  data(HS20imps, package = "lavaan.mi")

  ## specify CFA model from lavaan's ?cfa help page
  HS.model <- '
    visual  =~ x1 + x2 + x3
    textual =~ x4 + x5 + x6
    speed   =~ x7 + x8 + x9
  '
  out2 <- cfa.mi(HS.model, data = HS20imps)
  PVs <- plausibleValues(out2, nDraws = nPVs)

  idx <- out2@Data@case.idx # can't use lavInspect() on lavaan.mi
  ## empty list to hold expanded imputations
  impPVs <- list()
  for (m in 1:nImps) {
    HS20imps[[m]]["case.idx"] <- idx
    for (i in 1:nPVs) {
      impPVs[[ nPVs*(m - 1) + i ]] <- merge(HS20imps[[m]],
                                            PVs[[ nPVs*(m - 1) + i ]],
                                            by = "case.idx")
    }
  }
  lapply(impPVs, head)
}

semTools documentation built on April 3, 2025, 9:23 p.m.