measEq.syntax: Syntax for measurement equivalence

View source: R/measEq.R

measEq.syntaxR Documentation

Syntax for measurement equivalence

Description

Automatically generates lavaan model syntax to specify a confirmatory factor analysis (CFA) model with equality constraints imposed on user-specified measurement (or structural) parameters. Optionally returns the fitted model (if data are provided) representing some chosen level of measurement equivalence/invariance across groups and/or repeated measures.

Usage

measEq.syntax(configural.model, ..., ID.fac = "std.lv",
  ID.cat = "Wu.Estabrook.2016", ID.thr = c(1L, 2L), group = NULL,
  group.equal = "", group.partial = "", longFacNames = list(),
  longIndNames = list(), long.equal = "", long.partial = "",
  auto = "all", warn = TRUE, debug = FALSE, return.fit = FALSE)

Arguments

configural.model

A model with no measurement-invariance constraints (i.e., representing only configural invariance), unless required for model identification. configural.model can be either:

  • lavaan model.syntax or a parameter table (as returned by parTable) specifying the configural model. Using this option, the user can also provide either raw data or summary statistics via sample.cov and (optionally) sample.mean. See argument descriptions in lavaan. In order to include thresholds in the generated syntax, either users must provide raw data, or the configural.model syntax must specify all thresholds (see first example). If raw data are not provided, the number of blocks (groups, levels, or combination) must be indicated using an arbitrary sample.nobs argument (e.g., 3 groups could be specified using sample.nobs=rep(1, 3)).

  • a fitted lavaan model (e.g., as returned by cfa) estimating the configural model

Note that the specified or fitted model must not contain any latent structural parameters (i.e., it must be a CFA model), unless they are higher-order constructs with latent indicators (i.e., a second-order CFA).

...

Additional arguments (e.g., data, ordered, or parameterization) passed to the lavaan function. See also lavOptions.

ID.fac

character. The method for identifying common-factor variances and (if meanstructure = TRUE) means. Three methods are available, which go by different names in the literature:

  • Standardize the common factor (mean = 0, SD = 1) by specifying any of: "std.lv", "unit.variance", "UV", "fixed.factor", "fixed-factor"

  • Choose a reference indicator by specifying any of: "auto.fix.first", "unit.loading", "UL", "marker", "ref", "ref.indicator", "reference.indicator", "reference-indicator", "marker.variable", "marker-variable"

  • Apply effects-code constraints to loadings and intercepts by specifying any of: "FX", "EC", "effects", "effects.coding", "effects-coding", "effects.code", "effects-code"

See Kloessner & Klopp (2019) for details about all three methods.

ID.cat

character. The method for identifying (residual) variances and intercepts of latent item-responses underlying any ordered indicators. Four methods are available:

  • To follow Wu & Estabrook's (2016) guidelines (default), specify any of: "Wu.Estabrook.2016", "Wu.2016", "Wu.Estabrook", "Wu", "Wu2016". For consistency, specify ID.fac = "std.lv".

  • To use the default settings of Mplus and lavaan, specify any of: "default", "Mplus", "Muthen". Details provided in Millsap & Tein (2004).

  • To use the constraints recommended by Millsap & Tein (2004; see also Liu et al., 2017, for the longitudinal case) specify any of: "millsap", "millsap.2004", "millsap.tein.2004". For consistency, specify ID.fac = "marker" and parameterization = "theta".

  • To use the default settings of LISREL, specify "LISREL" or "Joreskog". Details provided in Millsap & Tein (2004). For consistency, specify parameterization = "theta".

See Details and References for more information.

ID.thr

integer. Only relevant when ID.cat = "Millsap.Tein.2004". Used to indicate which thresholds should be constrained for identification. The first integer indicates the threshold used for all indicators, the second integer indicates the additional threshold constrained for a reference indicator (ignored if binary).

group

optional character indicating the name of a grouping variable. See cfa.

group.equal

optional character vector indicating type(s) of parameter to equate across groups. Ignored if is.null(group). See lavOptions.

group.partial

optional character vector or a parameter table indicating exceptions to group.equal (see lavOptions). Any variables not appearing in the configural.model will be ignored, and any parameter constraints needed for identification (e.g., two thresholds per indicator when ID.cat = "Millsap") will be removed.

longFacNames

optional named list of character vectors, each indicating multiple factors in the model that are actually the same construct measured repeatedly. See Details and Examples.

longIndNames

optional named list of character vectors, each indicating multiple indicators in the model that are actually the same indicator measured repeatedly. See Details and Examples.

long.equal

optional character vector indicating type(s) of parameter to equate across repeated measures. Ignored if no factors are indicated as repeatedly measured in longFacNames.

long.partial

optional character vector or a parameter table indicating exceptions to long.equal. Any longitudinal variable names not appearing in names(longFacNames) or names(longIndNames) will be ignored, and any parameter constraints needed for identification will be removed.

auto

Used to automatically included autocorrelated measurement errors among repeatedly measured indicators in longIndNames. Specify a single integer to set the maximum order (e.g., auto = 1L indicates that an indicator's unique factors should only be correlated between adjacently measured occasions). auto = TRUE or "all" will specify residual covariances among all possible lags per repeatedly measured indicator in longIndNames.

warn, debug

logical. Passed to lavaan and lavParseModelString. See lavOptions.

return.fit

logical indicating whether the generated syntax should be fitted to the provided data (or summary statistics, if provided via sample.cov). If configural.model is a fitted lavaan model, the generated syntax will be fitted using the update method (see lavaan), and ... will be passed to lavaan. If neither data nor a fitted lavaan model were provided, this must be FALSE. If TRUE, the generated measEq.syntax object will be included in the lavaan object's @external slot, accessible by fit@external$measEq.syntax.

Details

This function is a pedagogical and analytical tool to generate model syntax representing some level of measurement equivalence/invariance across any combination of multiple groups and/or repeated measures. Support is provided for confirmatory factor analysis (CFA) models with simple or complex structure (i.e., cross-loadings and correlated residuals are allowed). For any complexities that exceed the limits of automation, this function is intended to still be useful by providing a means to generate syntax that users can easily edit to accommodate their unique situations.

Limited support is provided for bifactor models and higher-order constructs. Because bifactor models have cross-loadings by definition, the option ID.fac = "effects.code" is unavailable. ID.fac = "UV" is recommended for bifactor models, but ID.fac = "UL" is available on the condition that each factor has a unique first indicator in the configural.model. In order to maintain generality, higher-order factors may include a mix of manifest and latent indicators, but they must therefore require ID.fac = "UL" to avoid complications with differentiating lower-order vs. higher-order (or mixed-level) factors. The keyword "loadings" in group.equal or long.equal constrains factor loadings of all manifest indicators (including loadings on higher-order factors that also have latent indicators), whereas the keyword "regressions" constrains factor loadings of latent indicators. Users can edit the model syntax manually to adjust constraints as necessary, or clever use of the group.partial or long.partial arguments could make it possible for users to still automated their model syntax. The keyword "intercepts" constrains the intercepts of all manifest indicators, and the keyword "means" constrains intercepts and means of all latent common factors, regardless of whether they are latent indicators of higher-order factors. To test equivalence of lower-order and higher-order intercepts/means in separate steps, the user can either manually edit their generated syntax or conscientiously exploit the group.partial or long.partial arguments as necessary.

ID.fac: If the configural.model fixes any (e.g., the first) factor loadings, the generated syntax object will retain those fixed values. This allows the user to retain additional constraints that might be necessary (e.g., if there are only 1 or 2 indicators). Some methods must be used in conjunction with other settings:

  • ID.cat = "Millsap" requires ID.fac = "UL" and parameterization = "theta".

  • ID.cat = "LISREL" requires parameterization = "theta".

  • ID.fac = "effects.code" is unavailable when there are any cross-loadings.

ID.cat: Wu & Estabrook (2016) recommended constraining thresholds to equality first, and doing so should allow releasing any identification constraints no longer needed. For each ordered indicator, constraining one threshold to equality will allow the item's intercepts to be estimated in all but the first group or repeated measure. Constraining a second threshold (if applicable) will allow the item's (residual) variance to be estimated in all but the first group or repeated measure. For binary data, there is no independent test of threshold, intercept, or residual-variance equality. Equivalence of thresholds must also be assumed for three-category indicators. These guidelines provide the least restrictive assumptions and tests, and are therefore the default.

The default setting in Mplus is similar to Wu & Estabrook (2016), except that intercepts are always constrained to zero (so they are assumed to be invariant without testing them). Millsap & Tein (2004) recommended parameterization = "theta" and identified an item's residual variance in all but the first group (or occasion; Liu et al., 2017) by constraining its intercept to zero and one of its thresholds to equality. A second threshold for the reference indicator (so ID.fac = "UL") is used to identify the common-factor means in all but the first group/occasion. The LISREL software fixes the first threshold to zero and (if applicable) the second threshold to 1, and assumes any remaining thresholds to be equal across groups / repeated measures; thus, the intercepts are always identified, and residual variances (parameterization = "theta") are identified except for binary data, when they are all fixed to one.

Repeated Measures: If each repeatedly measured factor is measured by the same indicators (specified in the same order in the configural.model) on each occasion, without any cross-loadings, the user can let longIndNames be automatically generated. Generic names for the repeatedly measured indicators are created using the name of the repeatedly measured factors (i.e., names(longFacNames)) and the number of indicators. So the repeatedly measured first indicator ("ind") of a longitudinal construct called "factor" would be generated as "._factor_ind.1".

The same types of parameter can be specified for long.equal as for group.equal (see lavOptions for a list), except for "residual.covariances" or "lv.covariances". Instead, users can constrain autocovariances using keywords "resid.autocov" or "lv.autocov". Note that group.equal = "lv.covariances" or group.equal = "residual.covariances" will constrain any autocovariances across groups, along with any other covariances the user specified in the configural.model. Note also that autocovariances cannot be specified as exceptions in long.partial, so anything more complex than the auto argument automatically provides should instead be manually specified in the configural.model.

When users set orthogonal=TRUE in the configural.model (e.g., in bifactor models of repeatedly measured constructs), autocovariances of each repeatedly measured factor will still be freely estimated in the generated syntax.

Missing Data: If users wish to utilize the auxiliary function to automatically include auxiliary variables in conjunction with missing = "FIML", they should first generate the hypothesized-model syntax, then submit that syntax as the model to auxiliary(). If users utilized runMI to fit their configural.model to multiply imputed data, that model can also be passed to the configural.model argument, and if return.fit = TRUE, the generated model will be fitted to the multiple imputations.

Value

By default, an object of class measEq.syntax. If return.fit = TRUE, a fitted lavaan model, with the measEq.syntax object stored in the @external slot, accessible by fit@external$measEq.syntax.

Author(s)

Terrence D. Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)

References

Kloessner, S., & Klopp, E. (2019). Explaining constraint interaction: How to interpret estimated model parameters under alternative scaling methods. Structural Equation Modeling, 26(1), 143–155. doi: 10.1080/10705511.2018.1517356

Liu, Y., Millsap, R. E., West, S. G., Tein, J.-Y., Tanaka, R., & Grimm, K. J. (2017). Testing measurement invariance in longitudinal data with ordered-categorical measures. Psychological Methods, 22(3), 486–506. doi: 10.1037/met0000075

Millsap, R. E., & Tein, J.-Y. (2004). Assessing factorial invariance in ordered-categorical measures. Multivariate Behavioral Research, 39(3), 479–515. doi: 10.1207/S15327906MBR3903_4

Wu, H., & Estabrook, R. (2016). Identification of confirmatory factor analysis models of different levels of invariance for ordered categorical outcomes. Psychometrika, 81(4), 1014–1045. doi: 10.1007/s11336-016-9506-0

See Also

compareFit

Examples

mod.cat <- ' FU1 =~ u1 + u2 + u3 + u4
             FU2 =~ u5 + u6 + u7 + u8 '
## the 2 factors are actually the same factor (FU) measured twice
longFacNames <- list(FU = c("FU1","FU2"))

## CONFIGURAL model: no constraints across groups or repeated measures
syntax.config <- measEq.syntax(configural.model = mod.cat,
                               # NOTE: data provides info about numbers of
                               #       groups and thresholds
                               data = datCat,
                               ordered = paste0("u", 1:8),
                               parameterization = "theta",
                               ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                               group = "g", longFacNames = longFacNames)
## print lavaan syntax to the Console
cat(as.character(syntax.config))
## print a summary of model features
summary(syntax.config)

## THRESHOLD invariance:
## only necessary to specify thresholds if you have no data
mod.th <- '
  u1 | t1 + t2 + t3 + t4
  u2 | t1 + t2 + t3 + t4
  u3 | t1 + t2 + t3 + t4
  u4 | t1 + t2 + t3 + t4
  u5 | t1 + t2 + t3 + t4
  u6 | t1 + t2 + t3 + t4
  u7 | t1 + t2 + t3 + t4
  u8 | t1 + t2 + t3 + t4
'
syntax.thresh <- measEq.syntax(configural.model = c(mod.cat, mod.th),
                               # NOTE: data not provided, so syntax must
                               #       include thresholds, and number of
                               #       groups == 2 is indicated by:
                               sample.nobs = c(1, 1),
                               parameterization = "theta",
                               ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                               group = "g", group.equal = "thresholds",
                               longFacNames = longFacNames,
                               long.equal = "thresholds")
## notice that constraining 4 thresholds allows intercepts and residual
## variances to be freely estimated in all but the first group & occasion
cat(as.character(syntax.thresh))
## print a summary of model features
summary(syntax.thresh)


## Fit a model to the data either in a subsequent step (recommended):
mod.config <- as.character(syntax.config)
fit.config <- cfa(mod.config, data = datCat, group = "g",
                  ordered = paste0("u", 1:8), parameterization = "theta")
## or in a single step (not generally recommended):
fit.thresh <- measEq.syntax(configural.model = mod.cat, data = datCat,
                            ordered = paste0("u", 1:8),
                            parameterization = "theta",
                            ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                            group = "g", group.equal = "thresholds",
                            longFacNames = longFacNames,
                            long.equal = "thresholds", return.fit = TRUE)
## compare their fit to test threshold invariance
anova(fit.config, fit.thresh)


## --------------------------------------------------------
## RECOMMENDED PRACTICE: fit one invariance model at a time
## --------------------------------------------------------

## - A downside of setting return.fit=TRUE is that if the model has trouble
##   converging, you don't have the opportunity to investigate the syntax,
##   or even to know whether an error resulted from the syntax-generator or
##   from lavaan itself.
## - A downside of automatically fitting an entire set of invariance models
##   (like the old measurementInvariance() function did) is that you might
##   end up testing models that shouldn't even be fitted because less
##   restrictive models already fail (e.g., don't test full scalar
##   invariance if metric invariance fails! Establish partial metric
##   invariance first, then test equivalent of intercepts ONLY among the
##   indicators that have invariate loadings.)

## The recommended sequence is to (1) generate and save each syntax object,
## (2) print it to the screen to verify you are fitting the model you expect
## to (and potentially learn which identification constraints should be
## released when equality constraints are imposed), and (3) fit that model
## to the data, as you would if you had written the syntax yourself.

## Continuing from the examples above, after establishing invariance of
## thresholds, we proceed to test equivalence of loadings and intercepts
##   (metric and scalar invariance, respectively)
## simultaneously across groups and repeated measures.

## Not run: 

## metric invariance
syntax.metric <- measEq.syntax(configural.model = mod.cat, data = datCat,
                               ordered = paste0("u", 1:8),
                               parameterization = "theta",
                               ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                               group = "g", longFacNames = longFacNames,
                               group.equal = c("thresholds","loadings"),
                               long.equal  = c("thresholds","loadings"))
summary(syntax.metric)                    # summarize model features
mod.metric <- as.character(syntax.metric) # save as text
cat(mod.metric)                           # print/view lavaan syntax
## fit model to data
fit.metric <- cfa(mod.metric, data = datCat, group = "g",
                  ordered = paste0("u", 1:8), parameterization = "theta")
## test equivalence of loadings, given equivalence of thresholds
anova(fit.thresh, fit.metric)

## scalar invariance
syntax.scalar <- measEq.syntax(configural.model = mod.cat, data = datCat,
                               ordered = paste0("u", 1:8),
                               parameterization = "theta",
                               ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
                               group = "g", longFacNames = longFacNames,
                               group.equal = c("thresholds","loadings",
                                               "intercepts"),
                               long.equal  = c("thresholds","loadings",
                                               "intercepts"))
summary(syntax.scalar)                    # summarize model features
mod.scalar <- as.character(syntax.scalar) # save as text
cat(mod.scalar)                           # print/view lavaan syntax
## fit model to data
fit.scalar <- cfa(mod.scalar, data = datCat, group = "g",
                  ordered = paste0("u", 1:8), parameterization = "theta")
## test equivalence of intercepts, given equal thresholds & loadings
anova(fit.metric, fit.scalar)


## For a single table with all results, you can pass the models to
## summarize to the compareFit() function
compareFit(fit.config, fit.thresh, fit.metric, fit.scalar)



## ------------------------------------------------------
## NOT RECOMMENDED: fit several invariance models at once
## ------------------------------------------------------
test.seq <- c("thresholds","loadings","intercepts","means","residuals")
meq.list <- list()
for (i in 0:length(test.seq)) {
  if (i == 0L) {
    meq.label <- "configural"
    group.equal <- ""
    long.equal <- ""
  } else {
    meq.label <- test.seq[i]
    group.equal <- test.seq[1:i]
    long.equal <- test.seq[1:i]
  }
  meq.list[[meq.label]] <- measEq.syntax(configural.model = mod.cat,
                                         data = datCat,
                                         ordered = paste0("u", 1:8),
                                         parameterization = "theta",
                                         ID.fac = "std.lv",
                                         ID.cat = "Wu.Estabrook.2016",
                                         group = "g",
                                         group.equal = group.equal,
                                         longFacNames = longFacNames,
                                         long.equal = long.equal,
                                         return.fit = TRUE)
}

compareFit(meq.list)


## -----------------
## Binary indicators
## -----------------

## borrow example data from Mplus user guide
myData <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.16.dat")
names(myData) <- c("u1","u2","u3","u4","u5","u6","x1","x2","x3","g")
bin.mod <- '
  FU1 =~ u1 + u2 + u3
  FU2 =~ u4 + u5 + u6
'
## Must SIMULTANEOUSLY constrain thresholds, loadings, and intercepts
test.seq <- list(strong = c("thresholds","loadings","intercepts"),
                 means = "means",
                 strict = "residuals")
meq.list <- list()
for (i in 0:length(test.seq)) {
  if (i == 0L) {
    meq.label <- "configural"
    group.equal <- ""
    long.equal <- ""
  } else {
    meq.label <- names(test.seq)[i]
    group.equal <- unlist(test.seq[1:i])
    # long.equal <- unlist(test.seq[1:i])
  }
  meq.list[[meq.label]] <- measEq.syntax(configural.model = bin.mod,
                                         data = myData,
                                         ordered = paste0("u", 1:6),
                                         parameterization = "theta",
                                         ID.fac = "std.lv",
                                         ID.cat = "Wu.Estabrook.2016",
                                         group = "g",
                                         group.equal = group.equal,
                                         #longFacNames = longFacNames,
                                         #long.equal = long.equal,
                                         return.fit = TRUE)
}

compareFit(meq.list)


## ---------------------
## Multilevel Invariance
## ---------------------

## To test invariance across levels in a MLSEM, specify syntax as though
## you are fitting to 2 groups instead of 2 levels.

mlsem <- ' f1 =~ y1 + y2 + y3
           f2 =~ y4 + y5 + y6 '
## metric invariance
syntax.metric <- measEq.syntax(configural.model = mlsem, meanstructure = TRUE,
                               ID.fac = "std.lv", sample.nobs = c(1, 1),
                               group = "cluster", group.equal = "loadings")
## by definition, Level-1 means must be zero, so fix them
syntax.metric <- update(syntax.metric,
                        change.syntax = paste0("y", 1:6, " ~ c(0, NA)*1"))
## save as a character string
mod.metric <- as.character(syntax.metric, groups.as.blocks = TRUE)
## convert from multigroup to multilevel
mod.metric <- gsub(pattern = "group:", replacement = "level:",
                   x = mod.metric, fixed = TRUE)
## fit model to data
fit.metric <- lavaan(mod.metric, data = Demo.twolevel, cluster = "cluster")
summary(fit.metric)

## End(Not run)

semTools documentation built on May 10, 2022, 9:05 a.m.