PAVranking: Parcel-Allocation Variability in Model Ranking

Description

This function quantifies and assesses the consequences of parcel-allocation variability for model ranking of structural equation models (SEMs) that differ in their structural specification but share the same parcel-level measurement specification (see Sterba & Rights, 2016). This function calls parcelAllocation(), which can be used with only one SEM in isolation, to fit two (assumed) nested models to each of a specified number of random item-to-parcel allocations. Output includes summary information about the distribution of model selection results (including plots) and the distribution of results for each model individually, across allocations within-sample. Note that this function can also be used when selecting among more than two competing structural models (see the instructions below involving the iseed= argument).

Usage

PAVranking(model0, model1, data, parcel.names, item.syntax, nAlloc = 100,
  fun = "sem", alpha = 0.05, bic.crit = 10, fit.measures = c("chisq",
  "df", "cfi", "tli", "rmsea", "srmr", "logl", "aic", "bic", "bic2"), ...,
  show.progress = FALSE, iseed = 12345, warn = FALSE)

Arguments

model0, model1

lavaan::lavaan() model syntax specifying nested models (model0 within model1) to be fitted to the same parceled data. Note that there can be a mixture of items and parcels (even within the same factor), in case certain items should never be parceled. Can be a character string or parameter table. Also see lavaan::lavaanify() for more details.

data

A data.frame containing all observed variables appearing in model0= and model1=, as well as those in the item.syntax= used to create parcels. If the data have missing values, multiple imputation before parceling is recommended: submit a stacked data set (with a variable for the imputation number, so they can be separated later) to parcelAllocation() with do.fit=FALSE (the Examples show such a call) to return the list of data.frames (one per allocation), each of which is a stacked, multiply imputed data set with parcels created using the same allocation scheme.

parcel.names

character vector containing names of all parcels appearing as indicators in model0= or model1=.

item.syntax

lavaan::lavaan() model syntax specifying the model that would be fit to all of the unparceled items, including items that should be randomly allocated to parcels appearing in model0= and model1=.

nAlloc

The number of random item-to-parcel allocations to generate.

fun

character string indicating the name of the lavaan::lavaan() function used to fit model0= and model1= to data=. Can only take the values "lavaan", "sem", "cfa", or "growth".

alpha

Alpha level used as criterion for significance.

bic.crit

Criterion for assessing evidence in favor of one model over another. See Raftery (1995) for guidelines (default is "very strong evidence" in favor of the model with lower BIC).

fit.measures

character vector containing names of fit measures to request from each fitted lavaan::lavaan model. See the output of lavaan::fitMeasures() for a list of available measures.

...

Additional arguments to be passed to lavaan::lavaanList(). See also lavaan::lavOptions().

show.progress

If TRUE, show a utils::txtProgressBar() indicating how fast each model-fitting iterates over allocations.

iseed

(Optional) Random seed used for parceling items. When the same random seed is specified and the program is re-run, the same allocations will be generated. The iseed= argument can be used to assess parcel-allocation variability in model ranking when considering more than two models. For each pair of models under comparison, the program should be rerun using the same random seed. Doing so ensures that multiple model comparisons will employ the same set of parcel datasets. Note: When using parallel options, you must first run RNGkind("L'Ecuyer-CMRG") in the R console, so that the seed will be controlled across cores.

warn

Whether to print warnings when fitting models to each allocation.
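The advice under iseed= about comparing more than two models can be sketched as follows. This is only an illustration under stated assumptions: mod2 (equal loadings within each factor) is a hypothetical third model that does not appear in this help page, nAlloc is reduced to 20 purely to keep the run short, and the calls are skipped unless semTools (with its simParcel example data) is installed.

```r
## Item-level syntax and parcel names shared by every comparison
item.syntax  <- c(paste0("f1 =~ f1item", 1:9),
                  paste0("f2 =~ f2item", 1:9))
parcel.names <- paste0("par", 1:6)

## mod0 and mod1 follow the Examples section
mod0 <- ' f1 =~ par1 + par2 + par3
          f2 =~ par4 + par5 + par6
          f1 ~~ 0*f2 '
mod1 <- ' f1 =~ par1 + par2 + par3
          f2 =~ par4 + par5 + par6 '
## HYPOTHETICAL third model (illustration only): equal loadings per factor
mod2 <- ' f1 =~ a*par1 + a*par2 + a*par3
          f2 =~ b*par4 + b*par5 + b*par6 '

shared.seed <- 12345  # reuse the SAME seed for every pair of models

if (requireNamespace("semTools", quietly = TRUE)) {
  library(semTools)
  ## Pair 1: mod0 nested within mod1
  out.0v1 <- PAVranking(model0 = mod0, model1 = mod1, data = simParcel,
                        nAlloc = 20, parcel.names = parcel.names,
                        item.syntax = item.syntax, iseed = shared.seed,
                        std.lv = TRUE)
  ## Pair 2: mod2 nested within mod1, using the same seed as pair 1
  out.2v1 <- PAVranking(model0 = mod2, model1 = mod1, data = simParcel,
                        nAlloc = 20, parcel.names = parcel.names,
                        item.syntax = item.syntax, iseed = shared.seed,
                        std.lv = TRUE)
  ## Inspect which comparison summaries were returned for each pair
  print(names(out.0v1$model0.v.model1))
  print(names(out.2v1$model0.v.model1))
}
```

Because both calls use the same iseed= value, the documentation above implies that both comparisons are based on the same set of parceled data sets, so their summaries can be read side by side.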

Details

This is based on a SAS macro ParcelAlloc (Sterba & MacCallum, 2010). The PAVranking() function produces results discussed in Sterba and Rights (2016) relevant to the assessment of parcel-allocation variability in model selection and model ranking. Specifically, the PAVranking() function first calls parcelAllocation() to generate a given number (nAlloc=) of item-to-parcel allocations, fitting both specified models to each allocation, and providing summaries of PAV for each model. Additionally, PAVranking() provides the following new summaries:

  • PAV in model selection index values and model ranking between model0 and model1.

  • The proportion of allocations that converged and the proportion of proper solutions (results are summarized only for allocations in which the solutions both converged and were proper).

For further details on the benefits of the random allocation of items to parcels, see Sterba (2011) and Sterba and MacCallum (2010).
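To make the idea of random item-to-parcel allocation concrete, here is a minimal base-R sketch. It is not the code used by parcelAllocation(): the helper allocate_once() is hypothetical, the data are simulated noise, and forming each parcel as the mean of its items is an assumption of this sketch.

```r
set.seed(12345)

## Simulated item-level data: 200 cases, 9 items for one factor
n <- 200
items <- as.data.frame(matrix(rnorm(n * 9), nrow = n,
                              dimnames = list(NULL, paste0("f1item", 1:9))))

## HYPOTHETICAL helper: one random allocation of items to equal-sized parcels
allocate_once <- function(items, n.parcels) {
  k <- ncol(items)
  ## shuffle a balanced vector of parcel labels (here 3 items per parcel)
  parcel.id <- sample(rep(seq_len(n.parcels), length.out = k))
  ## each parcel score is the mean of the items assigned to it
  parcels <- sapply(seq_len(n.parcels), function(p) {
    rowMeans(items[, parcel.id == p, drop = FALSE])
  })
  colnames(parcels) <- paste0("par", seq_len(n.parcels))
  list(assignment = setNames(parcel.id, names(items)),
       data = as.data.frame(parcels))
}

## Several allocations of the same items yield different parceled data sets
allocations <- lapply(1:5, function(i) allocate_once(items, n.parcels = 3))

## Resetting the seed reproduces the same allocation
set.seed(999); a1 <- allocate_once(items, n.parcels = 3)$assignment
set.seed(999); a2 <- allocate_once(items, n.parcels = 3)$assignment
identical(a1, a2)
```

Fitting the same model to each element of allocations would give one set of fit results per allocation, and the spread of those results is the parcel-allocation variability that this function summarizes.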

To test whether nested models have equivalent fit, results can be pooled across allocations using the same methods available for pooling results across multiple imputations of missing data (see Examples).

Note: This function requires the lavaan package. Missing data must be coded as NA. If the function returns "Error in plot.new() : figure margins too large", the user may need to increase the size of the plot window (e.g., in RStudio) and rerun the function.

Value

A list with 3 elements. The first two (model0.results and model1.results) are results returned by parcelAllocation() for model0 and model1, respectively. The third element (model0.v.model1) is a list of model-comparison results, including the following:

LRT_Summary:

The average likelihood ratio test across allocations, as well as the SD, minimum, maximum, range, and the proportion of allocations for which the test was significant.

Fit_Index_Differences:

Differences in fit indices, organized by what proportion favored each model and among those, what the average difference was.

Favored_by_BIC:

The proportion of allocations in which each model met the criterion (bic.crit) for a substantial difference in fit.

Convergence_Summary:

The proportion of allocations in which each model (and both models) converged on a solution.

Histograms are also printed to the current plot-output device.
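The kinds of summaries described above can be illustrated with a small base-R sketch. It uses made-up test statistics and BIC values rather than fitted models, the object names are hypothetical, and whether the function applies the criterion with a strict or non-strict inequality is an assumption here, so this should not be read as the internal implementation.

```r
set.seed(2024)

nAlloc   <- 100   # number of allocations
alpha    <- 0.05  # significance criterion for the LRT
bic.crit <- 10    # required BIC difference
df.diff  <- 1     # difference in degrees of freedom between the models

## MADE-UP results, one value per allocation (illustration only)
lrt  <- rchisq(nAlloc, df = df.diff, ncp = 4)
pval <- pchisq(lrt, df = df.diff, lower.tail = FALSE)
bic0 <- rnorm(nAlloc, mean = 5010, sd = 6)
bic1 <- rnorm(nAlloc, mean = 5000, sd = 6)

## Distribution of the likelihood ratio test across allocations
lrt.summary <- c(mean     = mean(lrt),
                 sd       = sd(lrt),
                 min      = min(lrt),
                 max      = max(lrt),
                 range    = diff(range(lrt)),
                 prop.sig = mean(pval < alpha))

## Proportion of allocations in which each model is favored by at least
## bic.crit points (the model with the lower BIC is the favored one)
bic.diff <- bic0 - bic1  # positive values favor model1
favored.by.bic <- c(model0 = mean(bic.diff <= -bic.crit),
                    model1 = mean(bic.diff >=  bic.crit))

round(lrt.summary, 3)
favored.by.bic
```

The two proportions in favored.by.bic need not sum to 1, because allocations whose BIC difference is smaller than bic.crit in absolute value favor neither model by this criterion.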

Author(s)

Terrence D. Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)

References

Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111–163. doi:10.2307/271063

Sterba, S. K. (2011). Implications of parcel-allocation variability for comparing fit of item-solutions and parcel-solutions. Structural Equation Modeling, 18(4), 554–577. doi:10.1080/10705511.2011.607073

Sterba, S. K., & MacCallum, R. C. (2010). Variability in parameter estimates and model fit across repeated allocations of items to parcels. Multivariate Behavioral Research, 45(2), 322–358. doi:10.1080/00273171003680302

Sterba, S. K., & Rights, J. D. (2016). Accounting for parcel-allocation variability in practice: Combining sources of uncertainty and choosing the number of allocations. Multivariate Behavioral Research, 51(2–3), 296–313. doi:10.1080/00273171.2016.1144502

Sterba, S. K., & Rights, J. D. (2017). Effects of parceling on model selection: Parcel-allocation variability in model ranking. Psychological Methods, 22(1), 47–68. doi:10.1037/met0000067

See Also

parcelAllocation() for fitting a single model, poolMAlloc() for choosing the number of allocations

Examples


## Specify the item-level model (if NO parcels were created)
## This must apply to BOTH competing models

item.syntax <- c(paste0("f1 =~ f1item", 1:9),
                 paste0("f2 =~ f2item", 1:9))
cat(item.syntax, sep = "\n")
## Below, we reduce the size of this same model by
## applying different parceling schemes

## Specify a 2-factor CFA with correlated factors, using 3-indicator parcels
mod1 <- '
f1 =~ par1 + par2 + par3
f2 =~ par4 + par5 + par6
'
## Specify a more restricted model with orthogonal factors
mod0 <- '
f1 =~ par1 + par2 + par3
f2 =~ par4 + par5 + par6
f1 ~~ 0*f2
'
## names of parcels (must apply to BOTH models)
(parcel.names <- paste0("par", 1:6))


## override default random-number generator to use parallel options
RNGkind("L'Ecuyer-CMRG")

PAVranking(model0 = mod0, model1 = mod1, data = simParcel, nAlloc = 100,
           parcel.names = parcel.names, item.syntax = item.syntax,
           # parallel = "multicore",   # parallel available on Mac/Linux
           std.lv = TRUE)       # any additional lavaan arguments



## POOL RESULTS by treating parcel allocations as multiple imputations.
## Details provided in Sterba & Rights (2016); see ?poolMAlloc.

## save list of data sets instead of fitting model yet
dataList <- parcelAllocation(mod0, # or mod1 (either uses same allocations)
                             data = simParcel, nAlloc = 100,
                             parcel.names = parcel.names,
                             item.syntax = item.syntax,
                             do.fit = FALSE)
## now fit each model to each data set
if(requireNamespace("lavaan.mi")){
  library(lavaan.mi)
  fit0 <- cfa.mi(mod0, data = dataList, std.lv = TRUE)
  fit1 <- cfa.mi(mod1, data = dataList, std.lv = TRUE)
  anova(fit0, fit1)           # Pooled test statistic comparing models.
  help(package = "lavaan.mi") # Find more methods for pooling results.
}




semTools documentation built on April 3, 2025, 9:23 p.m.