run.fitprop: Run fit propensity analyses
In falkcarl/ockhamSEM: Tools for studying fit propensity in structural equation modeling

run.fitprop

R Documentation

Run fit propensity analyses

Description

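Run fit propensity analyses: generate random correlation matrices, fit competing lavaan models to each of them, and save fit indices for later comparison.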

Usage

run.fitprop(
  ...,
  fit.measure = "srmr",
  rmethod = c("onion", "mcmc", "clustergen"),
  reps = 1000,
  onlypos = FALSE,
  seed = 1234,
  mcmc.args = list(),
  clustergen.args = list(),
  saveModel = FALSE,
  saveR = FALSE,
  cluster = NULL
)

Arguments

`...`	Fitted models of class lavaan whose fit propensity the user would like to compare.
`fit.measure`	Character vector that indicates which fit measure to extract from fitted models. Possible options include anything returned by `fitMeasures` from the lavaan package when applied to fitted models.
`rmethod`	String indicating the type of random correlation generation approach. Choices are `"onion"` (listed first in the usage above, and therefore the default), `"mcmc"`, and `"clustergen"`. See details.
`reps`	Number of random correlation matrices to generate for fit propensity analysis.
`onlypos`	Logical value indicating whether to generate correlation matrices with positive manifold (`TRUE`); i.e., only positive relationships among variables. Note that if there are many variables, generation of correlation matrices and fitting models to them will be very computationally intensive.
`seed`	Random number seed used by set.seed or parallel package.
`mcmc.args`	Named list of arguments that controls options for `"mcmc"` correlation matrix generation. See details.
`clustergen.args`	Named list of arguments that controls generation of correlation matrices if `"onion"` or `"clustergen"` is used. See details.
`saveModel`	Logical value indicating whether to save fitted models for later examination.
`saveR`	Logical value indicating whether to save randomly generated correlation matrices.
`cluster`	(Optional) A cluster created by `makeCluster` from the parallel package. If provided, computations will be parallelized as much as possible.

Details

Inspired by work by Preacher (2003, 2006) and Bonifay & Cai (2017), this function performs three steps to assess the fit propensity of competing structural equation models: 1. Randomly generate correlation (or covariance) matrices; 2. Fit models to each matrix; and 3. Save indices that can be used to evaluate model fit in subsequent summaries. Conceptually, models that exhibit better fit to such randomly generated data may have higher fit propensity, and are therefore potentially less parsimonious.
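The three steps can be illustrated with a toy sketch in base R. This is not how run.fitprop is implemented: the generator `random_cor`, the hand-computed `srmr` helper, and the two "models" (an independence model whose implied matrix is the identity, and a saturated model that reproduces any matrix exactly) are assumptions made purely for illustration, and the generator does not sample uniformly from the space of correlation matrices.

```r
set.seed(1234)
p <- 3       # number of variables
reps <- 200  # number of random correlation matrices

# Step 1: generate random correlation matrices.
# Simple stand-in generator (hypothetical helper, NOT uniform sampling).
random_cor <- function(p) {
  X <- matrix(rnorm(p * (p + 2)), ncol = p)
  cov2cor(crossprod(X))
}
R_list <- replicate(reps, random_cor(p), simplify = FALSE)

# Hand-computed SRMR between an observed and a model-implied
# correlation matrix (hypothetical helper for this sketch).
srmr <- function(R, implied) {
  idx <- lower.tri(R, diag = TRUE)
  sqrt(mean((R[idx] - implied[idx])^2))
}

# Step 2: "fit" two toy models to each matrix.
#   independence model: implied matrix is the identity
#   saturated model:    implied matrix equals the data
# Step 3: save the fit index for every replication.
fit_indep <- vapply(R_list, function(R) srmr(R, diag(p)), numeric(1))
fit_sat   <- vapply(R_list, function(R) srmr(R, R), numeric(1))

summary(fit_indep)  # varies across random matrices
summary(fit_sat)    # always 0: the saturated model fits any data
```

The saturated model fits every random matrix perfectly, which is the extreme case of high fit propensity and low parsimony; the independence model sits at the other extreme.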

Analyses are performed with the lavaan package, and fitted models of class lavaan (e.g., created with the cfa, sem, or lavaan functions) for the competing models must be passed as the initial arguments to the function. Currently, only single-group models and those relying on ML estimation are supported. Otherwise, the underlying options of the fitted lavaan models are re-used by run.fitprop for the fit propensity analyses. Saving the randomly generated matrices from Step 1 and the models fit in Step 2 is optional. Follow-up summaries of the results saved in Step 3 are provided by the plot.fitprop and summary.fitprop functions.

Generation of random correlation matrices is provided using several approaches. The "mcmc" algorithm implements a Markov Chain Monte Carlo approach and was ported from Fortran code in Preacher (2003). For details on the algorithm's actual implementation, see Preacher (2003), Falk and Muthukrishna (in press), or the source code for the mcmc function. If this algorithm is chosen, mcmc.args accepts a list that can modify some default settings. In particular, iter sets the total number of iterations to run (default = 5000000). If parallel processing is enabled, this number will be divided among the chains. miniter sets a minimum number of iterations per chain, so that using many processors does not lead to too few iterations per chain (default = 10000). jmpsize overrides the step size for each update to the candidate correlation matrix. Smaller step sizes typically lead to more acceptance and may be necessary for larger correlation matrices (the default jump size depends on the number of variables). In general, though, it becomes harder to get the MCMC algorithm to work well with many variables.
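As a rough illustration of how iter and miniter interact under parallel processing, the arithmetic might look like the sketch below. The helper `iters_per_chain` is hypothetical, and the exact rounding used internally is an assumption; consult the source of the mcmc function for the actual rule.

```r
# Hypothetical helper: split the total number of iterations across
# chains, but never go below the per-chain minimum.
# (Assumed rounding rule; not taken from the package source.)
iters_per_chain <- function(iter, n_chains, miniter) {
  max(ceiling(iter / n_chains), miniter)
}

# Defaults with 4 chains: 5000000 / 4 iterations per chain
iters_per_chain(iter = 5000000, n_chains = 4, miniter = 10000)  # 1250000

# Small iter with many chains: the minimum takes over
iters_per_chain(iter = 10000, n_chains = 8, miniter = 5000)     # 5000
```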

The "onion" method relies on the work of Joe (2006) and Lewandowski, Kurowicka, and Joe (2009); matrices are generated recursively, one variable at a time. The onion method is computationally more efficient than the MCMC algorithm. Under the hood, the genPositiveDefMat function in the clusterGeneration package is used, with default arguments of covMethod="onion", eta=1, and rangeVar=c(1,1). These arguments ensure that the onion method is used, that generation is uniform over the space of positive definite correlation matrices (but see the note on positive manifold below), and that variables have unit variances.
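For reference, a direct call to genPositiveDefMat with the arguments named above might look like the following sketch. It assumes the clusterGeneration package is installed; the choice of `p` and the seed are arbitrary.

```r
library(clusterGeneration)

set.seed(1234)
p <- 4  # number of variables (arbitrary choice for this sketch)

# Onion method, eta = 1, and variances fixed at 1 so that the
# result is a correlation matrix
out <- genPositiveDefMat(dim = p, covMethod = "onion",
                         eta = 1, rangeVar = c(1, 1))
R <- out$Sigma

round(R, 3)
diag(R)                                   # unit variances
min(eigen(R, symmetric = TRUE)$values)    # positive if R is positive definite
```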

An additional option, "clustergen", is provided for a direct interface with the genPositiveDefMat function in the clusterGeneration package. A named list can be passed to clustergen.args to override any defaults used by genPositiveDefMat; the user is referred to the documentation for that function. This allows, for example, generation using C-vines, generation of covariance matrices (i.e., variables that do not all have unit variances), and several other covariance/correlation matrix generation techniques.
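A sketch of what such an override could look like is given below, calling genPositiveDefMat directly with a named list of arguments. The particular values (C-vine generation, variances between 1 and 4) are arbitrary choices for illustration, and the clusterGeneration package is assumed to be installed.

```r
library(clusterGeneration)

set.seed(1234)
p <- 4  # number of variables (arbitrary choice for this sketch)

# Named list of overrides; a list of this form is what
# clustergen.args is described as accepting
args <- list(covMethod = "c-vine", eta = 1, rangeVar = c(1, 4))

# Direct call to genPositiveDefMat with those arguments
out <- do.call(genPositiveDefMat, c(list(dim = p), args))
S <- out$Sigma

round(S, 3)
diag(S)  # variances drawn from [1, 4], so S is a covariance matrix
```

With run.fitprop, the same list would be supplied as clustergen.args together with rmethod="clustergen".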

onlypos controls whether correlation matrices are restricted to positive correlations. The original MCMC algorithm by Preacher (2003, 2006) generated correlation matrices with positive manifold only (i.e., only positive correlations). The algorithm is easily changed to also allow negative correlations. The onion method and any functions from clusterGeneration by default generate matrices with both positive and negative correlations. To obtain matrices with positive manifold only, an ad-hoc correction is implemented for these latter approaches in which the matrix is transformed: R = (R+1)/2. To our knowledge, there is no guarantee that this will result in uniform sampling from the space of all correlation matrices with positive manifold, yet fit propensity results for some examples are very similar to those of the MCMC algorithm.
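The effect of the transformation R = (R+1)/2 can be checked with a small base R sketch. The random matrix here comes from a simple stand-in generator rather than the onion method, which is an assumption made only to keep the example self-contained.

```r
set.seed(1234)
p <- 4

# Stand-in random correlation matrix with mixed-sign correlations
X <- matrix(rnorm(p * (p + 2)), ncol = p)
R <- cov2cor(crossprod(X))

# Ad-hoc positive manifold correction
R_pos <- (R + 1) / 2

range(R[lower.tri(R)])          # original off-diagonal correlations
range(R_pos[lower.tri(R_pos)])  # all mapped into (0, 1)
diag(R_pos)                     # diagonal stays at 1
min(eigen(R_pos, symmetric = TRUE)$values)  # still positive
```

The result remains a valid correlation matrix because (R+1)/2 is the average of R and a matrix of ones, and the latter is positive semidefinite.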

Value

An object of class fitprop for which plot and summary methods are available. Some slots are listed in the Slots section.

Slots

fit_list: A list of the same length as the number of models being compared. Each element is a matrix with one column per entry of fit.measure, holding the values for all replications.
R: A list of the same length as the number of replications, containing all correlation matrices that were used for the fit propensity analysis. This slot is only populated if saveR is set to TRUE.
mod_list: A list of the same length as the number of models being compared. Each element contains a list of the same length as the number of replications and contains fitted lavaan models. Only populated if saveModel is set to TRUE.

References

Bonifay, W. E., & Cai, L. (2017). On the complexity of item response theory models. Multivariate Behavioral Research, 52(4), 465–484. http://doi.org/10.1080/00273171.2017.1309262

Falk, C. F., & Muthukrishna, M. (in press). Parsimony in model selection: Tools for assessing fit propensity. Psychological Methods.

Lewandowski, D., Kurowicka, D., & Joe, H. (2009). Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis, 100(9), 1989–2001. http://doi.org/10.1016/j.jmva.2009.04.008

Joe, H. (2006). Generating random correlation matrices based on partial correlations. Journal of Multivariate Analysis, 97(10), 2177–2189. http://doi.org/10.1016/j.jmva.2005.05.010

Preacher, K. J. (2003). The role of model complexity in the evaluation of structural equation models (PhD thesis). The Ohio State University.

Preacher, K. J. (2006). Quantifying parsimony in structural equation modeling. Multivariate Behavioral Research, 41(3), 227–259. http://doi.org/10.1207/s15327906mbr4103_1

Examples


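# Load required packages
library(ockhamSEM)
library(lavaan)   # sem()
library(parallel) # makeCluster(), stopCluster()

# Set up a covariance matrix to fit models to
p <- 3 # number of variables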
temp_mat <- diag(p) # identity matrix
colnames(temp_mat) <- rownames(temp_mat) <- paste0("V", seq(1, p))

# Define and fit two models using lavaan package
mod1a <- 'V3 ~ V1 + V2
  V1 ~~ 0*V2'
mod2a <- 'V3 ~ V1
  V2 ~ V3'

mod1a.fit <- sem(mod1a, sample.cov=temp_mat, sample.nobs=500)
mod2a.fit <- sem(mod2a, sample.cov=temp_mat, sample.nobs=500)

# Run fit propensity analysis in a variety of ways

# Onion approach, only positive correlation matrices, save srmr
res <- run.fitprop(mod1a.fit, mod2a.fit, fit.measure="srmr",
  rmethod="onion",reps=1000,onlypos=TRUE)
summary(res)

# Onion approach, save several fit indices
res <- run.fitprop(mod1a.fit, mod2a.fit, fit.measure=c("srmr","cfi","rmsea"),
  rmethod="onion",reps=1000)
summary(res)

# mcmc approach, with parallel processing (2 cores)
# Save and then access correlation matrices and fitted models
# Note: this will take a very long time because the default number
# of iterations is large (5000000)
cl<-makeCluster(2)
res <- run.fitprop(mod1a.fit, mod2a.fit, fit.measure="srmr",
  rmethod="mcmc",reps=1000,cluster=cl,saveModel=TRUE,saveR=TRUE)
stopCluster(cl)
summary(res)
res$R[[1]] # Correlation matrix for first replication
res$fit_list[[1]] # Saved fit indices for first model
res$mod_list[[1]][[2]] # Fitted lavaan model for first model, second replication

# mcmc approach, overriding defaults
ctrl<-list(
  iter = 10000, # total number of iterations
  miniter = 5000, # but, min number of iterations per chain
  jmpsize = .3
)
cl<-makeCluster(2)
res <- run.fitprop(mod1a.fit, mod2a.fit, fit.measure="srmr",
  rmethod="mcmc",reps=1000,cluster=cl, mcmc.args=ctrl)
stopCluster(cl)
summary(res)

falkcarl/ockhamSEM documentation built on June 23, 2024, 4:25 a.m.