item.cfa | R Documentation |
This function is a wrapper function for conducting confirmatory factor analysis
with continuous and/or ordered-categorical indicators by calling the cfa
function in the R package lavaan.
item.cfa(..., data = NULL, model = NULL, rescov = NULL, hierarch = FALSE,
meanstructure = TRUE, ident = c("marker", "var", "effect"),
parameterization = c("delta", "theta"), ordered = NULL, cluster = NULL,
estimator = c("ML", "MLM", "MLMV", "MLMVS", "MLF", "MLR",
"GLS", "WLS", "DWLS", "WLSM", "WLSMV",
"ULS", "ULSM", "ULSMV", "DLS", "PML"),
missing = c("listwise", "pairwise", "fiml",
"two.stage", "robust.two.stage", "doubly.robust"),
print = c("all", "summary", "coverage", "descript", "fit", "est",
"modind", "resid"),
mod.minval = 6.63, resid.minval = 0.1, digits = 3, p.digits = 3,
as.na = NULL, write = NULL, append = TRUE, check = TRUE, output = TRUE)
... |
a matrix or data frame. If |
data |
a data frame when specifying one or more variables in the
argument |
model |
a character vector specifying a measurement model with
one factor, or a list of character vectors for specifying
a measurement model with more than one factor, e.g.,
|
rescov |
a character vector or a list of character vectors for
specifying residual covariances, e.g.
|
hierarch |
logical: if |
meanstructure |
logical: if |
ident |
a character string indicating the method used for
identifying and scaling latent variables, i.e.,
|
parameterization |
a character string indicating the method used for
identifying and scaling latent variables when indicators
are ordered, i.e., |
ordered |
if |
cluster |
either a character string indicating the variable name
of the cluster variable in |
estimator |
a character string indicating the estimator to be used
(see 'Details'). By default, |
missing |
a character string indicating how to deal with missing
data, i.e., |
print |
a character string or character vector indicating which
results to show on the console, i.e. |
mod.minval |
numeric value to filter modification indices and only
show modifications with a modification index value equal to
or higher than this minimum value. By default, modification
indices equal to or higher than 6.63 are printed. Note that a
modification index value of 6.63 is equivalent to a
significance level of |
resid.minval |
numeric value indicating the minimum absolute residual correlation coefficients and standardized means to highlight in boldface. By default, absolute residual correlation coefficients and standardized means equal to or higher than 0.1 are highlighted. Note that highlighting can be disabled by setting the minimum value to 1. |
digits |
an integer value indicating the number of decimal places to be used for displaying results. |
p.digits |
an integer value indicating the number of decimal places to be used for displaying the p-value. |
as.na |
a numeric vector indicating user-defined missing values,
i.e. these values are converted to |
write |
a character string naming a file for writing the output into
either a text file with file extension |
append |
logical: if |
check |
logical: if |
output |
logical: if |
The R package lavaan provides seven estimators that affect the estimation,
namely "ML", "GLS", "WLS", "DWLS", "ULS", "DLS", and "PML". All other options
for the argument estimator combine these estimators with various standard
error and chi-square test statistic computation. Note that the estimators also
differ in how missing values can be dealt with (e.g., listwise deletion,
pairwise deletion, or full information maximum likelihood, FIML).
"ML": Maximum likelihood parameter estimates with conventional standard errors
and conventional test statistic. For both complete and incomplete data
using pairwise deletion or FIML.
"MLM": Maximum likelihood parameter estimates with conventional
robust standard errors and a Satorra-Bentler scaled test statistic that
are robust to non-normality. For complete data only.
"MLMV": Maximum likelihood parameter estimates with conventional
robust standard errors and a mean and variance adjusted test statistic
using a scale-shifted approach that are robust to non-normality. For complete
data only.
"MLMVS": Maximum likelihood parameter estimates with conventional
robust standard errors and a mean and variance adjusted test statistic
using the Satterthwaite approach that are robust to non-normality. For complete
data only.
"MLF": Maximum likelihood parameter estimates with standard
errors approximated by first-order derivatives and conventional test statistic.
For both complete and incomplete data using pairwise deletion or FIML.
"MLR": Maximum likelihood parameter estimates with Huber-White
robust standard errors and a test statistic which is asymptotically equivalent
to the Yuan-Bentler T2* test statistic that are robust to non-normality
and non-independence of observations when specifying a cluster variable using
the argument cluster. For both complete and incomplete data using
pairwise deletion or FIML.
"GLS": Generalized least squares parameter estimates with
conventional standard errors and conventional test statistic that uses a
normal-theory based weight matrix. For complete data only.
"WLS": Weighted least squares parameter estimates (sometimes
called ADF estimation) with conventional standard errors and conventional
test statistic that uses a full weight matrix. For complete data only.
"DWLS": Diagonally weighted least squares parameter estimates
which use the diagonal of the weight matrix for estimation with conventional
standard errors and conventional test statistic. For both complete and
incomplete data using pairwise deletion.
"WLSM": Diagonally weighted least squares parameter estimates
which use the diagonal of the weight matrix for estimation, but use the
full weight matrix for computing the conventional robust standard errors
and a Satorra-Bentler scaled test statistic. For both complete and incomplete
data using pairwise deletion.
"WLSMV": Diagonally weighted least squares parameter estimates
which use the diagonal of the weight matrix for estimation, but use the
full weight matrix for computing the conventional robust standard errors
and a mean and variance adjusted test statistic using a scale-shifted
approach. For both complete and incomplete data using pairwise deletion.
"ULS": Unweighted least squares parameter estimates with
conventional standard errors and conventional test statistic. For both
complete and incomplete data using pairwise deletion.
"ULSM": Unweighted least squares parameter estimates with
conventional robust standard errors and a Satorra-Bentler scaled test
statistic. For both complete and incomplete data using pairwise deletion.
"ULSMV": Unweighted least squares parameter estimates with
conventional robust standard errors and a mean and variance adjusted
test statistic using a scale-shifted approach. For both complete and
incomplete data using pairwise deletion.
"DLS": Distributionally-weighted least squares parameter
estimates with conventional robust standard errors and a Satorra-Bentler
scaled test statistic. For complete data only.
"PML": Pairwise maximum likelihood parameter estimates
with Huber-White robust standard errors and a mean and variance adjusted
test statistic using the Satterthwaite approach. For both complete and
incomplete data using pairwise deletion.
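The relationship between the options of the argument estimator and the seven underlying estimators can be summarized in a small lookup table. The following is a minimal sketch in base R that only restates the descriptions above; the object base.estimator and the helper estimator.base() are hypothetical names introduced for illustration and are not part of misty or lavaan.

```r
# Hypothetical lookup table (not part of misty or lavaan): maps each option of
# the 'estimator' argument to the underlying estimator described above
base.estimator <- c(ML = "ML", MLM = "ML", MLMV = "ML", MLMVS = "ML",
                    MLF = "ML", MLR = "ML",
                    GLS = "GLS", WLS = "WLS",
                    DWLS = "DWLS", WLSM = "DWLS", WLSMV = "DWLS",
                    ULS = "ULS", ULSM = "ULS", ULSMV = "ULS",
                    DLS = "DLS", PML = "PML")

# Hypothetical helper: return the underlying estimator for a given option
estimator.base <- function(estimator) {

  estimator <- match.arg(estimator, choices = names(base.estimator))

  unname(base.estimator[estimator])

}

estimator.base("WLSMV")
estimator.base("MLR")

# Options grouped by their underlying estimator
split(names(base.estimator), base.estimator)
```

Options within the same group share the parameter estimates and differ only in the standard errors and the test statistic.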
The R package lavaan provides six methods for dealing with missing data:
"listwise": Listwise deletion, i.e., all cases with missing
values are removed from the data before conducting the analysis. This is
only valid if the data are missing completely at random (MCAR).
"pairwise": Pairwise deletion, i.e., each element of a
variance-covariance matrix is computed using cases that have data needed
for estimating that element. This is only valid if the data are missing
completely at random (MCAR).
"fiml": Full information maximum likelihood (FIML) method,
i.e., the likelihood is computed case by case using all available data from
that case. The FIML method is only applicable for the following estimators:
"ML", "MLF", and "MLR".
"two.stage": Two-stage maximum likelihood estimation, i.e.,
sample statistics are estimated using the EM algorithm in the first step. Then,
these estimated sample statistics are used as input for a regular analysis.
Standard errors and test statistics are adjusted correctly to reflect the
two-step procedure. The two-stage method is only applicable for the following
estimators: "ML", "MLF", and "MLR".
"robust.two.stage": Robust two-stage maximum likelihood
estimation, i.e., two-stage maximum likelihood estimation with standard
errors and a test statistic that are robust against non-normality. The robust
two-stage method is only applicable for the following estimators: "ML",
"MLF", and "MLR".
"doubly.robust": Doubly-robust method, only applicable for
pairwise maximum likelihood estimation (i.e., estimator = "PML").
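The restrictions listed above can be expressed as a simple compatibility check. This is a minimal sketch in base R; the helper missing.applicable() is a hypothetical name used for illustration only, it is not part of misty or lavaan, and it encodes only the restrictions stated in the list above.

```r
# Hypothetical helper (not part of misty or lavaan): checks the restrictions
# stated above; combinations not mentioned in the list are not restricted here
missing.applicable <- function(estimator, missing) {

  switch(missing,
         "fiml" = ,
         "two.stage" = ,
         "robust.two.stage" = estimator %in% c("ML", "MLF", "MLR"),
         "doubly.robust" = estimator == "PML",
         TRUE)

}

missing.applicable("MLR", missing = "fiml")
missing.applicable("WLSMV", missing = "fiml")
missing.applicable("PML", missing = "doubly.robust")
```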
In line with the R package lavaan, this function provides several checks for model convergence and model identification:
Degrees of freedom: An error message is printed if the number
of degrees of freedom is negative, i.e., the model is not identified.
Model convergence: An error message is printed if the
optimizer has not converged, i.e., results are most likely unreliable.
Standard errors: An error message is printed if the standard
errors could not be computed, i.e., the model might not be identified.
Variance-covariance matrix of the estimated parameters: A
warning message is printed if the variance-covariance matrix of the
estimated parameters is not positive definite, i.e., the smallest eigenvalue
of the matrix is smaller than zero or very close to zero.
Negative variances of observed variables: A warning message
is printed if the estimated variances of the observed variables are
negative.
Variance-covariance matrix of observed variables: A warning
message is printed if the estimated variance-covariance matrix of the
observed variables is not positive definite, i.e., the smallest eigenvalue
of the matrix is smaller than zero or very close to zero.
Negative variances of latent variables: A warning message
is printed if the estimated variances of the latent variables are
negative.
Variance-covariance matrix of latent variables: A warning
message is printed if the estimated variance-covariance matrix of the
latent variables is not positive definite, i.e., the smallest eigenvalue
of the matrix is smaller than zero or very close to zero.
Note that unlike the R package lavaan, the item.cfa function does
not provide any results when the degrees of freedom are negative, the model
has not converged, or standard errors could not be computed.
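Some of these checks can be approximated by hand on a fitted lavaan object. The following is a rough sketch that calls lavaan directly and assumes that the lavaan package is installed; it is not the code used by item.cfa, and the object names are chosen for illustration only.

```r
library(lavaan)

# Three-factor model for the HolzingerSwineford1939 data in lavaan model syntax
model <- 'visual  =~ x1 + x2 + x3
          textual =~ x4 + x5 + x6
          speed   =~ x7 + x8 + x9'

fit <- cfa(model, data = HolzingerSwineford1939)

# Degrees of freedom: a negative value indicates a model that is not identified
df <- unname(fitMeasures(fit, "df"))

# Model convergence: TRUE if the optimizer has converged
converged <- lavInspect(fit, "converged")

# Variance-covariance matrix of the estimated parameters: smallest eigenvalue,
# values below or very close to zero indicate a problem
eigen.vcov <- min(eigen(unclass(lavInspect(fit, "vcov")),
                        symmetric = TRUE, only.values = TRUE)$values)

# Variance-covariance matrix of the latent variables: negative variances and
# smallest eigenvalue
cov.lv <- unclass(lavInspect(fit, "cov.lv"))
neg.var.lv <- any(diag(cov.lv) < 0)
eigen.lv <- min(eigen(cov.lv, symmetric = TRUE, only.values = TRUE)$values)

c(df = df, converged = converged, eigen.vcov = eigen.vcov,
  neg.var.lv = neg.var.lv, eigen.lv = eigen.lv)
```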
The item.cfa function provides the chi-square
test, incremental fit indices (i.e., CFI and TLI), and absolute fit indices
(i.e., RMSEA and SRMR) to evaluate overall model fit. However, different
versions of the CFI, TLI, and RMSEA are provided depending on the estimator.
Unlike the R package lavaan, the different versions are labeled with
Standard, Scaled, and Robust in the output:
"Standard": CFI, TLI, and RMSEA without any non-normality
corrections. These fit measures based on the normal theory maximum
likelihood test statistic are sensitive to deviations from multivariate
normality of endogenous variables. Simulation studies by Brosseau-Liard
et al. (2012) and Brosseau-Liard and Savalei (2014) showed that the
uncorrected fit indices are affected by non-normality, especially at small
and medium sample sizes (e.g., n < 500).
"Scaled": Population-corrected robust CFI, TLI, and RMSEA
with ad hoc non-normality corrections that simply replace the maximum
likelihood test statistic with a robust test statistic (e.g., mean-adjusted
chi-square). These fit indices change the population value being estimated
depending on the degree of non-normality present in the data. Brosseau-Liard
et al. (2012) demonstrated that the ad hoc corrected RMSEA increasingly
accepts poorly fitting models as non-normality in the data increases, while
the effect of the ad hoc correction on the CFI and TLI is less predictable,
with non-normality making fit appear worse, better, or nearly unchanged
(Brosseau-Liard & Savalei, 2014).
"Robust": Sample-corrected robust CFI, TLI, and RMSEA
with non-normality corrections based on formulas provided by Li and Bentler
(2006) and Brosseau-Liard and Savalei (2014). These fit indices do not
change the population value being estimated and can be interpreted in the
same way as the uncorrected fit indices would be if the data were
normal.
In conclusion, the use of sample-corrected fit indices (Robust)
instead of population-corrected fit indices (Scaled) is recommended.
Note that when sample size is very small (e.g., n < 200), non-normality
correction does not appear to adjust fit indices sufficiently to counteract
the effect of non-normality (Brosseau-Liard & Savalei, 2014).
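To see the three versions side by side, the fit measures can be requested from a fitted lavaan object. This is a minimal sketch that calls lavaan directly and assumes that the lavaan package is installed; it also assumes that the labels Standard, Scaled, and Robust correspond to lavaan's unsuffixed, ".scaled", and ".robust" fit measures, which is not stated explicitly above.

```r
library(lavaan)

# Three-factor model for the HolzingerSwineford1939 data in lavaan model syntax
model <- 'visual  =~ x1 + x2 + x3
          textual =~ x4 + x5 + x6
          speed   =~ x7 + x8 + x9'

# Robust maximum likelihood so that scaled and robust fit indices are available
fit <- cfa(model, data = HolzingerSwineford1939, estimator = "MLR")

fit.ind <- unclass(fitMeasures(fit, c("cfi", "tli", "rmsea",
                                      "cfi.scaled", "tli.scaled", "rmsea.scaled",
                                      "cfi.robust", "tli.robust", "rmsea.robust")))

# Arrange the fit indices by version (assumed mapping of labels, see above)
fit.tab <- cbind(Standard = fit.ind[c("cfi", "tli", "rmsea")],
                 Scaled   = fit.ind[c("cfi.scaled", "tli.scaled", "rmsea.scaled")],
                 Robust   = fit.ind[c("cfi.robust", "tli.robust", "rmsea.robust")])
rownames(fit.tab) <- c("CFI", "TLI", "RMSEA")

round(fit.tab, 3)
```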
The item.cfa function provides modification indices and the residual
correlation matrix when requested by using the print argument. Modification
indices (aka score tests) are univariate Lagrange Multipliers (LM) representing
a chi-square statistic with a single degree of freedom. LM approximates the
amount by which the chi-square test statistic would decrease if a fixed or
constrained parameter were freely estimated (Kline, 2023). However,
(standardized) expected parameter change (EPC) values should also be inspected
since modification indices are sensitive to sample size. EPC values are an
estimate of how much the parameter would be expected to change if it were
freely estimated (Brown, 2023). The residual correlation matrix is computed by
separately converting the sample covariance and model-implied covariance
matrices to correlation matrices before calculating the differences between the
observed and model-implied correlations (i.e., type = "cor.bollen").
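Because a modification index is a chi-square statistic with one degree of freedom, a minimum value such as the one used in the mod.minval argument can be translated into a significance level and vice versa. The following base R lines are a small illustration; the values in mi are arbitrary examples.

```r
# Example modification index values (arbitrary values chosen for illustration)
mi <- c(3.84, 6.63, 10.83)

# p-value of a chi-square statistic with a single degree of freedom
p <- pchisq(mi, df = 1, lower.tail = FALSE)

round(p, digits = 3)

# Conversely, the critical chi-square value for a given significance level
crit <- qchisq(0.01, df = 1, lower.tail = FALSE)

round(crit, digits = 2)
```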
As a rule of thumb, absolute correlation residuals greater than 0.10 indicate
possible evidence for poor local fit, whereas absolute correlation residuals
smaller than 0.05 indicate a negligible degree of model misfit (Maydeu-Olivares, 2017).
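The computation of the residual correlation matrix and the rule of thumb can be reproduced by hand. This is a minimal sketch that calls lavaan directly and assumes that the lavaan package is installed; it is not the code used by item.cfa, and the object names are chosen for illustration only.

```r
library(lavaan)

# Three-factor model for the HolzingerSwineford1939 data in lavaan model syntax
model <- 'visual  =~ x1 + x2 + x3
          textual =~ x4 + x5 + x6
          speed   =~ x7 + x8 + x9'

fit <- cfa(model, data = HolzingerSwineford1939)

# Sample and model-implied covariance matrices of the observed variables
S     <- unclass(lavInspect(fit, "sampstat")$cov)
Sigma <- unclass(lavInspect(fit, "implied")$cov)

# Convert both matrices to correlation matrices separately, then subtract
res.cor <- cov2cor(S) - cov2cor(Sigma)

# Variable pairs with absolute correlation residuals greater than 0.10
flag <- which(abs(res.cor) > 0.10 & lower.tri(res.cor), arr.ind = TRUE)

data.frame(var1  = rownames(res.cor)[flag[, "row"]],
           var2  = colnames(res.cor)[flag[, "col"]],
           resid = round(res.cor[flag], digits = 3))
```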
There is no reliable connection between the size of diagnostic statistics
(i.e., modification indices and residuals) and the type or amount of model
misspecification since (1) diagnostic statistics are themselves affected by
misspecification, (2) misspecification in one part of the model distorts estimates
in other parts of the model (i.e., error propagation), and (3) equivalent models
have identical residuals but contradict the pattern of causal effects (Kline, 2023).
Note that, according to Kline (2023), "any report of the results without information
about the residuals is deficient" (p. 172).
Returns an object of class misty.object, which is a list with the following
entries:
call |
function call |
type |
type of analysis |
data |
matrix or data frame specified in |
args |
specification of function arguments |
model |
specified model |
model.fit |
fitted lavaan object ( |
check |
results of the convergence and model identification check |
result |
list with result tables, i.e., |
The function uses the functions cfa, lavInspect, lavTech,
modindices, parameterEstimates, and standardizedsolution
provided in the R package lavaan by Yves Rosseel (2012).
Takuya Yanagida takuya.yanagida@univie.ac.at
Brosseau-Liard, P. E., Savalei, V., & Li, L. (2012). An investigation of the sample performance of two nonnormality corrections for RMSEA. Multivariate Behavioral Research, 47, 904-930. https://doi.org/10.1080/00273171.2012.715252
Brosseau-Liard, P. E., & Savalei, V. (2014). Adjusting incremental fit indices for nonnormality. Multivariate Behavioral Research, 49, 460-470. https://doi.org/10.1080/00273171.2014.933697
Brown, T. A. (2023). Confirmatory factor analysis. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (2nd ed.) (pp. 361–379). The Guilford Press.
Kline, R. B. (2023). Principles and practice of structural equation modeling (5th ed.). Guilford Press.
Li, L., & Bentler, P. M. (2006). Robust statistical tests for evaluating the hypothesis of close fit of misspecified mean and covariance structural models. UCLA Statistics Preprint #506. University of California.
Maydeu-Olivares, A. (2017). Assessing the size of model misfit in structural equation models. Psychometrika, 82(3), 533–558. https://doi.org/10.1007/s11336-016-9552-7
Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48, 1-36. https://doi.org/10.18637/jss.v048.i02
item.alpha, item.omega, item.scores
## Not run:
# Load data set "HolzingerSwineford1939" in the lavaan package
data("HolzingerSwineford1939", package = "lavaan")
#----------------------------------------------------------------------------
# Measurement model with one factor
# Example 1a: Specification using the argument '...'
item.cfa(HolzingerSwineford1939[, c("x1", "x2", "x3")])
# Example 1b: Alternative specification using the 'data' argument
item.cfa(x1:x3, data = HolzingerSwineford1939)
# Example 1c: Alternative specification using the argument 'model'
item.cfa(HolzingerSwineford1939, model = c("x1", "x2", "x3"))
# Example 1d: Alternative specification using the 'data' and 'model' argument
item.cfa(., data = HolzingerSwineford1939, model = c("x1", "x2", "x3"))
# Example 1e: Alternative specification using the argument 'model'
item.cfa(HolzingerSwineford1939, model = list(visual = c("x1", "x2", "x3")))
# Example 1f: Alternative specification using the 'data' and 'model' argument
item.cfa(., data = HolzingerSwineford1939, model = list(visual = c("x1", "x2", "x3")))
#----------------------------------------------------------------------------
# Measurement model with three factors
# Example 2: Specification using the argument 'model'
item.cfa(HolzingerSwineford1939,
         model = list(visual = c("x1", "x2", "x3"),
                      textual = c("x4", "x5", "x6"),
                      speed = c("x7", "x8", "x9")))
#----------------------------------------------------------------------------
# Residual covariances
# Example 3a: One residual covariance
item.cfa(HolzingerSwineford1939,
         model = list(visual = c("x1", "x2", "x3"),
                      textual = c("x4", "x5", "x6"),
                      speed = c("x7", "x8", "x9")),
         rescov = c("x1", "x2"))
# Example 3b: Two residual covariances
item.cfa(HolzingerSwineford1939,
         model = list(visual = c("x1", "x2", "x3"),
                      textual = c("x4", "x5", "x6"),
                      speed = c("x7", "x8", "x9")),
         rescov = list(c("x1", "x2"), c("x4", "x5")))
#----------------------------------------------------------------------------
# Second-order factor model based on three first-order factors
# Example 4
item.cfa(HolzingerSwineford1939,
         model = list(visual = c("x1", "x2", "x3"),
                      textual = c("x4", "x5", "x6"),
                      speed = c("x7", "x8", "x9")),
         hierarch = TRUE)
#----------------------------------------------------------------------------
# Measurement model with ordered-categorical indicators
# Example 5
item.cfa(round(HolzingerSwineford1939[, c("x4", "x5", "x6")]), ordered = TRUE)
#----------------------------------------------------------------------------
# Cluster-robust standard errors
# Load data set "Demo.twolevel" in the lavaan package
data("Demo.twolevel", package = "lavaan")
# Example 6a: Specification using a variable in '...'
item.cfa(Demo.twolevel[, c("y4", "y5", "y6", "cluster")], cluster = "cluster")
# Example 6b: Specification of the cluster variable in 'cluster'
item.cfa(Demo.twolevel[, c("y4", "y5", "y6")], cluster = Demo.twolevel$cluster)
# Example 6c: Alternative specification using the 'data' argument
item.cfa(y4:y6, data = Demo.twolevel, cluster = "cluster")
#----------------------------------------------------------------------------
# Print argument
# Example 7a: Request all results
item.cfa(HolzingerSwineford1939[, c("x1", "x2", "x3")], print = "all")
# Example 7b: Request modification indices with value equal to or higher than 5
item.cfa(HolzingerSwineford1939[, c("x1", "x2", "x3", "x4")],
         print = "modind", mod.minval = 5)
#----------------------------------------------------------------------------
# lavaan summary of the estimated model
# Example 8
mod <- item.cfa(HolzingerSwineford1939[, c("x1", "x2", "x3")], output = FALSE)
lavaan::summary(mod$model.fit, standardized = TRUE, fit.measures = TRUE)
#----------------------------------------------------------------------------
# Write Results
# Example 9a: Write results into a text file
item.cfa(HolzingerSwineford1939[, c("x1", "x2", "x3")], write = "CFA.txt")
# Example 9b: Write results into an Excel file
item.cfa(HolzingerSwineford1939[, c("x1", "x2", "x3")], write = "CFA.xlsx")
# Alternative: save the result object and write it with write.result()
result <- item.cfa(HolzingerSwineford1939[, c("x1", "x2", "x3")], output = FALSE)
write.result(result, "CFA.xlsx")
## End(Not run)