kfa: Conducts k-fold cross validation for factor analysis
In kfa: K-Fold Cross Validation for Factor Analysis

kfa	R Documentation

Description

The function splits the data into k folds where each fold contains training data and test data. For each fold, exploratory factor analyses (EFAs) are run on the training data. The structure for each model is transformed into lavaan-compatible confirmatory factor analysis (CFA) syntax. The CFAs are then run on the test data.
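
As a rough illustration of the fold logic described above, the following base R sketch assigns cases to k folds and returns training and test row indices for each. `make_folds()` is a hypothetical helper written for this example; the source does not describe how the package draws its folds internally, so treat this as one plausible scheme rather than the package's method.

```r
# Illustrative only: randomly assign n cases to k folds of near-equal size.
# make_folds() is a hypothetical helper, not a function from the kfa package.
make_folds <- function(n, k, seed = 101) {
  set.seed(seed)
  # repeat the fold labels 1..k out to length n, then shuffle them across cases
  fold_id <- sample(rep(seq_len(k), length.out = n))
  # for fold f, the held-out cases are the test rows; all others are training rows
  lapply(seq_len(k), function(f) {
    list(train = which(fold_id != f), test = which(fold_id == f))
  })
}

folds <- make_folds(n = 900, k = 3)
test_sizes <- lengths(lapply(folds, `[[`, "test"))
test_sizes  # 300 300 300
```

In this scheme the EFAs would be fit to `data[folds[[f]]$train, ]` and the resulting CFAs to `data[folds[[f]]$test, ]`, so no case is used for both steps within a fold.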

Usage

kfa(
  data,
  variables = names(data),
  k = NULL,
  m = floor(length(variables)/4),
  seed = 101,
  cores = NULL,
  custom.cfas = NULL,
  power.args = list(rmsea0 = 0.05, rmseaA = 0.08),
  rotation = "oblimin",
  simple = TRUE,
  min.loading = NA,
  single.item = "none",
  ordered = FALSE,
  estimator = NULL,
  missing = "listwise",
  ...
)

Arguments

`data`	a `data.frame` containing the variables (i.e., items) to factor analyze
`variables`	character vector of column names in `data` indicating the variables to factor analyze. Default is to use all columns.
`k`	number of folds in which to split the data. Default is `NULL`, in which case k is determined via `find_k`.
`m`	integer; maximum number of factors to extract. Default is `floor(length(variables)/4)`, i.e., one factor per 4 items.
`seed`	integer passed to `set.seed` when randomly selecting cases for each fold.
`cores`	integer; number of CPU cores to use for parallel processing. Default is `detectCores` - 1.
`custom.cfas`	a single object or named `list` of `lavaan` syntax specifying custom factor model(s).
`power.args`	named `list` of arguments to pass to `find_k` and `findRMSEAsamplesize` when conducting power analysis to determine `k`.
`rotation`	character (case-sensitive); any rotation method listed in `rotations` in the `GPArotation` package. Default is "oblimin".
`simple`	logical; Should the perfect simple structure be returned (default) when converting EFA results to CFA syntax? If `FALSE`, items can cross-load on multiple factors.
`min.loading`	numeric between 0 and 1 indicating the minimum (absolute) value of the loading for a variable on a factor when converting EFA results to CFA syntax. Must be specified when `simple = FALSE`.
`single.item`	character indicating how single-item factors should be treated. Use `"keep"` to keep them in the model when generating the CFA syntax, or `"none"` (default) to skip generating CFA syntax for that model, in which case `""` is returned.
`ordered`	logical; Should items be treated as ordinal and the polychoric correlations used in the factor analysis? When `FALSE` (default) the Pearson correlation matrix is used. A character vector of item names is also accepted to prompt estimation of the polychoric correlation matrix.
`estimator`	if `ordered = FALSE`, the default is "MLMVS". If `ordered = TRUE`, the default is "WLSMV". See `lavOptions` for other options.
`missing`	default is "listwise". See `lavOptions` for other options.
`...`	other arguments passed to `lavaan` functions. See `lavOptions`.
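
To make the default for `m` concrete, the short sketch below evaluates the expression from the Usage section for two illustrative item counts. The variable names are made up for the example.

```r
# Default m = floor(length(variables) / 4), as written in the Usage section.
# The item names below are illustrative only.
variables_20 <- paste0("x", 1:20)
m_20 <- floor(length(variables_20) / 4)
m_20  # 5

variables_18 <- paste0("x", 1:18)
m_18 <- floor(length(variables_18) / 4)
m_18  # 4
```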

Details

For custom.cfas to be tested alongside the EFA-identified structures, each model supplied in custom.cfas must be written in lavaan-compatible syntax and must include all variables.
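
One quick way to check the requirement above before calling kfa is to confirm that every variable name appears somewhere in the model string. The sketch below does this with base R string handling. `covers_all_variables()` is a hypothetical helper, not part of kfa or lavaan, and it only checks that each name occurs as a token; it does not validate the syntax itself.

```r
# covers_all_variables() is a hypothetical helper, not part of kfa or lavaan.
covers_all_variables <- function(syntax, variables) {
  # split the syntax on anything that cannot be part of a variable name
  tokens <- unlist(strsplit(syntax, "[^A-Za-z0-9._]+"))
  all(variables %in% tokens)
}

variables <- paste0("x", 1:20)  # illustrative item names

# a 2-factor model that uses all 20 items
full <- paste0("f1 =~ ", paste(variables[1:10], collapse = " + "),
               "\nf2 =~ ", paste(variables[11:20], collapse = " + "))
# a model that leaves out items x11 to x20
partial <- paste0("f1 =~ ", paste(variables[1:10], collapse = " + "))

covers_all_variables(full, variables)     # TRUE
covers_all_variables(partial, variables)  # FALSE
```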

Deciding an appropriate m can be difficult, but is consequential for the possible factor structures to examine, the power analysis to determine k, and overall computation time. The n_factors function in the parameters package can assist with this decision.

When converting EFA results to CFA syntax (via efa_cfa_syntax), the simple structure is defined as each variable loading onto a single factor. This is determined using the largest factor loading for each variable. When simple = FALSE, variables are allowed to cross-load on multiple factors. In this case, all pathways with loadings above min.loading are retained. However, allowing cross-loading variables can result in model under-identification. The efa_cfa_syntax function conducts an identification check (i.e., identified = TRUE), and under-identified models are not run in the CFA portion of the analysis.
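
The two conversion rules described above can be sketched in base R as follows. `loadings_to_syntax()` is a hypothetical stand-in written for illustration; it is not the package's efa_cfa_syntax function, it performs no identification check, and it does not handle single-item or empty factors. The loading matrix is made up.

```r
# loadings_to_syntax() is a hypothetical stand-in, NOT the package's efa_cfa_syntax().
loadings_to_syntax <- function(loadings, simple = TRUE, min.loading = NA) {
  if (!simple && is.na(min.loading)) {
    stop("min.loading must be specified when simple = FALSE")
  }
  abs_l <- abs(loadings)
  if (simple) {
    # simple structure: keep only each variable's largest absolute loading
    keep <- abs_l == apply(abs_l, 1, max)
  } else {
    # cross-loadings allowed: keep every loading above the threshold
    keep <- abs_l > min.loading
  }
  # one lavaan "=~" line per factor, listing the retained variables
  lines <- vapply(colnames(loadings), function(f) {
    paste0(f, " =~ ", paste(rownames(loadings)[keep[, f]], collapse = " + "))
  }, character(1))
  paste(lines, collapse = "\n")
}

# made-up loadings for 4 items on 2 factors
L <- matrix(c(.70, .10,
              .65, .35,
              .05, .80,
              .20, .60),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("x", 1:4), c("f1", "f2")))

simple_syntax <- loadings_to_syntax(L)
cross_syntax  <- loadings_to_syntax(L, simple = FALSE, min.loading = .30)
cat(simple_syntax, "\n\n", cross_syntax, "\n", sep = "")
```

With these loadings, the simple structure assigns x1 and x2 to f1 and x3 and x4 to f2, while the threshold of .30 additionally lets x2 load on f2.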

Value

An object of class "kfa", which is a four-element list:

`cfas`	lavaan CFA objects for each of the k folds
`cfa.syntax`	syntax used to produce the CFA objects
`model.names`	vector of names for the CFA objects
`efa.structures`	all factor structures identified in the EFA

Examples


# simulate data based on a 3-factor model with standardized loadings
sim.mod <- "f1 =~ .7*x1 + .8*x2 + .3*x3 + .7*x4 + .6*x5 + .8*x6 + .4*x7
                f2 =~ .8*x8 + .7*x9 + .6*x10 + .5*x11 + .5*x12 + .7*x13 + .6*x14
                f3 =~ .6*x15 + .5*x16 + .9*x17 + .4*x18 + .7*x19 + .5*x20
                f1 ~~ .2*f2
                f2 ~~ .2*f3
                f1 ~~ .2*f3
                x9 ~~ .2*x10"
set.seed(1161)
sim.data <- simstandard::sim_standardized(sim.mod, n = 900,
                                          latent = FALSE,
                                          errors = FALSE)[c(2:9,1,10:20)]

# include a custom 2-factor model
custom2f <- paste0("f1 =~ ", paste(colnames(sim.data)[1:10], collapse = " + "),
                   "\nf2 =~ ",paste(colnames(sim.data)[11:20], collapse = " + "))


# run the k-fold cross validation (attach kfa first if it is not already loaded)
library(kfa)
mods <- kfa(data = sim.data,
            k = NULL, # prompts power analysis to determine number of folds
            cores = 2,
            custom.cfas = custom2f)

kfa documentation built on July 9, 2023, 5:44 p.m.