HULL: Hull method for determining the number of factors to retain
In mdsteiner/EFAdiff: Fast and Flexible Implementations of Exploratory Factor Analysis Tools

HULL	R Documentation

Hull method for determining the number of factors to retain

Description

Implementation of the Hull method suggested by Lorenzo-Seva, Timmerman, and Kiers (2011), with an extension to principal axis factoring. See Details for how to enable parallel processing.

Usage

HULL(
  x,
  N = NA,
  n_fac_theor = NA,
  method = c("PAF", "ULS", "ML"),
  gof = c("CAF", "CFI", "RMSEA"),
  eigen_type = c("SMC", "PCA", "EFA"),
  use = c("pairwise.complete.obs", "all.obs", "complete.obs", "everything",
    "na.or.complete"),
  cor_method = c("pearson", "spearman", "kendall"),
  n_datasets = 1000,
  percent = 95,
  decision_rule = c("means", "percentile", "crawford"),
  n_factors = 1,
  ...
)

Arguments

`x`	matrix or data.frame. Raw data (as a data.frame or matrix) or a correlation matrix.
`N`	numeric. Number of cases in the data. This is passed to PARALLEL. Only has to be specified if x is a correlation matrix, otherwise it is determined based on the dimensions of x.
`n_fac_theor`	numeric. Theoretical number of factors to retain. The upper bound J used in the Hull method is the larger of this number and the number of factors suggested by `PARALLEL` plus one (see Details).
`method`	character. The estimation method to use. One of `"PAF"`, `"ULS"`, or `"ML"`, for principal axis factoring, unweighted least squares, and maximum likelihood, respectively.
`gof`	character. The goodness of fit index to use. Either `"CAF"`, `"CFI"`, or `"RMSEA"`, or any combination of them. If `method = "PAF"` is used, only the CAF can be used as goodness of fit index. For details on the CAF, see Lorenzo-Seva, Timmerman, and Kiers (2011).
`eigen_type`	character. The type of eigenvalues to use in the parallel analysis. One of `"SMC"`, `"PCA"`, or `"EFA"`. With `"SMC"` (default), the diagonal of the correlation matrices is replaced by the squared multiple correlations (SMCs) of the indicators. With `"PCA"`, the diagonal values of the correlation matrices are left at 1. With `"EFA"`, eigenvalues are found on the correlation matrices with the final communalities of an EFA solution as the diagonal. This is passed to `PARALLEL`.
`use`	character. Passed to `stats::cor` if raw data is given as input. Default is `"pairwise.complete.obs"`.
`cor_method`	character. Passed to `stats::cor`. Default is `"pearson"`.
`n_datasets`	numeric. The number of datasets to simulate. Default is 1000. This is passed to `PARALLEL`.
`percent`	numeric. A vector of percentiles to take the simulated eigenvalues from. Default is 95. This is passed to `PARALLEL`.
`decision_rule`	character. Which rule to use to determine the number of factors to retain. Default is `"means"`, which uses the average simulated eigenvalues. `"percentile"` uses the percentiles specified in `percent`. `"crawford"` uses the 95th percentile for the first factor and the mean afterwards (based on Crawford et al., 2010). This is passed to `PARALLEL`.
`n_factors`	numeric. Number of factors to extract if `"EFA"` is included in `eigen_type`. Default is 1. This is passed to `PARALLEL`.
`...`	Further arguments passed to `EFA`, including the `EFA` calls made within `PARALLEL`.
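
To make the `"SMC"` and `"PCA"` options of `eigen_type` more concrete, the following base R sketch computes both kinds of eigenvalues for a small correlation matrix. The matrix `R` and all object names are hypothetical, and the code only mirrors the description above; it is not the package's internal implementation. The `"EFA"` option is left out because it requires a fitted EFA solution.

```r
# Toy 3 x 3 correlation matrix (hypothetical values, for illustration only)
R <- matrix(c(1.0, 0.5, 0.4,
              0.5, 1.0, 0.3,
              0.4, 0.3, 1.0), nrow = 3, byrow = TRUE)

# "PCA": the diagonal is left at 1
eig_pca <- eigen(R, symmetric = TRUE, only.values = TRUE)$values

# "SMC": the diagonal is replaced by the squared multiple correlations,
# obtained here from the inverse of the correlation matrix
smc <- 1 - 1 / diag(solve(R))
R_smc <- R
diag(R_smc) <- smc
eig_smc <- eigen(R_smc, symmetric = TRUE, only.values = TRUE)$values

eig_pca
eig_smc
```

Because the SMCs are smaller than 1, the eigenvalues of the reduced matrix sum to less than the number of indicators, which is why the two options can suggest different numbers of factors.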

Details

The Hull method aims to find a model with an optimal balance between model fit and number of parameters. That is, it aims to retrieve only major factors (Lorenzo-Seva, Timmerman, & Kiers, 2011). To this end, it performs the following steps (Lorenzo-Seva, Timmerman, & Kiers, 2011, p. 351):

1. Parallel analysis is performed and one is added to the identified number of factors (this number is denoted J). J is taken as an upper bound of the number of factors to retain in the Hull method. Alternatively, a theoretical number of factors can be entered. In this case, J is set to whichever of these two numbers (the one from parallel analysis plus one, or the one based on theory) is higher.
2. For each solution with 0 to J factors, the goodness of fit (CAF, RMSEA, or CFI) and the degrees of freedom (df) are computed.
3. The solutions are ordered according to their df.
4. Solutions that are not on the boundary of the convex hull are eliminated (see Lorenzo-Seva, Timmerman, & Kiers, 2011, for details).
5. All triplets of adjacent solutions are considered consecutively. The middle solution is excluded if its point is below or on the line connecting its neighbors in a plot of the goodness of fit versus the degrees of freedom.
6. Step 5 is repeated until no solution can be excluded.
7. The st values of the "hull" solutions are determined.
8. The solution with the highest st value is selected.
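
The steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `hull_sketch` is a hypothetical helper, the fit and df values are made up, step 4 is simplified to dropping solutions whose fit does not exceed that of every solution with fewer df, and the st value is assumed to be the ratio of the fit gain per df before a solution to the fit gain per df after it.

```r
# Step 1 (hypothetical numbers): upper bound J from parallel analysis and theory
n_fac_pa <- 4        # assumed result of a parallel analysis
n_fac_theor <- NA    # no theoretical number supplied
J <- max(n_fac_pa + 1, n_fac_theor, na.rm = TRUE)

# Hypothetical helper, not part of the package
hull_sketch <- function(n_fac, f, df) {
  s <- data.frame(n_fac = n_fac, f = f, df = df)
  s <- s[order(s$df), ]                          # step 3: order by df
  # step 4 (simplified): keep only solutions that improve on all previous fits
  keep <- c(TRUE, s$f[-1] > cummax(s$f)[-nrow(s)])
  s <- s[keep, ]
  # steps 5 and 6: drop middle solutions on or below the line between neighbors
  repeat {
    n <- nrow(s)
    if (n < 3) break
    out <- NA
    for (i in 2:(n - 1)) {
      f_line <- s$f[i - 1] + (s$f[i + 1] - s$f[i - 1]) *
        (s$df[i] - s$df[i - 1]) / (s$df[i + 1] - s$df[i - 1])
      if (s$f[i] <= f_line) { out <- i; break }
    }
    if (is.na(out)) break
    s <- s[-out, ]
  }
  # step 7: st values for all solutions that have two neighbors
  n <- nrow(s)
  s$st <- NA_real_
  if (n >= 3) {
    for (i in 2:(n - 1)) {
      s$st[i] <- ((s$f[i] - s$f[i - 1]) / (s$df[i] - s$df[i - 1])) /
        ((s$f[i + 1] - s$f[i]) / (s$df[i + 1] - s$df[i]))
    }
  }
  s
}

# Step 2 (made-up fit and df values for 0 to J factors)
hull <- hull_sketch(n_fac = 0:J,
                    f  = c(0.50, 0.75, 0.90, 0.92, 0.93, 0.925),
                    df = c(0, 10, 19, 27, 34, 40))

# Step 8: the solution with the highest st value
n_fac_selected <- hull$n_fac[which.max(hull$st)]
hull
n_fac_selected
```

In this toy example the five-factor solution is discarded in step 4 because it fits worse than the four-factor solution, and the first and last remaining solutions receive no st value because they lack a neighbor on one side.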

The `PARALLEL` function and the principal axis factoring for the different numbers of factors can be parallelized using the future framework by calling `future::plan`. The Examples section shows how to enable parallel processing.

Note that if `gof = "RMSEA"` is used, 1 - RMSEA is actually used to compare the different solutions, so that higher values indicate better fit, as with the CAF and CFI. The threshold of .05 thus becomes .95. This is necessary due to how the heuristic to locate the elbow of the hull works.
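
As a small illustration of this transformation, the following base R snippet uses made-up RMSEA values (all object names are hypothetical) to show that 1 - RMSEA increases as fit improves and that the .05 cutoff on the original scale corresponds to .95 on the transformed scale.

```r
# Hypothetical RMSEA values for solutions with 1 to 4 factors
rmsea <- c(0.12, 0.08, 0.04, 0.035)

# Transformed values used to compare solutions: higher means better fit
fit <- 1 - rmsea

# The same solutions pass the cutoff on either scale
below_cutoff <- rmsea < 0.05
above_transformed <- fit > 0.95

data.frame(n_fac = 1:4, rmsea, fit, below_cutoff, above_transformed)
```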

The ML estimation method uses the stats::factanal starting values. See also the EFA documentation.

The HULL function can also be called together with other factor retention criteria in the N_FACTORS function.

Value

A list of class `HULL` containing the following objects:

`n_fac_CAF`	The number of factors to retain according to the Hull method with the CAF.
`n_fac_CFI`	The number of factors to retain according to the Hull method with the CFI.
`n_fac_RMSEA`	The number of factors to retain according to the Hull method with the RMSEA.
`solutions_CAF`	A matrix containing the CAFs, degrees of freedom, and for the factors lying on the hull, the st values of the hull solution (see Lorenzo-Seva, Timmerman, and Kiers 2011 for details).
`solutions_CFI`	A matrix containing the CFIs, degrees of freedom, and for the factors lying on the hull, the st values of the hull solution (see Lorenzo-Seva, Timmerman, and Kiers 2011 for details).
`solutions_RMSEA`	A matrix containing the RMSEAs, degrees of freedom, and for the factors lying on the hull, the st values of the hull solution (see Lorenzo-Seva, Timmerman, and Kiers 2011 for details).
`n_fac_max`	The upper bound J of the number of factors to extract (see details).
`settings`	A list of the settings used.

Source

Lorenzo-Seva, U., Timmerman, M. E., & Kiers, H. A. (2011). The Hull method for selecting the number of common factors. Multivariate Behavioral Research, 46(2), 340-364.

Examples


# using PAF with gof set to "CAF" explicitly (if gof is not specified
# manually, a warning is thrown and CAF is used automatically)
HULL(test_models$baseline$cormat, N = 500, gof = "CAF")

# using ML with all available fit indices (CAF, CFI, and RMSEA)
HULL(test_models$baseline$cormat, N = 500, method = "ML")

# using ULS with only RMSEA
HULL(test_models$baseline$cormat, N = 500, method = "ULS", gof = "RMSEA")


## Not run: 
# using parallel processing (Note: plans can be adapted, see the future
# package for details)
future::plan(future::multisession)
HULL(test_models$baseline$cormat, N = 500, gof = "CAF")

## End(Not run)

mdsteiner/EFAdiff documentation built on June 13, 2025, 4:05 p.m.