EFA: Exploratory factor analysis (EFA)

EFA R Documentation

Description

This function performs an exploratory factor analysis (EFA) using principal axis factoring (PAF), maximum likelihood (ML), or unweighted least squares (ULS) estimation, with or without subsequent rotation. All arguments with default value NA can be left at their defaults if type is set to one of "EFAtools", "SPSS", or "psych". The respective settings are then chosen according to the specified type (see Details). For all rotations except varimax and promax, the GPArotation package is needed.

Usage

EFA(
  x,
  n_factors,
  N = NA,
  method = c("PAF", "ML", "ULS"),
  rotation = c("none", "varimax", "equamax", "quartimax", "geominT", "bentlerT",
    "bifactorT", "promax", "oblimin", "quartimin", "simplimax", "bentlerQ", "geominQ",
    "bifactorQ"),
  type = c("EFAtools", "psych", "SPSS", "none"),
  max_iter = NA,
  init_comm = NA,
  criterion = NA,
  criterion_type = NA,
  abs_eigen = NA,
  use = c("pairwise.complete.obs", "all.obs", "complete.obs", "everything",
    "na.or.complete"),
  varimax_type = NA,
  k = NA,
  normalize = TRUE,
  P_type = NA,
  precision = 1e-05,
  order_type = NA,
  start_method = "psych",
  cor_method = c("pearson", "spearman", "kendall"),
  ...
)

Arguments

x

data.frame or matrix. Data frame or matrix of raw data, or a matrix of correlations. If raw data is entered, the correlation matrix is computed from the data.

n_factors

numeric. Number of factors to extract.

N

numeric. The number of observations. Only needs to be specified if a correlation matrix is used. If the input is a correlation matrix and N = NA (default), not all fit indices can be computed.

method

character. One of "PAF", "ML", or "ULS" to use principal axis factoring, maximum likelihood, or unweighted least squares (also called minres), respectively, to fit the EFA.

rotation

character. Either perform no rotation ("none"; default), an orthogonal rotation ("varimax", "equamax", "quartimax", "geominT", "bentlerT", or "bifactorT"), or an oblique rotation ("promax", "oblimin", "quartimin", "simplimax", "bentlerQ", "geominQ", or "bifactorQ").

type

character. If one of "EFAtools" (default), "psych", or "SPSS" is used and the following arguments with default NA are left as NA, the settings are chosen according to the respective program ("psych" and "SPSS") or according to the best solution found in Grieder & Steiner (2020; "EFAtools"). Individual settings can be adapted by using one of the three types and additionally specifying some of the following arguments. If set to "none", additional arguments must be specified depending on the method and rotation used (see Details).

max_iter

numeric. The maximum number of iterations to perform, after which the iterative PAF procedure is halted with a warning. If type is one of "EFAtools", "SPSS", or "psych", this is set automatically if max_iter is left as NA, but can be overridden by entering a number. Default is NA.

init_comm

character. The method to estimate the initial communalities in PAF. "smc" will use squared multiple correlations, "mac" will use maximum absolute correlations, "unity" will use 1s (see details). Default is NA.

criterion

numeric. The convergence criterion used for PAF. If the change in communalities from one iteration to the next is smaller than this criterion, the solution is accepted and the procedure ends. Default is NA.

criterion_type

character. Type of convergence criterion used for PAF. "max_individual" selects the maximum change in any of the communalities from one iteration to the next and tests it against the specified criterion. This is also used by SPSS. "sum" takes the difference of the sum of all communalities in one iteration and the sum of all communalities in the next iteration and tests this against the criterion. This procedure is used by the psych::fa function. Default is NA.

abs_eigen

logical. Which algorithm to use in the PAF iterations. If FALSE, the loadings are computed from the eigenvalues. This is also used by the psych::fa function. If TRUE the loadings are computed with the absolute eigenvalues as done by SPSS. Default is NA.

use

character. Passed to stats::cor if raw data is given as input. Default is "pairwise.complete.obs".

varimax_type

character. The type of the varimax rotation performed. If "svd", singular value decomposition is used, as stats::varimax does. If "kaiser", the varimax procedure performed in SPSS is used. This is the original procedure from Kaiser (1958), but with slight alterations in the varimax criterion (see details, and Grieder & Steiner, 2020). Default is NA.

k

numeric. Either the power used for computing the target matrix P in the promax rotation or the number of 'close to zero loadings' for the simplimax rotation (see GPArotation::GPFoblq). If left to NA (default), the value for promax depends on the specified type. For simplimax, nrow(L), where L is the matrix of unrotated loadings, is used by default.

normalize

logical. If TRUE, a Kaiser normalization is performed before the specified rotation. Default is TRUE.

P_type

character. This specifies how the target matrix P is computed in promax rotation. If "unnorm" it will use the unnormalized target matrix as originally done in Hendrickson and White (1964). This is also used in the psych and stats packages. If "norm" it will use the normalized target matrix as used in SPSS. Default is NA.

precision

numeric. The tolerance for stopping in the rotation procedure. Default is 10^-5 for all rotation methods.

order_type

character. How to order the factors. "eigen" will reorder the factors in descending order of the eigenvalues of the matrix of rotated loadings. "ss_factors" will reorder the factors in descending order of the sum of squared factor loadings per factor. Default is NA.

start_method

character. How to specify the starting values for the optimization procedure for ML. Default is "psych" which takes the starting values specified in psych::fa. "factanal" takes the starting values specified in the stats::factanal function. Solutions are very similar.

cor_method

character. Passed to stats::cor. Default is "pearson".

...

Additional arguments passed to rotation functions from the GPArotation package (e.g., maxit for maximum number of iterations).

Details

There are two main ways to use this function. The easiest way is to use it with a specified type (see above), which sets most of the other arguments accordingly. Another way is to use it more flexibly by setting type to "none" and explicitly specifying all arguments used (see Examples). The two can also be mixed by specifying a type as well as additional arguments. However, this will throw warnings to avoid unintentional deviations from the implementation corresponding to the specified type.

The type argument is evaluated for PAF and for all rotations (mainly important for the varimax and promax rotations). The type-specific settings for these functions are detailed below.

For PAF, the values of init_comm, criterion, criterion_type, and abs_eigen depend on the type argument.

type = "EFAtools" will use the following argument specification: init_comm = "smc", criterion = .001, criterion_type = "sum", abs_eigen = TRUE.

type = "psych" will use the following argument specification: init_comm = "smc", criterion = .001, criterion_type = "sum", abs_eigen = FALSE.

type = "SPSS" will use the following argument specification: init_comm = "smc", criterion = .001, criterion_type = "max_individual", abs_eigen = TRUE.
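The roles of criterion, criterion_type, and abs_eigen can be illustrated with a minimal base R sketch of the PAF iteration. This is not the EFAtools implementation: paf_sketch is a hypothetical helper written only for illustration, it always starts from SMCs, and with abs_eigen = FALSE it simply takes the square root of the retained eigenvalues (which would yield NaN for negative ones).

```r
# Hypothetical helper, for illustration only (not part of EFAtools).
paf_sketch <- function(R, n_factors, criterion = 0.001,
                       criterion_type = c("sum", "max_individual"),
                       abs_eigen = TRUE, max_iter = 300) {
  criterion_type <- match.arg(criterion_type)
  h2 <- 1 - 1 / diag(solve(R))        # initial communalities: SMCs
  converged <- FALSE
  for (iter in seq_len(max_iter)) {
    R_red <- R
    diag(R_red) <- h2                 # reduced correlation matrix
    eig <- eigen(R_red, symmetric = TRUE)
    vals <- eig$values[seq_len(n_factors)]
    vecs <- eig$vectors[, seq_len(n_factors), drop = FALSE]
    if (abs_eigen) vals <- abs(vals)  # absolute eigenvalues
    L <- vecs %*% diag(sqrt(vals), nrow = n_factors)
    h2_new <- rowSums(L^2)            # updated communalities
    delta <- switch(criterion_type,
      # change in the sum of all communalities
      sum = abs(sum(h2_new) - sum(h2)),
      # largest change in any single communality
      max_individual = max(abs(h2_new - h2)))
    h2 <- h2_new
    if (delta < criterion) {
      converged <- TRUE
      break
    }
  }
  list(loadings = L, h2 = h2, iter = iter, converged = converged)
}

# Toy correlation matrix implied by a single factor (made-up loadings)
lambda <- c(0.8, 0.7, 0.6, 0.5)
R <- tcrossprod(lambda)
diag(R) <- 1

fit_sum <- paf_sketch(R, n_factors = 1, criterion_type = "sum")
fit_max <- paf_sketch(R, n_factors = 1, criterion_type = "max_individual")
fit_tight <- paf_sketch(R, n_factors = 1, criterion = 1e-8,
                        criterion_type = "max_individual", max_iter = 5000)
```

In this toy case the final communalities in fit_tight$h2 should approach lambda^2, and the two criterion types may stop after a different number of iterations.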

If SMCs fail, SPSS takes "mac". However, as SPSS takes absolute eigenvalues, this is hardly ever the case. Psych, on the other hand, takes "unity" if SMCs fail, but uses the Moore-Penrose pseudoinverse of a matrix. Thus, taking "unity" is only necessary if negative eigenvalues occur afterwards in the iterative PAF procedure. The EFAtools type setting combination was the best in terms of accuracy and number of Heywood cases compared to all the other setting combinations tested in simulation studies in Grieder & Steiner (2020), which is why this type is used as the default here.
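The three options for the initial communalities can be sketched in base R as follows. init_comm_sketch is a hypothetical helper, and the reading of "mac" as the largest absolute correlation of each variable with any other variable is an assumption based on the argument description. The fallback behaviour described above is not modelled.

```r
# Hypothetical helper, for illustration only (not part of EFAtools).
init_comm_sketch <- function(R, init_comm = c("smc", "mac", "unity")) {
  init_comm <- match.arg(init_comm)
  switch(init_comm,
    # squared multiple correlations via the inverse of R
    smc = 1 - 1 / diag(solve(R)),
    # maximum absolute off-diagonal correlation per variable
    mac = {
      R_abs <- abs(R)
      diag(R_abs) <- 0
      apply(R_abs, 1, max)
    },
    # unities
    unity = rep(1, ncol(R)))
}

# Toy 2 x 2 correlation matrix with r = .5 (made-up value)
R2 <- matrix(c(1, 0.5, 0.5, 1), ncol = 2)
h2_smc <- init_comm_sketch(R2, "smc")
h2_mac <- init_comm_sketch(R2, "mac")
h2_unity <- init_comm_sketch(R2, "unity")
```

With only two variables, the SMC of each variable equals the squared correlation, which makes the result easy to check by hand.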

For varimax, the values of varimax_type and order_type depend on the type argument.

type = "EFAtools" will use the following argument specification: varimax_type = "kaiser", order_type = "eigen".

type = "psych" will use the following argument specification: varimax_type = "svd", order_type = "eigen".

type = "SPSS" will use the following argument specification: varimax_type = "kaiser", order_type = "ss_factors".

For promax, the values of P_type, order_type, and k depend on the type argument.

type = "EFAtools" will use the following argument specification: P_type = "norm", order_type = "eigen", k = 4.

type = "psych" will use the following argument specification: P_type = "unnorm", order_type = "eigen", k = 4.

type = "SPSS" will use the following argument specification: P_type = "norm", order_type = "ss_factors", k = 4.

The P_type argument can take two values, "unnorm" and "norm". It controls which formula is used to compute the target matrix P in the promax rotation. "unnorm" uses the formula from Hendrickson and White (1964), specifically: P = abs(A^(k + 1)) / A, where A is the unnormalized matrix containing varimax rotated loadings. "norm" uses the normalized varimax rotated loadings, as done in SPSS. Specifically, it uses the following formula, which can be found in the SPSS 23 and SPSS 27 Algorithms manuals: P = abs(A / sqrt(rowSums(A^2))) ^(k + 1) * (sqrt(rowSums(A^2)) / A). As for PAF, the EFAtools type setting combination for promax was the best compared to the other setting combinations tested in simulation studies in Grieder & Steiner (2020).
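The two target matrix formulas can be written out directly in base R. The following is a minimal sketch, where promax_target_sketch and the loading matrix A are made up for illustration. Note that both formulas divide by A, so a loading of exactly zero would produce NaN in this sketch.

```r
# Hypothetical helper, for illustration only (not part of EFAtools).
promax_target_sketch <- function(A, k = 4, P_type = c("unnorm", "norm")) {
  P_type <- match.arg(P_type)
  if (P_type == "unnorm") {
    # Hendrickson and White (1964): unnormalized varimax loadings
    abs(A^(k + 1)) / A
  } else {
    # row-normalized varimax loadings, as in the SPSS formula
    h <- sqrt(rowSums(A^2))
    abs(A / h)^(k + 1) * (h / A)
  }
}

# Made-up varimax rotated loadings (4 indicators, 2 factors)
A <- matrix(c(0.8, 0.7, 0.1, 0.2,
              0.1, 0.2, 0.7, 0.6), ncol = 2)

P_unnorm <- promax_target_sketch(A, k = 4, P_type = "unnorm")
P_norm <- promax_target_sketch(A, k = 4, P_type = "norm")
```

Both variants keep the sign of each loading and raise its absolute value to the power k. They differ only in whether the loadings are first divided by the row norms.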

The varimax_type argument can take two values, "svd" and "kaiser". "svd" uses singular value decomposition, by calling stats::varimax. "kaiser" performs the varimax procedure as described in the SPSS 23 Algorithms manual and as described by Kaiser (1958). However, there is a slight alteration in computing the varimax criterion, which we found to better align with the results obtained from SPSS. Specifically, the original varimax criterion as described in the SPSS 23 Algorithms manual is sum(n*colSums(lambda ^ 4) - colSums(lambda ^ 2) ^ 2) / n ^ 2, where n is the number of indicators, and lambda is the rotated loadings matrix. However, we found the following to produce results more similar to those of SPSS: sum(n*colSums(abs(lambda)) - colSums(lambda ^ 4) ^ 2) / n^2.
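The two criterion formulas quoted above can be wrapped in small base R functions for comparison. This sketch only evaluates the criteria for a given loadings matrix and does not perform the rotation itself. The function names and the toy matrices are made up for illustration.

```r
# Hypothetical helpers, for illustration only (not part of EFAtools).

# Criterion as given in the SPSS 23 Algorithms manual
varimax_crit_original <- function(lambda) {
  n <- nrow(lambda)
  sum(n * colSums(lambda^4) - colSums(lambda^2)^2) / n^2
}

# Altered criterion, as quoted in the text above
varimax_crit_altered <- function(lambda) {
  n <- nrow(lambda)
  sum(n * colSums(abs(lambda)) - colSums(lambda^4)^2) / n^2
}

# Perfect simple structure versus the same loadings rotated by 45 degrees
simple <- diag(2)
rotated <- matrix(c(1, 1, 1, -1), ncol = 2) / sqrt(2)

crit_simple <- varimax_crit_original(simple)
crit_rotated <- varimax_crit_original(rotated)
crit_simple_alt <- varimax_crit_altered(simple)
```

For these toy matrices, the original criterion is larger for the simple structure than for the rotated loadings, which is the behaviour a varimax rotation exploits.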

For all rotations other than varimax and promax, the type argument only controls the order_type argument, with the same values as stated above for the varimax and promax rotations. For these other rotations, the GPArotation package is needed. Additional arguments can also be specified and will be passed to the respective GPArotation function (e.g., maxit to change the maximum number of iterations for the rotation procedure).
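As a sketch of passing an additional argument through to GPArotation, the call below adds maxit to an oblimin rotation. It assumes that the EFAtools and GPArotation packages are installed (the code is skipped otherwise) and reuses the test_models data from the Examples. The value 2000 is arbitrary.

```r
# Runs only if both packages are available
if (requireNamespace("EFAtools", quietly = TRUE) &&
    requireNamespace("GPArotation", quietly = TRUE)) {
  library(EFAtools)

  # maxit is not an argument of EFA() itself; it is passed on via ...
  fit_oblimin <- EFA(test_models$baseline$cormat, n_factors = 3, N = 500,
                     type = "EFAtools", method = "PAF",
                     rotation = "oblimin", maxit = 2000)

  fit_oblimin$rot_loadings  # rotated loadings (pattern matrix)
  fit_oblimin$Phi           # factor intercorrelations
}
```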

The type argument has no effect on ULS and ML. For ULS, no additional arguments are needed. For ML, an additional argument start_method is needed to determine the starting values for the optimization procedure. Default for this argument is "psych", which takes the starting values specified in psych::fa. Alternatively, "factanal" takes the starting values specified in the stats::factanal function.

Value

A list of class EFA containing (a subset of) the following:

orig_R

Original correlation matrix.

h2_init

Initial communality estimates from PAF.

h2

Final communality estimates from the unrotated solution.

orig_eigen

Eigenvalues of the original correlation matrix.

init_eigen

Initial eigenvalues, obtained from the correlation matrix with the initial communality estimates as diagonal in PAF.

final_eigen

Eigenvalues obtained from the correlation matrix with the final communality estimates as diagonal.

iter

The number of iterations needed for convergence.

convergence

Integer code for convergence as returned by stats::optim (only for ML and ULS). 0 indicates successful completion.

unrot_loadings

Loading matrix containing the final unrotated loadings.

vars_accounted

Matrix of explained variances and sums of squared loadings. Based on the unrotated loadings.

fit_indices

For ML and ULS: Fit indices derived from the unrotated factor loadings: Chi Square, including significance level, degrees of freedom (df), Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA), including its 90% confidence interval, Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and the common part accounted for (CAF) index as proposed by Lorenzo-Seva, Timmerman, & Kiers (2011). For PAF, only the CAF and the degrees of freedom are returned.

rot_loadings

Loading matrix containing the final rotated loadings (pattern matrix).

Phi

The factor intercorrelations (only for oblique rotations).

Structure

The structure matrix (only for oblique rotations).

rotmat

The rotation matrix.

vars_accounted_rot

Matrix of explained variances and sums of squared loadings. Based on rotated loadings and, for oblique rotations, the factor intercorrelations.

settings

A list of the settings used.
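For oblique rotations, the pattern matrix, the factor intercorrelations, and the structure matrix are related in the standard way: the structure matrix is the pattern matrix post-multiplied by Phi. The base R sketch below illustrates this with made-up values. That the Structure element returned by EFA follows exactly this relation is an assumption here, not something stated above.

```r
# Made-up pattern loadings (4 indicators, 2 factors)
pattern <- matrix(c(0.7, 0.6, 0.0, 0.1,
                    0.0, 0.1, 0.6, 0.5), ncol = 2)

# Made-up factor intercorrelations
Phi <- matrix(c(1.0, 0.3,
                0.3, 1.0), ncol = 2)

# Structure matrix: correlations between indicators and factors
structure_mat <- pattern %*% Phi
```

With an orthogonal rotation, Phi would be the identity matrix and the pattern and structure matrices would coincide.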

Source

Grieder, S., & Steiner, M.D. (2020). Algorithmic Jingle Jungle: A Comparison of Implementations of Principal Axis Factoring and Promax Rotation in R and SPSS. Manuscript in Preparation.

Hendrickson, A. E., & White, P. O. (1964). Promax: A quick method for rotation to oblique simple structure. British Journal of Statistical Psychology, 17, 65–70. doi: 10.1111/j.2044-8317.1964.tb00244.x

Lorenzo-Seva, U., Timmerman, M. E., & Kiers, H. A. L. (2011). The Hull method for selecting the number of common factors. Multivariate Behavioral Research, 46, 340–364. doi: 10.1080/00273171.2011.564527

Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23, 187–200. doi: 10.1007/BF02289233

Examples

# A type EFAtools (as presented in Grieder and Steiner, 2020) EFA
EFAtools_PAF <- EFA(test_models$baseline$cormat, n_factors = 3, N = 500,
                    type = "EFAtools", method = "PAF", rotation = "none")

# A type SPSS EFA to mimic the SPSS implementation (this will throw a warning,
# see below)
SPSS_PAF <- EFA(test_models$baseline$cormat, n_factors = 3, N = 500,
                type = "SPSS", method = "PAF", rotation = "none")

# A type psych EFA to mimic the psych::fa() implementation
psych_PAF <- EFA(test_models$baseline$cormat, n_factors = 3, N = 500,
                 type = "psych", method = "PAF", rotation = "none")

# Use ML instead of PAF with type EFAtools
EFAtools_ML <- EFA(test_models$baseline$cormat, n_factors = 3, N = 500,
                   type = "EFAtools", method = "ML", rotation = "none")

# Use oblimin rotation instead of no rotation with type EFAtools
EFAtools_oblim <- EFA(test_models$baseline$cormat, n_factors = 3, N = 500,
                      type = "EFAtools", method = "PAF", rotation = "oblimin")

# Do a PAF without rotation without specifying a type, so the arguments
# can be flexibly specified (this is only recommended if you know what you're
# doing)
PAF_none <- EFA(test_models$baseline$cormat, n_factors = 3, N = 500,
                type = "none", method = "PAF", rotation = "none",
                max_iter = 500, init_comm = "mac", criterion = 1e-4,
                criterion_type = "sum", abs_eigen = FALSE)

# Add a promax rotation
PAF_pro <- EFA(test_models$baseline$cormat, n_factors = 3, N = 500,
               type = "none", method = "PAF", rotation = "promax",
               max_iter = 500, init_comm = "mac", criterion = 1e-4,
               criterion_type = "sum", abs_eigen = FALSE, k = 3,
               P_type = "unnorm", precision = 1e-5, order_type = "eigen",
               varimax_type = "svd")


EFAtools documentation built on Jan. 6, 2023, 5:16 p.m.