hmda.efa: Perform Exploratory Factor Analysis with HMDA


hmda.efa    R Documentation

Perform Exploratory Factor Analysis with HMDA

Description

Performs exploratory factor analysis (EFA) on a specified set of features from a data frame using the psych package. The function optionally runs parallel analysis to recommend the number of factors, applies a rotation method, reverses specified features, and cleans up factor loadings by zeroing out values below a threshold. It then computes factor scores and reliability estimates, and finally returns a list containing the EFA results, cleaned loadings, reliability metrics, and factor correlations.

Usage

hmda.efa(
  df,
  features,
  algorithm = "minres",
  rotation = "promax",
  parallel.analysis = TRUE,
  nfactors = NULL,
  dict = dictionary(df, attribute = "label"),
  minimum_loadings = 0.3,
  exclude_features = NULL,
  ignore_binary = TRUE,
  intercorrelation = 0.3,
  reverse_features = NULL,
  plot = FALSE,
  factor_names = NULL,
  verbose = TRUE
)

Arguments

df

A data frame containing the items for EFA.

features

A vector of feature names (or indices) in df to include in the factor analysis.

algorithm

Character. The factor extraction method to use. Default is "minres". Other methods supported by psych (e.g., "ml", "minchi") may also be used.

rotation

Character. The rotation method to apply to the factor solution. Default is "promax".

parallel.analysis

Logical. If TRUE, runs parallel analysis using psych::fa.parallel to recommend the number of factors. Default is TRUE.

nfactors

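Integer. The number of factors to extract. If NULL and parallel.analysis = TRUE, the number of factors recommended by the parallel analysis is used. Default is NULL, so the number of factors must come either from nfactors or from the parallel analysis.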

dict

A data frame dictionary with at least two columns: "name" and "description". Used to replace feature names with human-readable labels. Default is dictionary(df, attribute = "label").

minimum_loadings

Numeric. Any factor loading with an absolute value lower than this threshold is set to zero. Default is 0.30.

exclude_features

Character vector. Features to exclude from the analysis. Default is NULL.

ignore_binary

Logical. If TRUE, binary items may be ignored in the analysis. Default is TRUE.

intercorrelation

Numeric. (Unused in current version) Intended to set a minimum intercorrelation threshold between items. Default is 0.3.

reverse_features

A vector of feature names for which the scoring should be reversed prior to analysis. Default is NULL.

plot

Logical. If TRUE, a factor diagram is plotted using psych::fa.diagram. Default is FALSE.

factor_names

Character vector. Optional names to assign to the extracted factors (i.e., new column names for loadings). Default is NULL.

verbose

Logical. If TRUE, the factor loadings are printed to the console. Default is TRUE.
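
The effect of reverse_features and minimum_loadings can be illustrated with a minimal base R sketch. The helper names reverse_item() and zero_small_loadings() are hypothetical and are not part of HMDA. The reversal formula (standardize, then flip the sign) is an assumption about what "reverse and standardize" means here, not the package's own code.

```r
# Hypothetical helpers; names and exact formulas are assumptions,
# not the HMDA implementation.

reverse_item <- function(x) {
  # Standardize to mean 0 and sd 1, then flip the sign so that
  # high original scores become low ones.
  -as.numeric(scale(x))
}

zero_small_loadings <- function(loadings, minimum_loadings = 0.3) {
  # Set every loading whose absolute value is below the threshold to zero.
  loadings[abs(loadings) < minimum_loadings] <- 0
  loadings
}

x <- c(1, 2, 3, 4, 5)
x_reversed <- reverse_item(x)

# Toy loading matrix (made-up values, for illustration only).
L <- matrix(
  c(0.72, 0.10, -0.25, 0.65),
  nrow = 2,
  dimnames = list(c("item1", "item2"), c("F1", "F2"))
)
L_clean <- zero_small_loadings(L)
print(L_clean)
```

In the toy matrix, the loadings 0.10 and -0.25 fall below the default threshold of 0.3 in absolute value and are set to zero, while 0.72 and 0.65 are kept.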

Details

This function first checks that the number of factors is either provided or determined via parallel analysis (if parallel.analysis is TRUE). A helper function trans() is defined to reverse and standardize item scores for features specified in reverse_features. Unwanted features can be excluded via exclude_features. The EFA is then performed using psych::fa() with the chosen extraction algorithm and rotation method. Loadings are cleaned by zeroing out values below the minimum_loadings threshold, rounded, and sorted. Factor scores are computed with psych::factor.scores() and reliability is estimated using the omega() function. Finally, factor correlations are extracted from the EFA object.
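
The main steps described above can be approximated with direct calls to the psych package. This is a sketch under stated assumptions, not the source of hmda.efa(): the variable names are illustrative, the thresholds are the documented defaults, and the factor score and reliability steps are left out.

```r
# Sketch only: mirrors the documented steps with direct psych calls.
# It is not the HMDA source code.
if (requireNamespace("psych", quietly = TRUE)) {
  data("USJudgeRatings")
  items <- USJudgeRatings[, -1]   # drop the first column (CONT)

  # 1. Parallel analysis to suggest the number of factors.
  pa <- psych::fa.parallel(items, fm = "minres", fa = "fa", plot = FALSE)
  k <- max(1, pa$nfact)           # guard against a suggestion of 0 factors

  # 2. Extraction and rotation.
  efa <- psych::fa(items, nfactors = k, fm = "minres", rotate = "promax")

  # 3. Clean the loadings: zero out small values, then round.
  L <- unclass(efa$loadings)
  L[abs(L) < 0.3] <- 0
  L <- round(L, 2)

  # 4. Factor correlations; Phi is only present for oblique solutions
  #    with more than one factor.
  phi <- if (!is.null(efa$Phi)) round(efa$Phi, 2) else NULL

  print(L)
  print(phi)
}
```

With a single extracted factor there is nothing to rotate, so the factor correlation matrix may be absent, which is why the sketch checks for NULL.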

Value

A list with the following components:

parallel.analysis

The output from the parallel analysis, if run.

efa

The full exploratory factor analysis object returned by psych::fa.

efa_loadings

A matrix of factor loadings after zeroing out values below the minimum_loadings threshold, rounded and sorted.

efa_reliability

The reliability results (omega) computed from the factor scores.

factor_correlations

A matrix of factor correlations, rounded to 2 decimal places.

Author(s)

E. F. Haghish

Examples

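  # Example: assess feature suitability for EFA with check_efa(), using the
  # USJudgeRatings dataset, before running hmda.efa().
  # This dataset contains lawyers' ratings of state judges in the US Superior Court.
  # The first column (CONT, number of contacts of lawyer with judge) is dropped.
  # Here, we check whether the remaining rating variables are suitable for EFA.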
  data("USJudgeRatings")
  features_to_check <- colnames(USJudgeRatings[,-1])
  result <- check_efa(
    df = USJudgeRatings,
    features = features_to_check,
    min_unique = 3,
    verbose = TRUE
  )

  # TRUE indicates the features are suitable.
  print(result)
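
A call to hmda.efa() itself might look like the following sketch. It uses only the arguments listed in the Usage section. The value nfactors = 2 is an assumption chosen for illustration. Whether the default dict works for a dataset without label attributes is not stated on this page, so the call is wrapped in tryCatch().

```r
# Sketch only: requires the HMDA package; arguments follow the Usage section.
res <- NULL
if (requireNamespace("HMDA", quietly = TRUE)) {
  data("USJudgeRatings")
  features_to_check <- colnames(USJudgeRatings[, -1])

  res <- tryCatch(
    HMDA::hmda.efa(
      df = USJudgeRatings,
      features = features_to_check,
      nfactors = 2,               # assumed value, for illustration only
      parallel.analysis = FALSE,  # skipped because nfactors is given
      plot = FALSE,
      verbose = FALSE
    ),
    error = function(e) {
      message("hmda.efa() failed: ", conditionMessage(e))
      NULL
    }
  )

  if (!is.null(res)) {
    # Components documented in the Value section.
    print(res$efa_loadings)
    print(res$factor_correlations)
  }
}
```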


HMDA documentation built on April 4, 2025, 6:06 a.m.