check_efa: Check Exploratory Factor Analysis Suitability

Description

Checks whether the specified features in a data frame meet the criteria for exploratory factor analysis (EFA). The function verifies that each feature exists, is numeric, has sufficient variability, and does not have an excessive proportion of missing values. When more than one feature is supplied, it also checks whether the correlation matrix is full rank and whether each feature is sufficiently correlated with the others.

Usage

check_efa(
  df,
  features,
  min_unique = 5,
  min_intercorrelation = 0.3,
  verbose = FALSE
)

Arguments

df

A dataframe containing the features.

features

A character vector of feature names to be evaluated.

min_unique

An integer specifying the minimum number of unique non-missing values required for a feature. Default is 5.

min_intercorrelation

A numeric threshold for the minimum acceptable intercorrelation among features. Default is 0.3. Note: this parameter is not used explicitly in the current implementation; the intercorrelation check described under Details uses a cutoff of 0.4.

verbose

Logical; if TRUE, a confirmation message is printed when all features appear suitable. Default is FALSE.

Details

The function performs several checks:

Existence

Verifies that each feature in features is present in df.

Numeric Type

Checks that each feature is numeric.

Variability

Ensures that each feature has at least min_unique unique non-missing values.

Missing Values

Flags features with more than 20% missing values.
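
The four per-feature checks above can be sketched in base R as follows. This is an illustrative sketch only, not the HMDA implementation: the helper name check_feature_sketch, its max_missing argument, and the wording of the messages are assumptions introduced for this example.

```r
# Hypothetical helper (not part of HMDA): returns a character vector of
# issues found for one feature, or character(0) if none were found.
check_feature_sketch <- function(df, feature, min_unique = 5, max_missing = 0.2) {
  # Existence: the feature must be a column of df.
  if (!feature %in% names(df)) {
    return(sprintf("'%s' is not a column of df", feature))
  }
  x <- df[[feature]]

  # Numeric type: the remaining checks only make sense for numeric columns.
  if (!is.numeric(x)) {
    return(sprintf("'%s' is not numeric", feature))
  }

  issues <- character(0)

  # Variability: count distinct non-missing values.
  n_unique <- length(unique(x[!is.na(x)]))
  if (n_unique < min_unique) {
    issues <- c(issues, sprintf("'%s' has only %d unique non-missing values",
                                feature, n_unique))
  }

  # Missing values: flag when the proportion exceeds max_missing (20% here).
  prop_missing <- mean(is.na(x))
  if (prop_missing > max_missing) {
    issues <- c(issues, sprintf("'%s' has %.0f%% missing values",
                                feature, 100 * prop_missing))
  }
  issues
}

# Toy data with one clean feature and several problematic ones.
toy <- data.frame(
  ok     = c(1.2, 2.5, 3.1, 4.8, 5.0, 6.3, 7.7, 8.4, 9.9, 10.1),
  binary = rep(c(0, 1), 5),                       # too few unique values
  sparse = c(1, 2, 3, 4, 5, 6, 7, NA, NA, NA),    # 30% missing
  label  = letters[1:10]                          # not numeric
)
issues <- unlist(lapply(c("ok", "binary", "sparse", "label", "absent"),
                        check_feature_sketch, df = toy))
print(issues)
```

In this sketch a missing or non-numeric feature returns early, since the variability and missingness checks do not apply to it; whether check_efa does the same is not stated in this documentation.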

If more than one feature is provided, the function computes the correlation matrix (using pairwise complete observations) and checks:

Full Rank

Whether the correlation matrix is full rank. A rank lower than the number of features indicates redundancy.

Intercorrelations

Identifies features that have no correlation of at least 0.4 with any other feature.
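
The two correlation-matrix checks can be illustrated with base R alone. The sketch below is not the HMDA implementation: the helper name check_correlation_sketch is hypothetical, computing the rank with qr() is an assumption, and so is comparing absolute correlations against the cutoff (this documentation does not say whether the sign is taken into account).

```r
# Hypothetical helper (not part of HMDA).
check_correlation_sketch <- function(df, features, threshold = 0.4) {
  # Correlation matrix from pairwise complete observations.
  r <- cor(df[, features, drop = FALSE], use = "pairwise.complete.obs")

  # Full rank: a rank below the number of features signals redundancy.
  # Assumption: rank is taken from a QR decomposition.
  full_rank <- qr(r)$rank == length(features)

  # Intercorrelations: ignore the diagonal, then flag features whose
  # largest absolute correlation with any other feature is below threshold.
  # Assumption: absolute values are compared.
  off_diag <- abs(r)
  diag(off_diag) <- 0
  weak <- features[apply(off_diag, 1, max, na.rm = TRUE) < threshold]

  list(full_rank = full_rank, weakly_correlated = weak)
}

# Toy data: x3 is an exact linear combination of x1 and x2 (redundant),
# and x4 is constructed to be uncorrelated with the other features.
toy <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 6, 7, 8),
  x2 = c(2, 1, 4, 3, 6, 5, 8, 7),
  x4 = c(1, -1, -1, 1, 1, -1, -1, 1)
)
toy$x3 <- toy$x1 + toy$x2

res <- check_correlation_sketch(toy, c("x1", "x2", "x3", "x4"))
print(res$full_rank)          # FALSE, because x3 is redundant
print(res$weakly_correlated)  # "x4"
```

Dropping x3 from the feature list restores full rank in this toy example, while x4 remains flagged as weakly correlated.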

Value

TRUE if all features are deemed suitable for EFA, and FALSE otherwise. In the latter case, messages detailing the issues are printed.

Author(s)

E. F. Haghish

Examples

  # Example: assess feature suitability for EFA using the USJudgeRatings dataset.
  # This dataset contains lawyers' ratings of state judges in the US Superior Court.
  # Here, we check whether these rating variables are suitable for EFA.
  data("USJudgeRatings")
  # Drop the first column (CONT, the number of contacts of lawyer with judge).
  features_to_check <- colnames(USJudgeRatings[, -1])
  result <- check_efa(
    df = USJudgeRatings,
    features = features_to_check,
    min_unique = 3,
    verbose = TRUE
  )

  # TRUE indicates the features are suitable.
  print(result)


HMDA documentation built on April 4, 2025, 6:06 a.m.