CD | R Documentation |
Factor retention method introduced by Ruscio and Roche (2012). The code was adapted from the CD code by Auerswald and Moshagen (2017) available at https://osf.io/x5cz2/?view_only=d03efba1fd0f4c849a87db82e6705668
CD(
  x,
  n_factors_max = NA,
  N_pop = 10000,
  N_samples = 500,
  alpha = 0.3,
  use = c("pairwise.complete.obs", "all.obs", "complete.obs", "everything", "na.or.complete"),
  cor_method = c("pearson", "spearman", "kendall"),
  max_iter = 50
)
x | data.frame or matrix. Dataframe or matrix of raw data.
n_factors_max | numeric. The maximum number of factors to test against. Larger numbers will increase the duration the procedure takes, but test more possible solutions. If left NA (default) the maximum number of factors for which the model is still over-identified (df > 0) is used.
N_pop | numeric. Size of finite populations of comparison data. Default is 10000.
N_samples | numeric. Number of samples drawn from each population. Default is 500.
alpha | numeric. The alpha level used to test the significance of the improvement added by an additional factor. Default is .30.
use | character. Passed to
cor_method | character. Passed to
max_iter | numeric. The maximum number of iterations to perform after which the iterative PAF procedure is halted. Default is 50.
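The default for n_factors_max refers to the largest number of factors for which the model is still over-identified. As a rough illustration, the sketch below assumes the usual degrees-of-freedom formula for an exploratory factor model with p indicators and k factors, df = ((p - k)^2 - (p + k)) / 2. The helper name max_factors_overidentified is hypothetical and not part of the package, and the package's internal computation may differ.

```r
# Hypothetical helper (not part of the package): largest number of factors k
# for which an EFA model with p indicators still has df > 0, assuming
# df = ((p - k)^2 - (p + k)) / 2.
max_factors_overidentified <- function(p) {
  k <- seq_len(p)
  df <- ((p - k)^2 - (p + k)) / 2
  if (!any(df > 0)) {
    return(0L)  # no over-identified model exists for this few indicators
  }
  max(k[df > 0])
}

max_factors_overidentified(8)
max_factors_overidentified(6)
```

Under this assumption, adding indicators raises the upper bound only slowly, so leaving n_factors_max at NA should rarely test more factors than the data could support.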
"Parallel analysis (PA) is an effective stopping rule that compares the eigenvalues of randomly generated data with those for the actual data. PA takes into account sampling error, and at present it is widely considered the best available method. We introduce a variant of PA that goes even further by reproducing the observed correlation matrix rather than generating random data. Comparison data (CD) with known factorial structure are first generated using 1 factor, and then the number of factors is increased until the reproduction of the observed eigenvalues fails to improve significantly" (Ruscio & Roche, 2012, p. 282).
The CD implementation here is based on the code by Ruscio and Roche (2012), but is slightly adapted to increase speed by performing the principal axis factoring using a C++ based function.
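The logic of the procedure can be sketched in base R. This is a deliberately simplified illustration and not the package's implementation: it generates normally distributed comparison data and does not attempt to reproduce the observed score distributions, it uses a plain-R iterative principal axis factoring routine instead of the C++ function, it assumes a one-sided Mann-Whitney U test as the significance test, and it uses smaller N_pop and N_samples values than the defaults to keep it fast. The toy data and the helper names paf_loadings and rmse_for_k are hypothetical.

```r
set.seed(123)

# Toy data (hypothetical): 300 cases, 8 indicators, 2 uncorrelated factors
n <- 300
p <- 8
true_load <- cbind(c(rep(.7, 4), rep(0, 4)), c(rep(0, 4), rep(.7, 4)))
x <- matrix(rnorm(n * 2), n, 2) %*% t(true_load) +
  matrix(rnorm(n * p), n, p) * sqrt(1 - .7^2)

# Simplified iterative principal axis factoring; returns a p x k loading matrix
paf_loadings <- function(R, k, max_iter = 50) {
  h2 <- 1 - 1 / diag(solve(R))  # squared multiple correlations as start values
  for (i in seq_len(max_iter)) {
    R_red <- R
    diag(R_red) <- h2           # reduced correlation matrix
    e <- eigen(R_red, symmetric = TRUE)
    L <- e$vectors[, 1:k, drop = FALSE] %*%
      diag(sqrt(pmax(e$values[1:k], 0)), k)
    h2_new <- pmin(rowSums(L^2), 1)  # cap communalities at 1
    converged <- max(abs(h2_new - h2)) < 1e-4
    h2 <- h2_new
    if (converged) break
  }
  L
}

# RMSEs between the observed eigenvalues and the eigenvalues of samples drawn
# from a finite comparison population generated from a k-factor model
rmse_for_k <- function(x, k, N_pop = 2000, N_samples = 50) {
  R <- cor(x)
  obs_eigen <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
  L <- paf_loadings(R, k)
  u <- pmax(1 - rowSums(L^2), 0)  # uniquenesses
  pop <- matrix(rnorm(N_pop * k), N_pop, k) %*% t(L) +
    sweep(matrix(rnorm(N_pop * ncol(x)), N_pop, ncol(x)), 2, sqrt(u), "*")
  replicate(N_samples, {
    s <- pop[sample.int(N_pop, nrow(x)), , drop = FALSE]
    s_eigen <- eigen(cor(s), symmetric = TRUE, only.values = TRUE)$values
    sqrt(mean((s_eigen - obs_eigen)^2))
  })
}

# Add factors until the RMSEs no longer improve significantly
alpha <- 0.30
k_max <- 4
n_factors <- 1
rmse_prev <- rmse_for_k(x, 1)
for (k in 2:k_max) {
  rmse_next <- rmse_for_k(x, k)
  # One-sided test: are the RMSEs with k factors smaller than with k - 1?
  p_val <- wilcox.test(rmse_next, rmse_prev, alternative = "less")$p.value
  if (p_val >= alpha) break
  n_factors <- k
  rmse_prev <- rmse_next
}
n_factors
```

The liberal alpha of .30 makes it easier to accept an additional factor than a conventional .05 level would, which is why the stopping rule compares the p-value against alpha rather than a stricter threshold.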
Note that if the data contains missing values, these will be removed for the comparison data procedure using stats::na.omit. If missing data should be treated differently, e.g., by imputation, do this outside CD and then pass the complete data.
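As a small illustration of the two options, the sketch below first shows what listwise deletion with stats::na.omit does to a data frame, and then fills the gaps beforehand instead. The data frame raw is hypothetical, and simple column-mean imputation is used purely as a placeholder for whatever imputation method is appropriate; the final CD call is commented out because it requires the package.

```r
# Hypothetical raw data with missing values
raw <- data.frame(
  a = c(1, 2, NA, 4, 5),
  b = c(2, NA, 1, 3, 4),
  c = c(5, 3, 2, 1, 4)
)

# Listwise deletion: every row with at least one NA is dropped
complete <- stats::na.omit(raw)
nrow(complete)

# Alternative: impute outside CD (column means as a placeholder method)
imputed <- as.data.frame(lapply(raw, function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
}))
anyNA(imputed)

# The complete data could then be passed on:
# CD(imputed)
```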
The CD function can also be called together with other factor retention criteria in the N_FACTORS function.
A list of class CD containing
n_factors | The number of factors to retain according to comparison data results.
eigenvalues | A vector containing the eigenvalues of the entered data.
RMSE_eigenvalues | A matrix containing the RMSEs between the eigenvalues of the generated data and those of the entered data.
settings | A list of the settings used.
Auerswald, M., & Moshagen, M. (2019). How to determine the number of factors to retain in exploratory factor analysis: A comparison of extraction methods under realistic conditions. Psychological Methods, 24(4), 468–491. https://doi.org/10.1037/met0000200
Ruscio, J., & Roche, B. (2012). Determining the number of factors to retain in an exploratory factor analysis using comparison data of known factorial structure. Psychological Assessment, 24, 282–292. https://doi.org/10.1037/a0025697
Other factor retention criteria: EKC, HULL, KGC, PARALLEL, SMT
N_FACTORS as a wrapper function for this and all the above-mentioned factor retention criteria.
# determine n factors of the GRiPS
CD(GRiPS_raw)
# determine n factors of the DOSPERT risk subscale
CD(DOSPERT_raw)