| experiment_cor_vs_vif | R Documentation |
A dataframe summarizing 10,000 experiments comparing the output of cor_select() and vif_select(). Each row records the input sampling parameters and the resulting feature-selection metrics.
data(experiment_cor_vs_vif)
A dataframe with 10,000 rows and 6 variables:
Number of rows in the input data subset.
Number of predictors in the input data subset.
Number of predictors selected by vif_select() at the best-matching max_vif.
Maximum allowed pairwise correlation supplied to cor_select().
VIF threshold at which vif_select() produced the highest Jaccard similarity with cor_select() for the given max_cor.
Jaccard similarity between the predictors selected by cor_select() and vif_select().
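As a point of reference for the last column, the Jaccard similarity of two selections is the number of shared predictors divided by the number of distinct predictors across both selections. The following is a minimal base R sketch, not the package's own implementation; the function name `jaccard_similarity()` and the example predictor names are hypothetical.

```r
# Jaccard similarity between two sets of predictor names:
# size of the intersection divided by size of the union.
jaccard_similarity <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  union_size <- length(union(a, b))
  # Two empty selections are treated as identical (an assumption).
  if (union_size == 0) return(1)
  length(intersect(a, b)) / union_size
}

# Hypothetical selections, for illustration only
selected_by_cor <- c("a", "b", "c", "d")
selected_by_vif <- c("b", "c", "d", "e")

# 3 shared predictors out of 5 distinct ones
jaccard_similarity(selected_by_cor, selected_by_vif)
```

A value of 1 means both functions selected exactly the same predictors; a value of 0 means the selections share none.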
The source data is a synthetic dataframe with 500 columns and 10,000 rows generated using distantia::zoo_simulate() with correlated time series (independent = FALSE).
Each iteration randomly draws a subset of 10-50 predictors and 30-100 rows per predictor, applies cor_select() with a random max_cor threshold, and then searches for the max_vif value at which vif_select() yields the selection with the highest Jaccard similarity to the cor_select() selection.
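The sampling step of a single iteration might look roughly like the base R sketch below. It is an illustration under stated assumptions rather than the code used to build this dataset: the stand-in data is uncorrelated noise with fewer columns than the original, "rows per predictor" is read as a multiplier on the number of predictors, and the range used for max_cor is a guess. The calls to cor_select() and vif_select() are left out.

```r
set.seed(1)

# Stand-in for the synthetic source data (assumption: plain random noise,
# 60 columns instead of 500; used only to show the sampling logic).
source_df <- as.data.frame(matrix(rnorm(10000 * 60), nrow = 10000, ncol = 60))

# Draw the sampling parameters for one iteration
n_predictors <- sample(10:50, size = 1)
rows_per_predictor <- sample(30:100, size = 1)
n_rows <- min(n_predictors * rows_per_predictor, nrow(source_df))

# Random correlation threshold; the 0.1-0.9 range is an assumption
max_cor <- runif(1, min = 0.1, max = 0.9)

# Randomly subset predictors (columns) and rows
predictors <- sample(colnames(source_df), size = n_predictors)
rows <- sample(nrow(source_df), size = n_rows)
subset_df <- source_df[rows, predictors, drop = FALSE]

dim(subset_df)
```

The resulting `subset_df` and `max_cor` would then be passed to cor_select(), followed by a search over candidate max_vif values for vif_select().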
Other experiments:
experiment_adaptive_thresholds,
gam_cor_to_vif,
prediction_cor_to_vif
data(experiment_cor_vs_vif)
str(experiment_cor_vs_vif)