experiment_adaptive_thresholds: Dataframe resulting from experiment to test the automatic...

experiment_adaptive_thresholds    R Documentation

Dataframe resulting from experiment to test the automatic selection of multicollinearity thresholds

Description

A dataframe summarizing 10,000 experiment iterations that validate the adaptive multicollinearity threshold system in collinear(). Each row records the characteristics of the input data and the multicollinearity metrics measured before and after filtering.

Usage

data(experiment_adaptive_thresholds)

Format

A dataframe with 10,000 rows and 9 variables:

input_rows

Number of rows in the input data subset.

input_predictors

Number of predictors in the input data subset.

output_predictors

Number of predictors retained after filtering.

input_cor_q75

75th percentile of pairwise correlations in the input data.

output_cor_q75

75th percentile of pairwise correlations in the selected predictors.

input_cor_max

Maximum pairwise correlation in the input data.

output_cor_max

Maximum pairwise correlation in the selected predictors.

input_vif_max

Maximum VIF in the input data.

output_vif_max

Maximum VIF in the selected predictors.

Details

The source data is a synthetic dataframe with 500 columns and 10,000 rows, generated with distantia::zoo_simulate() as correlated time series (independent = FALSE, seasons = 0).

Each iteration randomly subsets 10-100 predictors and 30-100 rows per predictor (300 to 10,000 rows in total), then applies collinear() with automatic threshold configuration to assess:

  • Whether output VIF stays bounded between ~2.5 and ~7.5

  • How the system adapts to different correlation structures

  • How predictor retention scales with input size
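The subsetting and measurement steps of one iteration can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code used to build the dataset: the source data is a small hypothetical stand-in rather than the output of distantia::zoo_simulate(), the filtering step with collinear() is omitted, correlations are taken in absolute value, and VIF is computed from the diagonal of the inverse correlation matrix. The helper name collinearity_summary() is invented for this sketch.

```r
set.seed(1)

# Hypothetical stand-in for the source data (assumption): 20 correlated
# columns built from one shared signal plus independent noise.
n_rows <- 2000
n_cols <- 20
shared <- rnorm(n_rows)
source_df <- as.data.frame(
  sapply(seq_len(n_cols), function(i) shared + rnorm(n_rows))
)

# One iteration: draw the number of predictors, then the rows per predictor.
# The predictor range is reduced to 10-20 to fit the smaller stand-in data.
n_predictors <- sample(10:20, size = 1)
rows_per_predictor <- sample(30:100, size = 1)
n_subset_rows <- n_predictors * rows_per_predictor

subset_df <- source_df[
  sample(nrow(source_df), n_subset_rows),
  sample(ncol(source_df), n_predictors),
  drop = FALSE
]

# Hypothetical helper: summarize multicollinearity in a numeric dataframe.
collinearity_summary <- function(df) {
  cor_matrix <- cor(df)
  # Absolute pairwise correlations, each pair counted once.
  abs_cor <- abs(cor_matrix[upper.tri(cor_matrix)])
  # VIF of each predictor is the matching diagonal element of the
  # inverse correlation matrix.
  vif <- diag(solve(cor_matrix))
  c(
    cor_q75 = unname(quantile(abs_cor, probs = 0.75)),
    cor_max = max(abs_cor),
    vif_max = max(vif)
  )
}

input_metrics <- collinearity_summary(subset_df)
print(input_metrics)
```

Applying the same helper to the predictors kept after filtering would give values comparable to the output_* columns, assuming the package computes these metrics in a similar way.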

See Also

Other experiments: experiment_cor_vs_vif, gam_cor_to_vif, prediction_cor_to_vif

Examples

data(experiment_adaptive_thresholds)
str(experiment_adaptive_thresholds)
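
Two of the questions listed under Details can be checked with short summaries of the documented columns. The sketch below is an illustration only: the helper names vif_within_bounds() and retention_ratio() are hypothetical, and mock_results contains invented values that reuse the documented column names so the code runs without the package.

```r
# Hypothetical mock rows (assumption): invented values, not real
# experiment results, using the documented column names.
mock_results <- data.frame(
  input_rows = c(300, 2400, 10000),
  input_predictors = c(10, 40, 100),
  output_predictors = c(6, 15, 22),
  input_vif_max = c(12.1, 85.3, 410.7),
  output_vif_max = c(2.8, 5.1, 8.2)
)

# Share of iterations whose maximum output VIF lies within the bounds.
vif_within_bounds <- function(df, lower = 2.5, upper = 7.5) {
  mean(df$output_vif_max >= lower & df$output_vif_max <= upper)
}

# Fraction of input predictors retained in each iteration.
retention_ratio <- function(df) {
  df$output_predictors / df$input_predictors
}

vif_within_bounds(mock_results)
retention_ratio(mock_results)
```

With the collinear package loaded, the same helpers can be applied to experiment_adaptive_thresholds in place of mock_results.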

collinear documentation built on Dec. 8, 2025, 5:06 p.m.