non_collinear_vars: Select a set of predictors with minimal multicollinearity

View source: R/non_collinear_vars.R

non_collinear_vars (R Documentation)

Description

[Stable]

Select a set of predictors with minimal multicollinearity, using the variance inflation factor (VIF) as the criterion for removing collinear variables. The algorithm will: (i) compute the VIF values from the correlation matrix of the variables selected in ...; (ii) sort the VIF values and delete the variable with the highest VIF; and (iii) repeat the procedure until the highest remaining VIF value is less than or equal to max_vif.
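The elimination procedure described above can be sketched in base R as follows. This is only an illustration and not the metan implementation: the helper name vif_prune_sketch is hypothetical, the built-in mtcars data set stands in for real data, and the VIFs are assumed to be the diagonal elements of the inverse correlation matrix.

```r
# Hypothetical helper (not part of metan): a base-R sketch of VIF-based pruning.
vif_prune_sketch <- function(data, max_vif = 10, use = "pairwise.complete.obs") {
  # Start from all numeric columns
  vars <- names(data)[vapply(data, is.numeric, logical(1))]
  removed <- character(0)
  while (length(vars) > 1) {
    cor_mat <- stats::cor(data[, vars, drop = FALSE], use = use)
    # Assumption: VIFs are the diagonal of the inverse correlation matrix
    vif <- diag(solve(cor_mat))
    names(vif) <- vars
    # Stop once every remaining variable is within the threshold
    if (max(vif) <= max_vif) break
    # Otherwise drop the variable with the highest VIF and recompute
    worst <- vars[which.max(vif)]
    removed <- c(removed, worst)
    vars <- setdiff(vars, worst)
  }
  list(selected = vars, removed = removed)
}

res <- vif_prune_sketch(mtcars, max_vif = 5)
res$selected
res$removed
```

Recomputing the correlation matrix after each deletion matters, because removing one variable changes the VIFs of all the others.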

Usage

non_collinear_vars(
  .data,
  ...,
  max_vif = 10,
  missingval = "pairwise.complete.obs"
)

Arguments

.data

The data set containing the variables.

...

Variables to be submitted to selection. If ... is empty, all the numeric variables from .data are used. It must be a single variable name or a comma-separated list of unquoted variable names.

max_vif

The maximum value for the Variance Inflation Factor (threshold) that will be accepted in the set of selected predictors.

missingval

How to deal with missing values. For more information, please see stats::cor().

Value

A data frame showing the number of selected predictors, maximum VIF value, condition number, determinant value, selected predictors and removed predictors from the original set of variables.
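For readers unfamiliar with these diagnostics, the sketch below computes them in base R for an arbitrary set of variables. It is an illustration under stated assumptions, not the code metan runs: the condition number is assumed to be the ratio of the largest to the smallest eigenvalue of the correlation matrix (a common definition), and the mtcars columns are chosen only as an example.

```r
# Illustrative diagnostics on a correlation matrix (assumed definitions, not metan's code)
cor_mat <- stats::cor(mtcars[, c("mpg", "wt", "qsec", "drat")])

# Eigenvalues of the symmetric correlation matrix
eig <- eigen(cor_mat, symmetric = TRUE, only.values = TRUE)$values

# Assumption: condition number = largest eigenvalue / smallest eigenvalue
cond_number <- max(eig) / min(eig)

# Determinant of the correlation matrix; values near 0 indicate strong collinearity
determinant_value <- det(cor_mat)

# Largest VIF, taken from the diagonal of the inverse correlation matrix
largest_vif <- max(diag(solve(cor_mat)))

c(cond_number = cond_number,
  determinant = determinant_value,
  largest_vif = largest_vif)
```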

Examples


library(metan)
# All numeric variables
non_collinear_vars(data_ge2)

# Select specific variables and set the VIF threshold to 5
non_collinear_vars(data_ge2, EH, CL, CW, KW, NKE, max_vif = 5)


metan documentation built on March 7, 2023, 5:34 p.m.