View source: R/avoid_multicollinearity.R
| remove_bycorrvif | R Documentation |
Removing variables using ViF and correlation
remove_bycorrvif(fmla, data, corrthresh, vifthresh, centrescalemains = FALSE)
fmla |
A model formula specifying a possible set of main effects |
data |
A data frame to extract the main effects from |
corrthresh |
A threshold.
Of the pair of variables with the highest pairwise correlation, the one appearing later in the model matrix
is removed. This is repeated until there are no pairwise correlations above corrthresh. |
vifthresh |
A threshold. The variable with the highest ViF is removed until no variables have ViF above vifthresh. |
centrescalemains |
If TRUE then |
The function first removes variables based on pairwise correlation, and then based on ViF.
Variables are removed one at a time.
First a variable is removed due to having high correlation, then pairwise correlation is recomputed.
This is repeated until no pairwise correlations are above the threshold corrthresh.
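The correlation step can be sketched in base R as follows. This is only an illustration, not the package's implementation: the helper name drop_by_corr is hypothetical, the sketch assumes an all-numeric matrix with no constant columns, and it compares the absolute correlation to the threshold, which is an assumption about how the comparison is done.

```r
# Hypothetical helper illustrating the correlation step (not the package's code).
# Assumes X is numeric with named, non-constant columns.
drop_by_corr <- function(X, corrthresh) {
  X <- as.matrix(X)
  repeat {
    if (ncol(X) < 2) break
    cm <- abs(cor(X))                      # assumption: absolute correlation is used
    cm[lower.tri(cm, diag = TRUE)] <- 0    # keep each pair once, with row < column
    if (max(cm) <= corrthresh) break       # stop when no pair exceeds the threshold
    idx <- which(cm == max(cm), arr.ind = TRUE)[1, ]
    # Of the most correlated pair, drop the column that appears later,
    # then loop so that correlations are recomputed on the reduced matrix.
    X <- X[, -max(idx), drop = FALSE]
  }
  colnames(X)
}

# Toy data: b is almost a copy of a, z is unrelated to both.
a <- 1:10
b <- a + rep(c(0.1, -0.1), 5)
z <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
kept <- drop_by_corr(cbind(a = a, b = b, z = z), corrthresh = 0.9)
print(kept)
```

Only one column is dropped per pass, so a variable that is highly correlated with several others is handled one pair at a time.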
Then generalised Variance Inflation Factors (ViF) are computed using car::vif().
The variable with the highest ViF is removed and ViFs are recomputed.
This is repeated until there are no ViFs higher than vifthresh.
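The ViF step follows the same remove-and-recompute pattern. The sketch below is a simplified stand-in, not the package's code: instead of the generalised ViF from car::vif(), it uses the ordinary ViF of numeric predictors, taken as the diagonal of the inverse correlation matrix. The helper name drop_by_vif is hypothetical, and the sketch assumes the correlation matrix is invertible.

```r
# Hypothetical helper illustrating the ViF step (not the package's code).
# Uses ordinary ViF, 1 / (1 - R^2_j), obtained as diag(solve(cor(X))),
# rather than the generalised ViF computed by car::vif().
drop_by_vif <- function(X, vifthresh) {
  X <- as.matrix(X)
  repeat {
    if (ncol(X) < 2) break
    vifs <- diag(solve(cor(X)))            # assumes cor(X) is not singular
    names(vifs) <- colnames(X)
    if (max(vifs) <= vifthresh) break      # stop when every ViF is within the threshold
    # Drop the single worst variable, then loop so ViFs are recomputed.
    X <- X[, -which.max(vifs), drop = FALSE]
  }
  colnames(X)
}

# Toy data: x3 is nearly the sum of x1 and x2, so all three ViFs start out large.
x1 <- 1:10
x2 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
x3 <- x1 + x2 + c(0.1, -0.2, 0.05, 0.1, -0.1, 0.2, -0.05, -0.1, 0.15, -0.15)
kept_vif <- drop_by_vif(cbind(x1 = x1, x2 = x2, x3 = x3), vifthresh = 10)
print(kept_vif)
```

Recomputing after each removal matters here: dropping one of three mutually dependent columns can bring the remaining ViFs well below the threshold.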
indata <- readRDS("./private/data/clean/7_2_10_input_data.rds")
remove_bycorrvif("~ AnnMeanTemp + AnnPrec + MaxTWarmMonth + PrecWarmQ +
MinTColdMonth + PrecColdQ + PrecSeasonality + longitude * latitude",
data = indata$insampledata$Xocc,
corrthresh = 0.9,
vifthresh = 30)