auto_cor | R Documentation |
Computes the correlation matrix among a set of predictors, orders it according to a user-defined preference order, and removes variables one by one, taking the preference order into account, until every pairwise Pearson correlation among the remaining variables is below a given threshold. Warning: variables in preference.order that are not in colnames(x), and non-numeric columns, are removed silently from x and preference.order. The same happens with rows having NA values (na.omit() is applied). The function issues a warning if zero-variance columns are found.
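The filtering described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: `cor_filter_sketch()` is a hypothetical helper that uses a simple greedy pass, keeping a variable only if its absolute Pearson correlation with every variable already kept does not exceed the threshold. Dropping zero-variance columns after the warning is an assumption of the sketch, made because their correlations are undefined.

```r
# Hypothetical helper (not part of any package): greedy correlation filter.
cor_filter_sketch <- function(x, preference.order = NULL, cor.threshold = 0.5) {
  # keep numeric columns only, then drop rows with NA values
  x <- x[, vapply(x, is.numeric, logical(1)), drop = FALSE]
  x <- stats::na.omit(x)

  # warn about zero-variance columns; dropping them is a choice of this sketch
  zero.var <- vapply(x, function(v) isTRUE(stats::var(v) == 0), logical(1))
  if (any(zero.var)) {
    warning("Zero-variance columns: ", paste(names(x)[zero.var], collapse = ", "))
    x <- x[, !zero.var, drop = FALSE]
  }

  # preferred variables first, remaining columns in their original order
  preference.order <- intersect(preference.order, colnames(x))
  candidates <- c(preference.order, setdiff(colnames(x), preference.order))

  # absolute pairwise Pearson correlations
  cor.matrix <- abs(stats::cor(x, method = "pearson"))

  # keep a candidate only if it is not too correlated with anything kept so far
  selected <- character(0)
  for (candidate in candidates) {
    if (all(cor.matrix[candidate, selected] <= cor.threshold)) {
      selected <- c(selected, candidate)
    }
  }

  list(
    cor = stats::cor(x[, selected, drop = FALSE]),
    selected.variables = selected,
    selected.variables.df = x[, selected, drop = FALSE]
  )
}

# Toy data: b is nearly a copy of a, c is uncorrelated with both
a <- 1:8
toy <- data.frame(
  a = a,
  b = a + rep(c(0.1, -0.1), 4),
  c = c(1, -1, -1, 1, 1, -1, -1, 1),
  label = "x" # non-numeric, removed before computing correlations
)

# b is preferred over a, so a is the one that gets dropped
out <- cor_filter_sketch(toy, preference.order = c("b", "a"))
out$selected.variables
```

In this toy case the preference order decides which member of the highly correlated pair survives: with `preference.order = c("b", "a")` the result keeps `b` and `c`, while leaving `preference.order` as `NULL` keeps `a` and `c`.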
auto_cor( x = NULL, preference.order = NULL, cor.threshold = 0.5, verbose = TRUE )
x |
A data frame with predictors, or the result of |
preference.order |
Character vector indicating the user's order of preference to keep variables. Doesn't need to contain If not provided, variables in |
cor.threshold |
Numeric between 0 and 1, with recommended values between 0.5 and 0.9. Maximum Pearson correlation between any pair of the selected variables. Default: |
verbose |
Logical. if |
Can be chained together with auto_vif() through pipes; see the examples below.
List with three slots:
cor: correlation matrix of the selected variables.
selected.variables: character vector with the names of the selected variables.
selected.variables.df: data frame with the selected variables.
auto_vif()
if(interactive()){

  #load data
  data(plant_richness_df)

  #on a data frame
  out <- auto_cor(x = plant_richness_df[, 5:21])

  #getting the correlation matrix
  out$cor

  #getting the names of the selected variables
  out$selected.variables

  #getting the data frame of selected variables
  out$selected.variables.df

  #on the result of auto_vif
  out <- auto_vif(x = plant_richness_df[, 5:21])
  out <- auto_cor(x = out)

  #with pipes
  out <- plant_richness_df[, 5:21] %>%
    auto_vif() %>%
    auto_cor()

}