| auto_cor | R Documentation |
Computes the correlation matrix among a set of predictors, orders it according to a user-defined preference order, and removes variables one by one, respecting that order, until the Pearson correlations among the remaining variables are below a given threshold. Warning: variables in preference.order that are not in colnames(x), as well as non-numeric columns, are silently removed from x and preference.order. Rows with NA values are dropped in the same way (na.omit() is applied). The function issues a warning if zero-variance columns are found.
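The filtering idea described above can be sketched in base R. This is an illustration only, not the package's implementation: `greedy_cor_filter()` is a hypothetical helper, it uses a forward greedy pass (keep a variable only if it is not too correlated with those already kept) rather than one-by-one removal, and it assumes that correlations are compared in absolute value and that no column has zero variance.

```r
# Illustrative sketch only; NOT the implementation behind auto_cor().
# greedy_cor_filter() is a hypothetical helper name.
greedy_cor_filter <- function(x, preference.order = NULL, cor.threshold = 0.5) {
  # keep numeric columns and complete rows, mirroring the documented preprocessing
  x <- x[, vapply(x, is.numeric, logical(1)), drop = FALSE]
  x <- stats::na.omit(x)

  # preferred variables first, then the remaining columns in their original order
  preference.order <- intersect(preference.order, colnames(x))
  candidates <- c(preference.order, setdiff(colnames(x), preference.order))

  # absolute Pearson correlations (assumption: the sign is ignored)
  cor.matrix <- abs(stats::cor(x[, candidates, drop = FALSE], method = "pearson"))

  # keep a candidate only if its correlation with every variable already kept
  # does not exceed the threshold (assumes no zero-variance columns)
  selected <- character(0)
  for (v in candidates) {
    if (length(selected) == 0 || all(cor.matrix[v, selected] <= cor.threshold)) {
      selected <- c(selected, v)
    }
  }

  # same three slots as described in the Value section
  list(
    cor = stats::cor(x[, selected, drop = FALSE], method = "pearson"),
    selected.variables = selected,
    selected.variables.df = x[, selected, drop = FALSE]
  )
}

# usage on a built-in dataset
out <- greedy_cor_filter(
  x = mtcars,
  preference.order = c("mpg", "hp"),
  cor.threshold = 0.5
)
out$selected.variables
```

In this sketch the first variable in the preference order is always kept, so the order directly decides which member of a correlated pair survives.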
auto_cor(
  x = NULL,
  preference.order = NULL,
  cor.threshold = 0.5,
  verbose = TRUE
)
x |
A data frame with predictors, or the result of |
preference.order |
Character vector indicating the user's order of preference to keep variables. Doesn't need to contain If not provided, variables in |
cor.threshold |
Numeric between 0 and 1, with recommended values between 0.5 and 0.9. Maximum Pearson correlation between any pair of the selected variables. Default: |
verbose |
Logical. If |
Can be chained with auto_vif() through pipes; see the examples below.
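Because the description notes that non-numeric columns, unknown names in preference.order, and rows with NA values are dropped silently, it may help to inspect the input before calling the function. The sketch below is a minimal pre-check under stated assumptions: `check_auto_cor_input()` is a hypothetical helper, not part of the package, and it assumes that rows with NA are counted after non-numeric columns have been discarded.

```r
# Hypothetical helper (not part of the package): reports what would be
# dropped or flagged according to the behaviour described in this help page.
check_auto_cor_input <- function(x, preference.order = NULL) {
  numeric.columns <- vapply(x, is.numeric, logical(1))
  x.numeric <- x[, numeric.columns, drop = FALSE]

  # TRUE for numeric columns whose variance is exactly zero
  zero.variance <- vapply(
    x.numeric,
    function(v) isTRUE(stats::var(v, na.rm = TRUE) == 0),
    logical(1)
  )

  list(
    non.numeric.columns = colnames(x)[!numeric.columns],
    unknown.preference = setdiff(preference.order, colnames(x)),
    rows.with.na = sum(!stats::complete.cases(x.numeric)),
    zero.variance.columns = colnames(x.numeric)[zero.variance]
  )
}

# toy data with one NA, one constant column, and one character column
toy <- data.frame(
  a = c(1, 2, NA, 4),
  b = c(5, 5, 5, 5),
  c = c("w", "x", "y", "z"),
  d = c(2, 4, 6, 9)
)

check <- check_auto_cor_input(toy, preference.order = c("d", "e"))
check
```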
List with three slots:
cor: correlation matrix of the selected variables.
selected.variables: character vector with the names of the selected variables.
selected.variables.df: data frame with the selected variables.
auto_vif()
if(interactive()){

  #load data
  data(plant_richness_df)

  #on a data frame
  out <- auto_cor(x = plant_richness_df[, 5:21])

  #getting the correlation matrix
  out$cor

  #getting the names of the selected variables
  out$selected.variables

  #getting the data frame of selected variables
  out$selected.variables.df

  #on the result of auto_vif
  out <- auto_vif(x = plant_richness_df[, 5:21])
  out <- auto_cor(x = out)

  #with pipes (the %>% operator is provided by magrittr)
  library(magrittr)
  out <- plant_richness_df[, 5:21] %>%
    auto_vif() %>%
    auto_cor()

}