remove_redundant_vars: Remove redundant variables

remove_redundant_vars    R Documentation

Remove redundant variables

Description

Remove redundant variables from a data.frame based on a correlation threshold. All pairwise intercorrelations are calculated, pairs that correlate at or above the threshold (in absolute value) are identified, and the second variable of each such pair is removed. No more variables are removed than strictly necessary.
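
The procedure described above can be approximated in base R. The following is only a minimal sketch under stated assumptions, not the package's actual implementation: the helper name drop_redundant_sketch is hypothetical, and the choices of pairwise-complete observations and of recomputing the correlations after each removal are assumptions made here for illustration.

```r
# Hypothetical helper for illustration; not the kirkegaard implementation.
drop_redundant_sketch <- function(df, threshold = 0.9, cor_method = "pearson") {
  repeat {
    # Absolute pairwise correlations (pairwise-complete use is an assumption)
    cors <- abs(cor(df, method = cor_method, use = "pairwise.complete.obs"))
    # Blank the diagonal and lower triangle so each pair is considered once
    cors[lower.tri(cors, diag = TRUE)] <- NA
    hits <- which(cors >= threshold, arr.ind = TRUE)
    # Stop once no remaining pair reaches the threshold
    if (nrow(hits) == 0) break
    # Drop the second variable (the column index) of the first flagged pair,
    # then recompute so that no variable is removed unnecessarily
    df <- df[, -hits[1, "col"], drop = FALSE]
  }
  df
}

# Usage on the numeric columns of iris
kept <- drop_redundant_sketch(iris[-5])
names(kept)
```

Recomputing after every removal is a simple way to avoid dropping a variable whose only highly correlated partner has already been removed.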

Usage

remove_redundant_vars(
  df,
  threshold = 0.9,
  cor_method = "pearson",
  messages = T
)

Arguments

df

(data.frame) A data.frame with numeric variables.

threshold

(numeric scalar) The absolute correlation at or above which a pair of variables is treated as redundant. Defaults to 0.9.

cor_method

(character scalar) The correlation method to use. Parameter is fed to cor(). Defaults to "pearson".

messages

(logical scalar) Whether to print diagnostic messages. Defaults to TRUE.
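
To preview which pairs a given threshold and cor_method would flag, the correlation matrix can be inspected directly with base R's cor(). This is a sketch for exploration only; the object names cors, flagged and pairs_flagged are illustrative and are not part of the package.

```r
# Base R only; mirrors the threshold and cor_method arguments described above
cors <- cor(iris[-5], method = "spearman")
# Blank the diagonal and lower triangle so each pair appears once
cors[lower.tri(cors, diag = TRUE)] <- NA
# Positions of pairs whose absolute correlation reaches the threshold
flagged <- which(abs(cors) >= 0.9, arr.ind = TRUE)
pairs_flagged <- data.frame(
  var1 = rownames(cors)[flagged[, "row"]],
  var2 = colnames(cors)[flagged[, "col"]],
  r    = cors[flagged]
)
pairs_flagged
```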

Examples

# Drop the Species factor (column 5), then show the first rows of the result
head(remove_redundant_vars(iris[-5]))

Deleetdk/kirkegaard documentation built on May 8, 2024, 12:27 a.m.