select_predictors: Exclude highly correlated predictors


View source: R/utils.R

Description

Exclude highly correlated predictors

Usage

select_predictors(response_vector, predictors_matrix,
  response_lower_thresh = 0.1, predictors_upper_thresh = 0.75,
  threads = 1, verbose = FALSE)

Arguments

response_vector

a numeric vector (its length must equal the number of rows of the predictors_matrix parameter)

predictors_matrix

a numeric matrix (its number of rows must equal the length of the response_vector parameter)

response_lower_thresh

a numeric value. Predictors whose correlation with the response is greater than or equal to the response_lower_thresh value are kept. The filter is applied only when the value is greater than 0.0.

predictors_upper_thresh

a numeric value. A predictor is kept only if its correlation with the other predictors is less than the predictors_upper_thresh value.

threads

a numeric value specifying the number of cores to use in parallel

verbose

either TRUE or FALSE. If TRUE then information will be printed out in the R session.

Details

The function works as follows. First, the correlation of each predictor with the response is calculated and the resulting correlations are sorted in decreasing order. Then, predictors whose correlation with another predictor is higher than the predictors_upper_thresh value are removed iteratively, favoring the predictors that are more correlated with the response variable. If the response_lower_thresh value is greater than 0.0, only predictors whose correlation with the response is higher than or equal to the response_lower_thresh value are kept; the rest are excluded. The function returns the indices of the retained predictors and is useful in the presence of multicollinearity.
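The procedure described above can be sketched in plain R. This is only an illustration under stated assumptions, not the package's implementation: the helper name greedy_select_sketch is hypothetical, and the use of absolute Pearson correlations is an assumption, since the text does not say how the sign of a correlation is treated.

```r
# Hypothetical helper, NOT part of textTinyR: a plain-R sketch of the greedy
# filtering described in Details. Using absolute Pearson correlations is an
# assumption made for this illustration.
greedy_select_sketch <- function(response_vector, predictors_matrix,
                                 response_lower_thresh = 0.1,
                                 predictors_upper_thresh = 0.75) {
  stopifnot(length(response_vector) == nrow(predictors_matrix))

  # Step 1: correlation of every predictor with the response
  resp_cor <- abs(as.vector(cor(predictors_matrix, response_vector)))

  # Step 2: candidate columns, ordered by decreasing correlation with the response
  candidates <- order(resp_cor, decreasing = TRUE)

  # Optional filter: drop predictors that are weakly correlated with the response
  if (response_lower_thresh > 0.0) {
    candidates <- candidates[resp_cor[candidates] >= response_lower_thresh]
  }

  # Step 3: keep a candidate only if its correlation with every column kept
  # so far is below the upper threshold
  keep <- integer(0)
  for (j in candidates) {
    if (length(keep) == 0) {
      keep <- j
      next
    }
    pair_cor <- abs(cor(predictors_matrix[, j], predictors_matrix[, keep]))
    if (all(pair_cor < predictors_upper_thresh)) keep <- c(keep, j)
  }
  keep
}

# Small deterministic example: column 2 is nearly collinear with column 1,
# column 3 is only weakly related to the response
resp_demo <- 1:10
matr_demo <- cbind(1:10, (1:10)^2, rep(c(1, -1), times = 5))
idx_demo <- greedy_select_sketch(resp_demo, matr_demo)
```

Because candidates are visited in order of decreasing correlation with the response, whenever two predictors are highly correlated with each other the one more strongly related to the response is the one that survives.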

Value

a vector of column-indices
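A typical use of the returned indices is to subset the predictor matrix. The snippet below is a minimal sketch that assumes the indices are ordinary 1-based R column positions; the matrix and the idx vector are made up for illustration and merely stand in for the function's input and return value.

```r
# Toy matrix with named columns; 'idx' stands in for the value returned by
# select_predictors() (hypothetical indices, chosen only for illustration)
predictors <- matrix(1:12, nrow = 3, ncol = 4,
                     dimnames = list(NULL, c("a", "b", "c", "d")))
idx <- c(3, 1)

# drop = FALSE keeps the result a matrix even when a single column is selected
reduced <- predictors[, idx, drop = FALSE]
```

The columns of reduced follow the order of idx, so if the indices are ordered by relevance that ordering is preserved.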

Examples

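library(textTinyR)

# response: 100 uniform random values
set.seed(1)
resp = runif(100)

# base predictor, drawn with a different seed than the response
set.seed(2)
col = runif(100)

# five predictors that are powers of the same column
matr = matrix(c(col, col^4, col^6, col^8, col^10), nrow = 100, ncol = 5)

# column indices of the predictors that pass the correlation filters
out = select_predictors(resp, matr, predictors_upper_thresh = 0.75)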

textTinyR documentation built on July 25, 2018, 5:03 p.m.