eliminate_features: Performs backwards variable selection using random forest...

View source: R/eliminate_features.R

Description

Performs backward variable selection using random forest variable importance, computed with the ranger package.

Usage

eliminate_features(
  df_train,
  num_vars = 15,
  num_trees = 500,
  importance_type = "permutation",
  removal_rate = 1,
  verbose = T
)

Arguments

df_train

Training data.frame containing the outcome in a column named "target". All predictor columns should be numeric, prepared with a package like vtreat or with a modelpipe prep_numeric or prep_bin function call. If the outcome is categorical, convert "target" to a factor.
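The layout described above can be sketched in base R. This is only an illustration on the built-in mtcars and iris datasets; the helper name predictors_are_numeric is hypothetical and is not part of modelpipe.

```r
# Regression case: rename the outcome column to "target".
df_reg <- mtcars
names(df_reg)[names(df_reg) == "mpg"] <- "target"

# Classification case: the outcome must be a factor named "target".
df_cls <- iris
names(df_cls)[names(df_cls) == "Species"] <- "target"
df_cls$target <- factor(df_cls$target)

# Hypothetical helper: TRUE when every column other than "target" is numeric.
predictors_are_numeric <- function(df) {
  predictors <- df[setdiff(names(df), "target")]
  all(vapply(predictors, is.numeric, logical(1)))
}

predictors_are_numeric(df_reg)
predictors_are_numeric(df_cls)
```

A check like this catches character or factor predictors before they reach the selection step.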

num_vars

Number of variables to retain.

num_trees

Number of trees used by ranger.

importance_type

Type of variable importance to compute. Must be one of 'none', 'impurity', 'impurity_corrected', 'permutation'. The default is 'permutation'.

removal_rate

Number of variables to remove at each elimination step.

verbose

TRUE prints an update each time a variable is removed.

Value

A character vector of the selected variable names.
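To make the procedure concrete, here is a minimal sketch of backward elimination driven by ranger importance. It is not the modelpipe source: the function name backward_select_sketch is hypothetical, and the sketch assumes the forest is refit after every removal step and that the least important variables are dropped first. The argument names mirror the ones documented above.

```r
library(ranger)

# Hypothetical illustration of backward selection; not the modelpipe code.
backward_select_sketch <- function(df_train, num_vars = 15, num_trees = 500,
                                   importance_type = "permutation",
                                   removal_rate = 1, verbose = TRUE) {
  # This sketch needs importance scores to rank variables.
  if (importance_type == "none") {
    stop("importance_type = 'none' gives no scores to rank variables by")
  }
  vars <- setdiff(names(df_train), "target")

  while (length(vars) > num_vars) {
    # Refit on the surviving variables (an assumption of this sketch).
    fit <- ranger(
      dependent.variable.name = "target",
      data = df_train[, c(vars, "target"), drop = FALSE],
      num.trees = num_trees,
      importance = importance_type
    )
    # Sort ascending so the least important variables come first.
    imp <- sort(fit$variable.importance)
    # Never drop below num_vars, even if removal_rate overshoots.
    n_drop <- min(removal_rate, length(vars) - num_vars)
    dropped <- names(imp)[seq_len(n_drop)]
    vars <- setdiff(vars, dropped)
    if (verbose) {
      message("Removed: ", paste(dropped, collapse = ", "),
              " (", length(vars), " remaining)")
    }
  }
  vars
}

# Example on mtcars with mpg as the outcome.
df <- mtcars
names(df)[names(df) == "mpg"] <- "target"
selected <- backward_select_sketch(df, num_vars = 3, num_trees = 50,
                                   verbose = FALSE)
```

The returned names can then be used to subset the training data, for example df[, c(selected, "target")]. A larger removal_rate means fewer forest fits, at the cost of re-ranking the variables less often.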


prescient/modelpipe documentation built on Dec. 25, 2019, 3:20 a.m.