VariableSelection-package: Identify the set of predictors of a binary outcome for...

Description Details Author(s)

Description

Several methods for variable selection in diagnostic/prognostic classification have been proposed in the literature. This package enables users to identify the appropriate set of predictors of a binary outcome of interest for diagnostic/ptognostic situations. Various approaches for selecting variables have been implemented, including Filter andWrapper approaches. Numerical and graphical output can be easily obtained. The package attempts to provide user-friendly outputs for those who are interested only in the results.

Details

Package: VariableSelection
Type: Package
Version: 1.0-0
Date: 2015-10-25
License: GPL-2

Selecting a small number of highly predictive variables generally helps to avoid over fitting and results in more efficient, applicable and comprehensible model. A large number of variable selection methods have been developed; nevertheless, there is no general consensus on the one which performs well under all conditions. Thus, for each given dataset, the best method should be chosen specifically. Unfortunately, due to unavailability of a lot of these methods in user-friendly programs, applied researchers mostly resort to easy-to-run methods and many of the effective methods remain unused. The VariableSelection package has been designed to incorporate many such methods with a clear and user-friendly output for the end-user. The only requirement to run the command is a data frame with diagnostic/prognostic predictors/markers and the outcome as columns and records/cases/individuals/patients as the rows. As the applied researcher gets more familiar with the package and the methodology, s/he can tune various available parameters used in methods. Although the package was developed for clinical practice in the first place, it can be applied to data sets from all fields. The most important functions in the package are the variable.selection(), control.selection(), print.variable.selection() and plot.variable.selection() functions. The variable.selection() function select the appropriate sets of variables according to the specified variable selection methods. More than one method can be chosen for selecting the set of variables. The control.selection() function is used to set several parameters that are specific of each method, such as number of trees in the Random Forest approach or the classification model for which the variables are selected. The print.variable.selection() and plot.variable.selection() functions produce visual friendly outputs.

Author(s)

Farideh Bagherzadeh-Khiabani, Davood Khalili


faridehbagherzadeh/VariableSelection documentation built on May 16, 2019, 10:10 a.m.