Home

/

GitHub

/

faridehbagherzadeh/VariableSelection

/

control.selection: Controll the variable selection process

ptognostic situations

Description Usage Arguments Value Author(s) References See Also Examples

Used to set various parameters controlling the variable selection process.

control.selection(disc.method = "mdlp", bins = 5, ranker.search = "BR",
  ranker.k = 20, ranker.p = 50, subset.search = "BF",
  search.backtracks = 5, model = "logistic", model.evaluation = "aic",
  consistency.measure = "Liu", correlation.measure = "Chi2",
  information.measure = "SU", CFS.measure = "SU", OneR.measure = "acc",
  RF.measure = "acc", RF.ntree = 1000, Relief.nneighbour = 5,
  Relief.nsample = 10)

`disc.method`	a character string selecting the discretization method to be used, meaningful only for variable selection methods which can not handle numerical variables; the available options are 6 supervised methods: "chi2" (using Chi2 algorithm: Liu and Setiono (1997)), "chiM" (using ChiMerge algorithm: Kerber (1992)), "disc.Topdown" (using Top-down method: Gonzalez-Abril and et al. (2009)), "extendChi2" (using the Extended Chi2 algorithm: Su and Hsu (2005)), "mdlp" (using Minimum Description Length Principle: Fayyad and Irani (1993)), "modChi2" (using Modified Chi2 method: Tay and Shen (2002)) and 2 unsupervised methods: "EB" (using Equal Binning method), "EF" (using Equal Frequency method) The default is "mdlp".
`bins`	a positive integer meaningful only in the "EB" and "EF" discretization method. It specifies the number of bins. The default is 5.
`ranker.search`	a character vector selecting the methods to search and transform the ranking of variables to a subset of variables used in ranker variable selection methods: "BR" (Best Rank search which selectes a subset of the highest ranked variables among all possible subsets of highest ranked variables); "FR" (Fixed Rank search which selectes a fixed number of highest ranked variables); The default is "BR".
`ranker.k`	a positive integer meaningful only in FR indicating the number of variables to be selected. Default is 20.
`ranker.p`	a numeric between 0 and 1 meaningful only in PR indicating the percent of variables to be selected. The default is 50.
`subset.search`	a character vector selecting the search method used in subset selector variable selection methods: "SF" (Sequential Forward search); "SB" (Sequential Backward search); "HC" (Hill Climbing search); "BF" (Best First Search) (Dash and Liu (1997)). DThe default is "BF".
`search.backtracks`	number of backtracks when applying a best first search (Dash and Liu (1997)).
`model`	a character string selecting the prediction model for which the variables are selected; currently the only available option is "logistic". The default is "logistic".
`model.evaluation`	a character string selecting the model evaluation measure; currently the only available option is "aic" (Akaike Information Criterion). The default is "aic".
`consistency.measure`	a character vector selecting the measure used for consistency; currently the only available option is "Liu" (Arauzo-Azofra and et al. (2008)). The default is "Liu".
`correlation.measure`	a character vector selecting the measure used for correlation; currently the available options are "Chi2" (Chi-squared statistic) and "CramerV" (Harald Cramer (1946)). The default is "Chi2".
`information.measure`	a character vector selecting the measure used for information; currently the available options are "IG" (Information Gain); "GR" (Gain Ratio) and "SU" (Symmetrical Uncertainty) (Novakovic (2011)). The default is "SU".
`CFS.measure`	a character vector selecting the dependency measure used in CFS. Options are "IG", "GR", "SU", "Chi2" and "CramerV" (Ladha and Deepa (2011)). The default is "SU".
`OneR.measure`	character vector selecting the OneR tree evaluation criterion (Jensen and Shen (2009)). The default is "acc" (accuracy).
`RF.measure`	a character vector specifying the type of importance measure used in random forest. Options are "acc" (Accuracy) and "imp" (Impurity) (Livingston (2005)). The default is "acc".
`RF.ntree`	an integer meaningful only in random forest method to set the number of trees. The default is 1000.
`Relief.nneighbour`	an integer meaningful only in Relief to set the number of neighbours to find for every sampled instance (Cehovin (2010)). The default is 5.
`Relief.nsample`	an integer meaningful only in Relief to set number of instances to sample (Cehovin (2010)). The default is 10.

A list with components for each of the possible arguments.

Farideh Bagherzadeh-Khiabani.

Liu, H. and Setiono, R. (1997). Feature selection and discretization, IEEE transactions on knowledge and data engineering, 9(4), 642–645.

Kerber, R. (1992). ChiMerge : Discretization of numeric attributes, In Proceedings of the Tenth National Conference on Artificial Intelligence, 123–128.

Gonzalez-Abril, L., Cuberos, F. J., Velasco, F. and Ortega, J. A. (2009) Ameva: An autonomous discretization algorithm, Expert Systems with Applications, 36, 5327–5332.

Su, C. T. and Hsu, J. H. (2005). An Extended Chi2 Algorithm for Discretization of Real Value Attributes, IEEE transactions on knowledge and data engineering, 17, 437–441.

Fayyad, U. M. and Irani, K. B.(1993). Multi-interval discretization of continuous-valued attributes for classification learning, Artificial intelligence, 13, 1022–1027.

Tay, F. E. H. and Shen, L. (2002). Modified Chi2 Algorithm for Discretization, IEEE Transactions on knowledge and data engineering, 14, 666–670.

Dash, M. and H. Liu (1997). Feature selection for classification. Intelligent Data Analysis, 1(30), 131–156.

Arauzo-Azofra, A., J.M. Benitez, and J.L. Castro. Consistency measures for feature selection (2008). Journal of Intelligent Information Systems, 30(3), 273–292.

Cramer, H. (1946). Mathematical Methods of Statistics. Press, Princeton, NJ, 367–369

Novakovic, J., Strbac, P., & Bulatovic, D. (2011). Toward optimal feature selection using ranking methods and classification algorithms. Yugoslav Journal of Operations Research ISSN: 0354-0243 EISSN: 2334-6043, 21(1).

Ladha, L. and T. Deepa (2011). Feature selection methods and algorithms. International Journal on Computer Science and Engineering, 3(5), 1787–1797.

Jensen, R. and Q. Shen (2009). Feature selection for aiding glass forensic evidence analysis. Intelligent Data Analysis, 13(5), 703–723.

???Livingston, F., Implementation of Breiman's random forest machine learning algorithm. ECE591Q Machine Learning Journal Paper, 2005.

???Cehovin, L., Empirical evaluation of feature selection methods in classification. 2010.

variable.selection

library(VariableSelection)
 data(tlgs)

 ###########################################
 # Variable Selection with "wrapper" method:
 object <- variable.selection (input=names(tlgs)[-1],
 target=names(tlgs)[1], data=tlgs, methods=c("wrapper"),
 control = control.selection())
 print(object)

 ###########################################
 # Variable Selection with "wrapper" method:
 # Change the Search Method used in Variable Selection through parameters of control.selection
 object <- variable.selection (input=names(tlgs)[-1],
 target=names(tlgs)[1], data=tlgs,
 methods=c("wrapper"), control = control.selection(subset.search="HC" ))

 print(object)

faridehbagherzadeh/VariableSelection documentation built on May 16, 2019, 10:10 a.m.

faridehbagherzadeh/VariableSelection index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

faridehbagherzadeh/VariableSelection
Identify the appropriate set of predictors of a binary outcome of interest for diagnostic/ptognostic situations

control.selection: Controll the variable selection process
In faridehbagherzadeh/VariableSelection: Identify the appropriate set of predictors of a binary outcome of interest for diagnostic/ptognostic situations

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

Related to control.selection in faridehbagherzadeh/VariableSelection...

R Package Documentation

Browse R Packages

We want your feedback!

faridehbagherzadeh/VariableSelection Identify the appropriate set of predictors of a binary outcome of interest for diagnostic/ptognostic situations

control.selection: Controll the variable selection process In faridehbagherzadeh/VariableSelection: Identify the appropriate set of predictors of a binary outcome of interest for diagnostic/ptognostic situations

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

Related to control.selection in faridehbagherzadeh/VariableSelection...

R Package Documentation

Browse R Packages

We want your feedback!

faridehbagherzadeh/VariableSelection
Identify the appropriate set of predictors of a binary outcome of interest for diagnostic/ptognostic situations

control.selection: Controll the variable selection process
In faridehbagherzadeh/VariableSelection: Identify the appropriate set of predictors of a binary outcome of interest for diagnostic/ptognostic situations