View source: R/feature.boruta.R
Wrapper around the Boruta package. Boruta is a so-called all-relevant feature selection wrapper that can work with any classifier that outputs a variable importance measure (VIM). This function ensures that the input data are provided correctly and can optionally run convenience steps, e.g. producing a regression formula as output.
feature.boruta.comp(target, predictors, fixNA = F, roughFix = F,
  variables = F, selected = F, formula = F, tentative = F,
  pValue = 0.01, mcAdj = T, maxRuns = 100, doTrace = 0,
  holdHistory = T, getImp = Boruta::getImpRfZ, verbose = F, ...)
target | Response vector; factor for classification, numeric vector for regression.
predictors |
fixNA |
roughFix |
variables |
selected |
formula |
tentative |
pValue | Confidence level. Default value should be used. Default is 0.01.
mcAdj | If set to
maxRuns | Maximal number of importance source runs. You may increase it to resolve attributes left tentative. Default is 100.
doTrace | Verbosity level. 0 means no tracing, 1 means reporting decision about each attribute as soon as it is justified, 2 means same as 1, plus reporting each importance source run. Default is 0.
holdHistory | If set to
getImp | Function used to obtain attribute importance. The default is
verbose |
The method first saves the name of the original target parameter so that it can be reused for formula creation later on. If the fixNA switch is TRUE, all observations containing NA values are removed. If this removes every observation, an error is raised. Before the Boruta algorithm is executed, the key input parameters target and predictors are checked via the feature.boruta.fixNA method. If any issues with the input are found (wrong data types, differing lengths, NAs), an appropriate error is thrown.
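The NA handling described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name drop_incomplete is hypothetical, and predictors is assumed to be a data frame.

```r
# Hypothetical helper (not part of the package): drop every observation
# that has an NA in the target or in any predictor column.
drop_incomplete <- function(target, predictors) {
  # Target and predictors must describe the same observations.
  stopifnot(length(target) == nrow(predictors))
  keep <- stats::complete.cases(target, predictors)
  if (!any(keep)) {
    stop("All observations contain NA values; nothing is left to analyse.")
  }
  list(target = target[keep],
       predictors = predictors[keep, , drop = FALSE])
}

# Toy data: rows 2 and 3 are incomplete and will be removed.
y <- c(1.5, NA, 3.2, 4.8)
x <- data.frame(a = c(1, 2, NA, 4), b = c("u", "v", "w", "x"))
clean <- drop_incomplete(y, x)
```

Filtering target and predictors with the same logical index keeps the two aligned, which matters because Boruta expects one response value per predictor row.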
Next, the actual Boruta::Boruta algorithm is executed with the provided parameters. Boruta then iteratively compares the importance of the original attributes with that of shadow attributes. Attributes that perform significantly worse than the shadow attributes are rejected; those performing significantly better are confirmed.
Since the Boruta algorithm might not converge within the given maxRuns iterations, Boruta::TentativeRoughFix can be used to resolve attributes that are still undecided (provided roughFix is TRUE).
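A run followed by the rough fix might look like the sketch below. It calls the Boruta package directly rather than this wrapper, uses the built-in iris data purely as a stand-in, and assumes Boruta is installed (the block is skipped otherwise). The low maxRuns value is chosen only to keep the example fast.

```r
# Assumes the Boruta package is installed; skipped otherwise.
fixed <- NULL
if (requireNamespace("Boruta", quietly = TRUE)) {
  set.seed(42)
  # A small maxRuns may leave some attributes tentative.
  res <- Boruta::Boruta(x = iris[, 1:4], y = iris$Species,
                        pValue = 0.01, mcAdj = TRUE, maxRuns = 15,
                        doTrace = 0, holdHistory = TRUE)
  # Decide any remaining tentative attributes; a warning is issued
  # when there is nothing left to resolve, so it is suppressed here.
  fixed <- suppressWarnings(Boruta::TentativeRoughFix(res))
  print(table(fixed$finalDecision))
}
```

TentativeRoughFix relies on the stored importance history, so holdHistory is left at TRUE in this sketch.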
Finally, depending on the values of the variables and formula switches, a formula will be created and/or the confirmed/rejected/tentative attributes are appended to the returned Boruta object.
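How a formula could be assembled from a saved target name and a set of attribute decisions is sketched below in base R. The column names and decisions are made up for illustration, and which categories the wrapper actually puts into its formula is an assumption here (confirmed plus tentative).

```r
# Hypothetical inputs: the saved target name and one decision per predictor.
target_name <- "SalePrice"
decision <- c(LotArea = "Confirmed", YearBuilt = "Confirmed",
              PoolQC = "Rejected", Fence = "Tentative")

# Split the predictor names by decision category.
confirmed <- names(decision)[decision == "Confirmed"]
selected  <- names(decision)[decision %in% c("Confirmed", "Tentative")]

# Build a formula with the target on the left-hand side.
f <- stats::reformulate(termlabels = selected, response = target_name)
print(f)
```

Using reformulate avoids pasting strings together by hand and returns a proper formula object that can be passed to modelling functions.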
A Boruta object, as returned by the underlying Boruta::Boruta method. This default return value can include several extensions, depending on parameters like formula:
target | The name of the target vector.
variables | Variable names of all three categories (Confirmed, Tentative, Rejected).
selected | Variable names of confirmed and tentative variables in one vector.
formula | Formula of the form
Miron B. Kursa, Witold R. Rudnicki (2010). Feature Selection with the Boruta Package. Journal of Statistical Software, 36(11), p. 1-13. URL: http://www.jstatsoft.org/v36/i11/
feature.boruta.checkInputParams
KaggleHouse:::feature.boruta(
  target = data_train_na$SalePrice, predictors = data_train_na[-81],
  fixNA = TRUE, roughFix = TRUE, verbose = TRUE
)