sbfControl: Control Object for Selection By Filtering (SBF)

Description

Controls the execution of models with simple filters for feature selection

Usage

sbfControl(functions = NULL, 
           method = "boot", 
           saveDetails = FALSE, 
           number = ifelse(method %in% c("cv", "repeatedcv"), 10, 25),
           repeats = ifelse(method %in% c("cv", "repeatedcv"), 1, number),
           verbose = FALSE, 
           returnResamp = "final", 
           p = 0.75, 
           index = NULL, 
           timingSamps = 0,
           seeds = NA,
           allowParallel = TRUE)
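
The defaults for number and repeats depend on method. The sketch below simply re-evaluates the two ifelse() expressions from the usage above in base R; resolve_defaults is a hypothetical helper written for illustration and is not part of caret.

```r
# Hypothetical helper: mirrors the default expressions shown in the usage block.
resolve_defaults <- function(method = "boot") {
  number  <- ifelse(method %in% c("cv", "repeatedcv"), 10, 25)
  repeats <- ifelse(method %in% c("cv", "repeatedcv"), 1, number)
  c(number = number, repeats = repeats)
}

resolve_defaults("boot")  # number = 25, repeats = 25
resolve_defaults("cv")    # number = 10, repeats = 1
```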

Arguments

functions

a list of functions for model fitting, prediction and variable filtering (see Details below)

method

The external resampling method: boot, cv, repeatedcv, LOOCV or LGOCV (for repeated training/test splits)

number

Either the number of folds or number of resampling iterations

repeats

For repeated k-fold cross-validation only: the number of complete sets of folds to compute

saveDetails

a logical to save the predictions and variable importances from the selection process

verbose

a logical to print a log for each external resampling iteration

returnResamp

A character string indicating how much of the resampled summary metrics should be saved. Values can be “final” or “none”

p

For leave-group out cross-validation: the training percentage

index

a list with elements for each external resampling iteration. Each list element is the sample rows used for training at that iteration.

timingSamps

the number of training set samples that will be used to measure the time for predicting samples (zero indicates that the prediction time should not be estimated).

seeds

an optional set of integers that will be used to set the seed at each resampling iteration. This is useful when the models are run in parallel. A value of NA will stop the seed from being set within the worker processes while a value of NULL will set the seeds using a random set of integers. Alternatively, a vector of integers can be used. The vector should have B+1 elements where B is the number of resamples. See the Examples section below.

allowParallel

if a parallel backend is loaded and available, should the function use it?
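
As an illustration of the index argument described above, the following base R sketch builds a list with one element per resampling iteration, each holding the row numbers used for training. The data size, the number of resamples, the 75% training fraction and the element names are all assumptions chosen for the example.

```r
# Hypothetical data size and number of resamples (assumptions for illustration).
n_rows      <- 50
n_resamples <- 5

set.seed(123)
# One list element per resampling iteration; each element holds the
# row numbers of the samples used for training at that iteration.
index <- lapply(seq_len(n_resamples), function(i) {
  sort(sample(seq_len(n_rows), size = floor(0.75 * n_rows)))
})
names(index) <- paste0("Resample", seq_len(n_resamples))

str(index)
```

A list of this shape could then be supplied as sbfControl(index = index).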

Details

More details on this function can be found at http://caret.r-forge.r-project.org/featureselection.html.

Simple filter-based feature selection requires functions to be specified for some operations.

The fit function builds the model based on the current data set. The arguments for the function must be:

The function should return a model object that can be used to generate predictions.

The pred function returns a vector of predictions (numeric or factors) from the current model. The arguments are:

The score function is used to return a vector of scores with names for each predictor (such as a p-value). Inputs are:

The function should return a vector, as previously stated. Examples are given by anovaScores for classification and gamScores for regression.

The filter function is used to return a logical vector with names for each predictor (TRUE indicates that the predictor should be retained). Inputs are:

The function should return a named logical vector.

Examples of these functions are included in the package: caretSBF, lmSBF, rfSBF, treebagSBF, ldaSBF and nbSBF.
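
To make the four roles concrete, here is a minimal base R sketch of a function list with fit, pred, score and filter elements. The list name mySBF and the argument names (x, y, object, score) are assumptions modeled on lists such as lmSBF, and the sketch assumes the score function is applied to one predictor at a time. The packaged lists may carry further elements (for example, a performance summary function) that are omitted here, so this is not a drop-in replacement for them.

```r
# Hypothetical function list; argument names are assumptions, not a caret spec.
mySBF <- list(
  # Fit a linear model on the predictors that survived filtering.
  fit = function(x, y, ...) {
    dat <- as.data.frame(x)
    dat$.outcome <- y
    lm(.outcome ~ ., data = dat)
  },
  # Return a numeric vector of predictions for new samples.
  pred = function(object, x) {
    unname(predict(object, newdata = as.data.frame(x)))
  },
  # Score a single predictor: p-value of its slope in a univariate regression.
  score = function(x, y) {
    summary(lm(y ~ x))$coefficients[2, 4]
  },
  # Retain predictors whose p-value is at most 0.05.
  filter = function(score, x, y) score <= 0.05
)

# Exercise the pieces by hand on a built-in data set.
x <- mtcars[, c("wt", "hp", "qsec", "drat")]
y <- mtcars$mpg

scores <- sapply(x, mySBF$score, y = y)    # named vector of p-values
keep   <- mySBF$filter(scores, x, y)       # named logical vector
model  <- mySBF$fit(x[, keep, drop = FALSE], y)
preds  <- mySBF$pred(model, x[, keep, drop = FALSE])
```

Running the pieces by hand like this is a quick way to check that score returns one value per predictor and that filter returns a named logical vector before handing a list to sbfControl.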

The web page http://caret.r-forge.r-project.org/ has more details and examples related to this function.

Value

a list that echoes the specified arguments

Author(s)

Max Kuhn

See Also

sbf, caretSBF, lmSBF, rfSBF, treebagSBF, ldaSBF and nbSBF

Examples

## Not run: 
library(caret)
data(BloodBrain)

## Use a GAM as the filter, then fit a random forest model
RFwithGAM <- sbf(bbbDescr, logBBB,
                 sbfControl = sbfControl(functions = rfSBF,
                                         verbose = FALSE, 
                                         seeds = sample.int(100000, 11),
                                         method = "cv"))
RFwithGAM

## End(Not run)
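
The seeds argument expects B+1 integers, where B is the number of resamples. The sketch below shows one way such a vector might be built for 10-fold cross-validation (B = 10); the variable names and the upper bound of 100000 are illustrative choices, not requirements.

```r
# B is the number of resamples; 10 matches the default number for method = "cv".
n_resamples <- 10

set.seed(42)  # makes the generated seeds themselves reproducible
seeds <- sample.int(100000, n_resamples + 1)

length(seeds)  # 11, i.e. B + 1

# The vector could then be passed on, for example:
# ctrl <- sbfControl(functions = rfSBF, method = "cv", seeds = seeds)
```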

caret documentation built on May 2, 2019, 5:47 p.m.
