Controls the execution of models with simple filters for feature selection
sbfControl(functions = NULL,
           method = "boot",
           saveDetails = FALSE,
           number = ifelse(method %in% c("cv", "repeatedcv"), 10, 25),
           repeats = ifelse(method %in% c("cv", "repeatedcv"), 1, number),
           verbose = FALSE,
           returnResamp = "final",
           p = 0.75,
           index = NULL,
           timingSamps = 0,
           seeds = NA,
           allowParallel = TRUE)
functions
a list of functions for model fitting, prediction and variable filtering (see Details below)
method
The external resampling method, for example "boot", "cv" or "repeatedcv"
number
Either the number of folds or number of resampling iterations
repeats
For repeated k-fold cross-validation only: the number of complete sets of folds to compute
saveDetails
a logical to save the predictions and variable importances from the selection process
verbose
a logical to print a log for each external resampling iteration
returnResamp
A character string indicating how much of the resampled summary metrics should be saved. Values can be “final” or “none”
p
For leave-group out cross-validation: the training percentage
index
a list with elements for each external resampling iteration. Each list element is the sample rows used for training at that iteration.
timingSamps
the number of training set samples that will be used to measure the time for predicting samples (zero indicates that the prediction time should not be estimated).
seeds
an optional set of integers that will be used to set the seed at each resampling iteration. This is useful when the models are run in parallel. A value of |
allowParallel
if a parallel backend is loaded and available, should the function use it?
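As an illustration of the index argument, the sketch below builds a list of training rows for 5-fold cross-validation using base R only. The sample size, fold count and element names are made up for the example; the resulting list is the kind of object that could be passed as sbfControl(index = index).

```r
# Hypothetical example: training-row indices for 5-fold CV on 20 samples.
set.seed(3)
n <- 20  # number of samples (assumed)
k <- 5   # number of folds (assumed)

# Randomly assign each sample to one of k folds of equal size
folds <- sample(rep(seq_len(k), length.out = n))

# Each list element holds the rows used for training at that iteration,
# i.e. every row that is not in the held-out fold
index <- lapply(seq_len(k), function(i) which(folds != i))
names(index) <- paste0("Fold", seq_len(k))
```

Each sample is held out exactly once, so every element of index contains 16 of the 20 rows.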
More details on this function can be found at http://caret.r-forge.r-project.org/featureselection.html.
Simple filter-based feature selection requires functions to be specified for some operations.
The fit function builds the model based on the current data set. The arguments for the function must be:
x
the current training set of predictor data with the appropriate subset of variables (i.e. after filtering)
y
the current outcome data (either a numeric or factor vector)
...
optional arguments to pass to the fit function in the call to sbf
The function should return a model object that can be used to generate predictions.
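A minimal sketch of a fit function with this signature, assuming a linear regression model is wanted. The name lmFit and the simulated data are hypothetical and only serve to show the expected arguments and return value.

```r
# Hypothetical fit function: x holds only the predictors that passed the
# filter, y is the outcome, and ... carries extra arguments from sbf.
lmFit <- function(x, y, ...) {
  dat <- as.data.frame(x)
  dat$.outcome <- y
  # Return a model object that predict() can be called on later
  lm(.outcome ~ ., data = dat, ...)
}

# Simulated data, made up for illustration
set.seed(1)
x <- data.frame(a = rnorm(20), b = rnorm(20))
y <- 2 * x$a + rnorm(20)
mod <- lmFit(x, y)
```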
The pred function returns a vector of predictions (numeric or factors) from the current model. The arguments are:
object
the model generated by the fit function
x
the current set of predictors for the held-back samples
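A matching pred function could look like the sketch below, assuming the model object supports predict() with a newdata argument (as lm does). The name lmPred and the data are hypothetical.

```r
# Hypothetical pred function: returns a vector of predictions for the
# held-back samples in x, using the model produced by the fit function.
lmPred <- function(object, x) {
  predict(object, newdata = as.data.frame(x))
}

# Simulated training and held-back data, made up for illustration
set.seed(2)
train <- data.frame(a = rnorm(30), b = rnorm(30))
train$y <- train$a - train$b + rnorm(30)
mod <- lm(y ~ a + b, data = train)

heldOut <- data.frame(a = rnorm(5), b = rnorm(5))
preds <- lmPred(mod, heldOut)
```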
The score function is used to return a vector of scores with names for each predictor (such as a p-value). Inputs are:
x
the predictors for the training samples
y
the current training outcomes
The function should return a vector, as previously stated. Examples are given by anovaScores for classification and gamScores for regression.
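As a rough sketch of a score function, the code below computes one p-value per predictor from a correlation test. It assumes x arrives as a data frame of numeric training predictors and y is numeric, as the description above suggests; the name corScores is hypothetical and this is not one of the scoring functions shipped with the package.

```r
# Hypothetical score function: p-value of a correlation test between each
# predictor and the outcome (numeric predictors and outcome assumed).
corScores <- function(x, y) {
  x <- as.data.frame(x)
  # vapply keeps the column names, so the result is a named numeric vector
  vapply(x, function(col) cor.test(col, y)$p.value, numeric(1))
}

# Simulated data, made up for illustration: only column a drives y
set.seed(4)
x <- data.frame(a = rnorm(40), b = rnorm(40), c = rnorm(40))
y <- 3 * x$a + rnorm(40)
scores <- corScores(x, y)
```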
The filter function is used to return a logical vector with names for each predictor (TRUE indicates that the predictor should be retained). Inputs are:
score
the output of the score function
x
the predictors for the training samples
y
the current training outcomes
The function should return a named logical vector.
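A filter function can be as small as a threshold on the scores. The sketch below assumes the scores are p-values and keeps predictors at or below 0.05; the name pFilter, the cutoff and the example scores are illustrative choices only.

```r
# Hypothetical filter function: keep predictors whose score (assumed to be
# a p-value) is at or below 0.05. x and y are accepted but unused here.
pFilter <- function(score, x, y) {
  # The comparison preserves the names of score, giving a named logical vector
  score <= 0.05
}

# Made-up scores for three predictors
scores <- c(a = 0.001, b = 0.20, c = 0.049)
keep <- pFilter(scores, x = NULL, y = NULL)
```

Here keep is TRUE for a and c and FALSE for b.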
Examples of these functions are included in the package: caretSBF, lmSBF, rfSBF, treebagSBF, ldaSBF and nbSBF.
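One way to reuse the bundled lists is to copy one and override a single element. The sketch below copies rfSBF and swaps in a stricter filter, assuming the scores are p-values; the 0.01 cutoff, the fold count and the name myFuncs are illustrative. The code is wrapped in a check so it only runs when caret is installed.

```r
# Sketch only: start from a bundled list and replace one function.
if (requireNamespace("caret", quietly = TRUE)) {
  myFuncs <- caret::rfSBF

  # Assumed: scores are p-values, so a lower cutoff retains fewer predictors
  myFuncs$filter <- function(score, x, y) score <= 0.01

  # The control object echoes the arguments it was given
  ctrl <- caret::sbfControl(functions = myFuncs, method = "cv", number = 5)
}
```

The resulting ctrl object would then be passed to sbf through its sbfControl argument.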
The web page http://caret.r-forge.r-project.org/ has more details and examples related to this function.
A list that echoes the specified arguments
Max Kuhn
sbf, caretSBF, lmSBF, rfSBF, treebagSBF, ldaSBF and nbSBF
## Not run:
library(caret)
data(BloodBrain)

## Use a GAM as the filter, then fit a random forest model
RFwithGAM <- sbf(bbbDescr, logBBB,
                 sbfControl = sbfControl(functions = rfSBF,
                                         verbose = FALSE,
                                         seeds = sample.int(100000, 11),
                                         method = "cv"))
RFwithGAM
## End(Not run)