VSURF_pred | R Documentation |
The prediction step refines the selection made by the interpretation step (VSURF_interp) by eliminating redundancy in the set of selected variables, for prediction purposes. This is the third step of the VSURF function.
VSURF_pred(x, ...)
## Default S3 method:
VSURF_pred(
  x,
  y,
  err.interp,
  varselect.interp,
  ntree.pred = 100,
  nfor.pred = 10,
  nmj = 1,
  RFimplem = "randomForest",
  parallel = FALSE,
  ncores = detectCores() - 1,
  verbose = TRUE,
  ntree = NULL,
  ...
)
## S3 method for class 'formula'
VSURF_pred(formula, data, ..., na.action = na.fail)
x , formula |
A data frame or a matrix of predictors, the columns represent the variables. Or a formula describing the model to be fitted. |
... |
Other parameters to be passed on to the |
y |
A response vector (must be a factor for classification problems and numeric for regression ones). |
err.interp |
A vector of the mean OOB error rates of the embedded random
forest models built during the interpretation step (value |
varselect.interp |
A vector of indices of variables selected after interpretation step. |
ntree.pred |
Number of trees of each forest grown. |
nfor.pred |
Number of forests grown. |
nmj |
Number of times the mean jump is multiplied. See details below. |
RFimplem |
Choice of the random forests implementation to use:
"randomForest" (default), "ranger" or "Rborist" (note that if "Rborist" is
chosen, "randomForest" will still be used for the first step
|
parallel |
A logical indicating whether VSURF should run in parallel on
multiple cores (defaults to FALSE). If a vector of length 3 is given,
each coordinate is passed to each intermediate function: |
ncores |
Number of cores to use. Default is set to the number of cores detected by R minus 1. |
verbose |
A logical indicating whether information about the method's progress (including progress bars for each step) should be printed (defaults to TRUE). Adds a small overhead. |
ntree |
(deprecated) Number of trees in each forest grown for "thresholding step". |
data |
a data frame containing the variables in the model. |
na.action |
A function to specify the action to be taken if NAs are
found. (NOTE: If given, this argument must be named, and as
|
nfor.pred embedded random forest models are grown, starting with the random forest built with only the most important variable. Variables are then added to the model in a stepwise manner. The mean jump value mean.jump is calculated using the variables that were left out by the interpretation step, and is set to the mean absolute difference between the mean OOB errors of one model and the model that immediately follows it. A variable is then included in the model only if the decrease in mean OOB error is larger than nmj * mean.jump.
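The inclusion rule described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the error values are made up, `n.selected` and `include_variable` are hypothetical names, and the choice of which consecutive models enter the mean is an assumption based on the description.

```r
# Hypothetical mean OOB errors of the nested models from the interpretation
# step (made-up numbers, one per model of increasing size).
err.interp <- c(0.30, 0.12, 0.08, 0.07, 0.075, 0.072, 0.074)
n.selected <- 4  # hypothetical: the interpretation step kept the first 4 variables
nmj <- 1

# Assumption: the mean jump is the mean absolute difference between
# consecutive errors over the models involving the left-out variables.
left.out.err <- err.interp[n.selected:length(err.interp)]
mean.jump <- mean(abs(diff(left.out.err)))

# A candidate variable is kept only if adding it lowers the mean OOB error
# by more than nmj * mean.jump.
include_variable <- function(err.current, err.candidate, nmj, mean.jump) {
  (err.current - err.candidate) > nmj * mean.jump
}

include_variable(0.30, 0.12, nmj, mean.jump)   # large decrease in error
include_variable(0.08, 0.079, nmj, mean.jump)  # very small decrease in error
```

Raising nmj makes the test stricter, so fewer variables pass it.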
Note that the mtry parameter of randomForest is set to its default value (see randomForest) if nvm, the number of variables in the model, is not greater than the number of observations, and to nvm/3 otherwise. This ensures the quality of the OOB error estimates across the embedded RF models.
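As a rough sketch of the mtry rule above, the hypothetical helper below (`choose_mtry` is not part of VSURF) returns the value that would be used for a model with nvm variables. It assumes the randomForest defaults of floor(sqrt(p)) for classification and max(floor(p/3), 1) for regression, and it returns nvm/3 unrounded, as the text states it.

```r
# Hypothetical helper mirroring the rule described above; not a VSURF function.
choose_mtry <- function(nvm, n.obs, type = c("classif", "reg")) {
  type <- match.arg(type)
  if (nvm <= n.obs) {
    # Assumed randomForest defaults for p = nvm predictors.
    if (type == "classif") floor(sqrt(nvm)) else max(floor(nvm / 3), 1)
  } else {
    # More variables than observations: nvm/3, as stated in the text.
    nvm / 3
  }
}

choose_mtry(4, 150, "classif")  # fewer variables than observations
choose_mtry(300, 100, "reg")    # more variables than observations
```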
An object of class VSURF_pred, which is a list with the following components:
varselect.pred |
A vector of indices of variables selected after "prediction step". |
err.pred |
A vector of the mean OOB error rates of the random forest models built during the "prediction step". |
mean.jump |
The mean jump value computed during the "prediction step". |
num.varselect.pred |
The number of selected variables. |
nmj |
Value of the parameter in the call. |
comput.time |
Computation time. |
RFimplem |
The RF implementation used to run
|
call |
The original call to |
terms |
Terms associated to the formula (only if formula-type call was used). |
Robin Genuer, Jean-Michel Poggi and Christine Tuleau-Malot
Genuer, R. and Poggi, J.M. and Tuleau-Malot, C. (2010), Variable selection using random forests, Pattern Recognition Letters 31(14), 2225-2236
Genuer, R. and Poggi, J.M. and Tuleau-Malot, C. (2015), VSURF: An R Package for Variable Selection Using Random Forests, The R Journal 7(2):19-33
VSURF
data(iris)
iris.thres <- VSURF_thres(iris[,1:4], iris[,5])
iris.interp <- VSURF_interp(iris[,1:4], iris[,5],
                            vars = iris.thres$varselect.thres)
iris.pred <- VSURF_pred(iris[,1:4], iris[,5],
                        err.interp = iris.interp$err.interp,
                        varselect.interp = iris.interp$varselect.interp)
iris.pred
## Not run:
# A more interesting example with toys data (see toys)
# (a few minutes to execute)
data(toys)
toys.thres <- VSURF_thres(toys$x, toys$y)
toys.interp <- VSURF_interp(toys$x, toys$y,
                            vars = toys.thres$varselect.thres)
toys.pred <- VSURF_pred(toys$x, toys$y, err.interp = toys.interp$err.interp,
                        varselect.interp = toys.interp$varselect.interp)
toys.pred
## End(Not run)