Calculate variable importance (VIMP) for a single variable or group of variables for training or test data.
object |
An object of class |
xvar.names |
Names of the x-variables to be used. If not specified all variables are used. |
outcome.target |
Character vector for multivariate families specifying the target outcomes to be used. The default is to use all coordinates. |
importance |
Type of VIMP. |
joint |
Individual or joint VIMP? |
subset |
Vector indicating which rows of the grow data to
restrict VIMP calculations to; i.e. this option yields VIMP which is
restricted to a specific subset of the data. Note that the vector
should correspond to the rows of |
seed |
Negative integer specifying seed for the random number generator. |
do.trace |
Number of seconds between updates to the user on approximate time to completion. |
... |
Further arguments passed to or from other methods. |
Using a previously grown forest, calculate the VIMP for variables xvar.names. By default, VIMP is calculated for the original data, but the user can specify new test data for the VIMP calculation using newdata. Depending upon the option importance, VIMP is calculated either by random daughter assignment or by random permutation of the variable(s). The default is Breiman-Cutler permutation VIMP. See rfsrc for more details.
Joint VIMP is requested using joint. The joint VIMP is the importance for a group of variables when the group is perturbed simultaneously.
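The difference between individual and joint permutation VIMP can be sketched in base R. This is an illustration of the general idea only, not the package's algorithm: it uses a plain linear model and in-sample error as a stand-in for a forest and its OOB error, and the helper names mse and perm_importance are hypothetical. Permuting the grouped columns with one shared row permutation is likewise an assumption made for the sketch.

```r
# Illustrative sketch only: permutation importance for a linear model.
# This is NOT how randomForestSRC computes VIMP internally.
set.seed(1)
dat <- na.omit(airquality)
fit <- lm(Ozone ~ ., data = dat)

# Mean squared prediction error of a model on a data frame (hypothetical helper)
mse <- function(model, data) mean((data$Ozone - predict(model, data))^2)

# Permute the rows of the named columns together, then report the
# increase in error relative to the unpermuted data (hypothetical helper)
perm_importance <- function(model, data, vars) {
  permuted <- data
  idx <- sample(nrow(data))
  permuted[vars] <- data[idx, vars, drop = FALSE]
  mse(model, permuted) - mse(model, data)
}

xvars <- setdiff(names(dat), "Ozone")

# Individual importance: one variable perturbed at a time
individual <- sapply(xvars, function(v) perm_importance(fit, dat, v))

# Joint importance: a group of variables perturbed simultaneously
joint <- perm_importance(fit, dat, c("Temp", "Wind"))

print(individual)
print(joint)
```

The individual result has one value per variable, whereas the joint result is a single value for the whole group.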
An object of class (rfsrc, predict), which is a list with the following key components:
err.rate |
OOB error rate for the ensemble restricted to the subsetted data. |
importance |
Variable importance (VIMP). |
Hemant Ishwaran and Udaya B. Kogalur
Ishwaran H. (2007). Variable importance in binary regression trees and forests, Electronic J. Statist., 1:519-537.
## Not run:
## ------------------------------------------------------------
## classification example
## showcase different vimp
## ------------------------------------------------------------
iris.obj <- rfsrc(Species ~ ., data = iris)
# Breiman-Cutler permutation vimp
print(vimp(iris.obj)$importance)
# Breiman-Cutler random daughter vimp
print(vimp(iris.obj, importance = "random")$importance)
# Breiman-Cutler joint permutation vimp
print(vimp(iris.obj, joint = TRUE)$importance)
# Breiman-Cutler paired vimp
print(vimp(iris.obj, c("Petal.Length", "Petal.Width"), joint = TRUE)$importance)
print(vimp(iris.obj, c("Sepal.Length", "Petal.Width"), joint = TRUE)$importance)
## ------------------------------------------------------------
## regression example
## compare Breiman-Cutler vimp to ensemble based vimp
## ------------------------------------------------------------
airq.obj <- rfsrc(Ozone ~ ., airquality)
vimp.all <- cbind(
  ensemble = vimp(airq.obj, importance = "permute.ensemble")$importance,
  breimanCutler = vimp(airq.obj, importance = "permute")$importance)
print(vimp.all)
## ------------------------------------------------------------
## regression example
## calculate VIMP on test data
## ------------------------------------------------------------
set.seed(100080)
train <- sample(1:nrow(airquality), size = 80)
airq.obj <- rfsrc(Ozone ~ ., airquality[train, ])
# training data vimp
print(airq.obj$importance)
print(vimp(airq.obj)$importance)
# test data vimp
print(vimp(airq.obj, newdata = airquality[-train, ])$importance)
## ------------------------------------------------------------
## survival example
## study how vimp depends on tree imputation
## makes use of the subset option
## ------------------------------------------------------------
data(pbc, package = "randomForestSRC")
# determine which records have missing values
which.na <- apply(pbc, 1, function(x){any(is.na(x))})
# impute the data using na.action = "na.impute"
pbc.obj <- rfsrc(Surv(days, status) ~ ., pbc, nsplit = 3,
                 na.action = "na.impute", nimpute = 1)
# compare vimp based on records with no missing values
# to those that have missing values
# note the option na.action="na.impute" in the vimp() call
vimp.not.na <- vimp(pbc.obj, subset = !which.na, na.action = "na.impute")$importance
vimp.na <- vimp(pbc.obj, subset = which.na, na.action = "na.impute")$importance
print(data.frame(vimp.not.na, vimp.na))
## End(Not run)