vimp.rfsrc | R Documentation |
Calculate variable importance (VIMP) for a single variable or group of variables for training or test data.
## S3 method for class 'rfsrc'
vimp(object, xvar.names,
importance = c("anti", "permute", "random"), block.size = 10,
joint = FALSE, seed = NULL, do.trace = FALSE, ...)
object |
An object of class |
xvar.names |
Character vector of x-variable names to be evaluated. If not specified, all variables are used. |
importance |
Type of variable importance (VIMP) to compute. |
block.size |
Integer specifying the number of trees per block used for VIMP calculation. Balances between ensemble-level and tree-level estimates. |
joint |
Logical indicating whether to compute joint VIMP for the specified variables. |
seed |
Negative integer used to set the random number generator seed. |
do.trace |
Number of seconds between printed progress updates. |
... |
Additional arguments passed to or from other methods. |
Using a previously trained forest, this function calculates variable importance (VIMP) for the specified variables in xvar.names
. By default, VIMP is computed using the original training data, but the user may supply a new test set via the newdata
argument. See rfsrc
for further details on how VIMP is computed.
If joint = TRUE
, joint VIMP is returned. This is defined as the importance of a group of variables when the entire group is perturbed simultaneously.
Setting csv = TRUE
returns case-specific VIMP, which provides VIMP estimates at the individual observation level. This applies to all families except survival. See examples below.
An object of class (rfsrc, predict)
containing importance
values.
Hemant Ishwaran and Udaya B. Kogalur
Ishwaran H. (2007). Variable importance in binary regression trees and forests, Electronic J. Statist., 1:519-537.
holdout.vimp.rfsrc
,
rfsrc
## ------------------------------------------------------------
## classification example
## showcase different vimp
## ------------------------------------------------------------
iris.obj <- rfsrc(Species ~ ., data = iris)
## anti vimp (default)
print(vimp(iris.obj)$importance)
## anti vimp using brier prediction error
print(vimp(iris.obj, perf.type = "brier")$importance)
## permutation vimp
print(vimp(iris.obj, importance = "permute")$importance)
## random daughter vimp
print(vimp(iris.obj, importance = "random")$importance)
## joint anti vimp
print(vimp(iris.obj, joint = TRUE)$importance)
## paired anti vimp
print(vimp(iris.obj, c("Petal.Length", "Petal.Width"), joint = TRUE)$importance)
print(vimp(iris.obj, c("Sepal.Length", "Petal.Width"), joint = TRUE)$importance)
## ------------------------------------------------------------
## survival example
## anti versus permute VIMP with different block sizes
## ------------------------------------------------------------
data(pbc, package = "randomForestSRC")
pbc.obj <- rfsrc(Surv(days, status) ~ ., pbc)
print(vimp(pbc.obj)$importance)
print(vimp(pbc.obj, block.size=1)$importance)
print(vimp(pbc.obj, importance="permute")$importance)
print(vimp(pbc.obj, importance="permute", block.size=1)$importance)
## ------------------------------------------------------------
## imbalanced classification example
## see the imbalanced function for more details
## ------------------------------------------------------------
data(breast, package = "randomForestSRC")
breast <- na.omit(breast)
f <- as.formula(status ~ .)
o <- rfsrc(f, breast, ntree = 2000)
## permutation vimp
print(100 * vimp(o, importance = "permute")$importance)
## anti vimp using gmean performance
print(100 * vimp(o, perf.type = "gmean")$importance[, 1])
## ------------------------------------------------------------
## regression example
## ------------------------------------------------------------
airq.obj <- rfsrc(Ozone ~ ., airquality)
print(vimp(airq.obj))
## ------------------------------------------------------------
## regression example where vimp is calculated on test data
## ------------------------------------------------------------
set.seed(100080)
train <- sample(1:nrow(airquality), size = 80)
airq.obj <- rfsrc(Ozone~., airquality[train, ])
## training data vimp
print(airq.obj$importance)
print(vimp(airq.obj)$importance)
## test data vimp
print(vimp(airq.obj, newdata = airquality[-train, ])$importance)
## ------------------------------------------------------------
## case-specific vimp
## returns VIMP for each case
## ------------------------------------------------------------
o <- rfsrc(mpg~., mtcars)
v <- vimp(o, csv = TRUE)
csvimp <- get.mv.csvimp(v, standardize=TRUE)
print(csvimp)
## ------------------------------------------------------------
## case-specific joint vimp
## returns joint VIMP for each case
## ------------------------------------------------------------
o <- rfsrc(mpg~., mtcars)
v <- vimp(o, joint = TRUE, csv = TRUE)
csvimp <- get.mv.csvimp(v, standardize=TRUE)
print(csvimp)
## ------------------------------------------------------------
## case-specific joint vimp for multivariate regression
## returns joint VIMP for each case, for each outcome
## ------------------------------------------------------------
o <- rfsrc(Multivar(mpg, cyl) ~., data = mtcars)
v <- vimp(o, joint = TRUE, csv = TRUE)
csvimp <- get.mv.csvimp(v, standardize=TRUE)
print(csvimp)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.