variableImportance: Permutation-based Variable Importance Measures

View source: R/variableImportance.R

variableImportance R Documentation

Permutation-based Variable Importance Measures

Description

variableImportance produces permutation-based variable importance measures. Currently it supports only binary classification models from the package randomForest and only the performance measure AUROC.

Usage

variableImportance(
  object = NULL,
  xdata = NULL,
  ydata = NULL,
  CV = 3,
  measure = "AUROC",
  sort = TRUE
)

Arguments

object

A model. Currently only binary classification models from the package randomForest are supported.

xdata

A data frame containing the predictors for the model.

ydata

A factor containing the response variable.

CV

Cross-validation. The number of times the data are permuted and the decrease in performance is calculated; the mean of these decreases is then taken. Use a higher CV for very small samples to ensure stability.

measure

The performance measure. Currently only Area Under the Receiver Operating Characteristic Curve (AUROC) is supported.

sort

Logical. Should the results be sorted from high to low?

Details

Currently only binary classification models from randomForest and only the AUROC measure are supported. Definition of MeanDecreaseAUROC: first, the AUROC of the entire ensemble is recorded on the provided xdata. Then, for each variable separately, that variable is permuted and the AUROC is recorded again. The AUROC after permutation is subtracted from the AUROC before permutation; this difference is the Decrease in AUROC. Repeating the permutation CV times and averaging the differences gives the Mean Decrease in AUROC.
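The definition of MeanDecreaseAUROC can be illustrated with a minimal sketch. This is not the package's implementation: the helper names auroc and perm_importance, the output column names, the rank-based AUROC formula, and the choice of the second factor level as the positive class are all assumptions made for illustration.

```r
library(randomForest)

# Rank-based AUROC (Mann-Whitney form). `positive` is the factor level
# treated as the positive class (an assumption of this sketch).
auroc <- function(score, truth, positive) {
  pos <- truth == positive
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  r <- rank(score)  # average ranks, so ties are handled
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical helper, not part of interpretR.
perm_importance <- function(model, xdata, ydata, reps = 3) {
  positive <- levels(ydata)[2]
  score <- function(x) predict(model, x, type = "prob")[, positive]
  # AUROC of the whole ensemble on the unpermuted data
  baseline <- auroc(score(xdata), ydata, positive)
  decrease <- sapply(names(xdata), function(v) {
    # Permute one variable at a time, `reps` times, and average the decrease
    mean(replicate(reps, {
      permuted <- xdata
      permuted[[v]] <- sample(permuted[[v]])
      baseline - auroc(score(permuted), ydata, positive)
    }))
  })
  data.frame(VariableName = names(decrease),
             MeanDecreaseAUROC = unname(decrease))
}

set.seed(1)
dat <- iris[1:100, ]
dat$Species <- factor(ifelse(dat$Species == "setosa", 0, 1))
ind <- sample(nrow(dat), 50)
rf <- randomForest(Species ~ ., dat[ind, ])
imp <- perm_importance(rf, dat[-ind, names(dat) != "Species"],
                       dat[-ind, "Species"])
imp[order(imp$MeanDecreaseAUROC, decreasing = TRUE), ]
```

Here reps plays the role that CV plays in variableImportance: a larger value averages over more permutations and should give steadier estimates on small samples.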

Value

A data frame containing the variable names and the mean decrease in AUROC.

Author(s)

Authors: Michel Ballings and Dirk Van den Poel. Maintainer: Michel.Ballings@GMail.com

See Also

parDepPlot

Examples

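# Load the packages used in this example
library(interpretR)
library(randomForest)
# Prepare data: keep two species and recode the response as a binary factor
data(iris)
iris <- iris[1:100, ]
iris$Species <- as.factor(ifelse(iris$Species == "setosa", 0, 1))
# Estimate the model on a random half of the rows
ind <- sample(nrow(iris), 50)
rf <- randomForest(Species ~ ., iris[ind, ])
# Obtain variable importances on the held-out rows
variableImportance(object = rf,
                   xdata = iris[-ind, names(iris) != "Species"],
                   ydata = iris[-ind, ]$Species)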

interpretR documentation built on Aug. 20, 2023, 1:07 a.m.