GiniImportanceForest: computes inbag and OOB Gini importance averaged over all...

GiniImportanceForest    R Documentation

Computes in-bag and out-of-bag (OOB) Gini importance averaged over all trees in a forest

Description

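Workhorse function of this package. It computes in-bag and OOB Gini importance scores and aggregates them across the trees of a random forest.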

Usage

GiniImportanceForest(RF, data, ylabel = "Survived", zeroLeaf = TRUE, 
    agg = c("mean", "median", "none")[1], score = c("PMDI21", 
        "MDI", "MDA", "MIA")[1], Predictor = Mode, verbose = 0)

Arguments

RF

object returned by call to randomForest()

data

data used to train the RF. NOTE: assumes the forest was trained with keep.inbag=TRUE

ylabel

name of dependent variable

zeroLeaf

if TRUE, discard the information gain due to splits resulting in n=1

agg

method of aggregating importance scores across trees. If "none", return the raw arrays (for debugging)

score

scoring method: MDI = mean decrease impurity (Gini), MDA = mean decrease accuracy (permutation), MIA = mean increase accuracy

Predictor

function used to estimate the node prediction, such as Mode, mean, or median. Alternatively, pass an array of numbers as a replacement for the yHat column of tree

verbose

level of verbosity
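To make the score and Predictor arguments more concrete, the following base R sketch shows how a Gini impurity decrease for a single split and a mode-based node prediction could be computed. The helper names gini_impurity, gini_decrease, and mode_value are hypothetical and are not part of rfVarImpOOB; the package's internal computation may differ.

```r
# Hypothetical helpers for illustration only; not the package's implementation.

# Gini impurity of a vector of class labels: 1 - sum_k p_k^2
gini_impurity <- function(y) {
  if (length(y) == 0) return(0)
  p <- table(y) / length(y)
  1 - sum(p^2)
}

# Weighted impurity decrease when `parent` is split into `left` and `right`
gini_decrease <- function(parent, left, right) {
  n <- length(parent)
  gini_impurity(parent) -
    (length(left) / n) * gini_impurity(left) -
    (length(right) / n) * gini_impurity(right)
}

# Most frequent value in a node (ties go to the value seen first)
mode_value <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

y_parent <- c(0, 0, 1, 1)
gini_impurity(y_parent)                    # 0.5
gini_decrease(y_parent, c(0, 0), c(1, 1))  # 0.5, a perfect split
mode_value(c(1, 0, 1, 1))                  # 1
```

Averaging such decreases per variable over all splits of a tree gives an MDI-style score. Whether the decrease is evaluated on in-bag or OOB observations is what distinguishes the variants named in the title.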

Value

matrix with variable importance scores and their standard deviations
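As a rough illustration of what the agg argument does, the sketch below aggregates a made-up matrix of per-tree scores into one score and one standard deviation per variable. The function aggregate_scores, the column names score and sd, and the numbers are assumptions for illustration; the layout of the matrix actually returned by GiniImportanceForest may differ.

```r
# Hypothetical per-tree scores: one row per tree, one column per variable.
per_tree <- matrix(
  c(0.10, 0.30, 0.00,
    0.20, 0.10, 0.10,
    0.30, 0.20, 0.05),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("tree", 1:3), c("Sex", "Pclass", "PassengerId"))
)

# Aggregate across trees; "none" returns the raw per-tree matrix unchanged.
aggregate_scores <- function(scores, agg = c("mean", "median", "none")) {
  agg <- match.arg(agg)
  if (agg == "none") return(scores)
  center <- apply(scores, 2, if (agg == "mean") mean else median)
  cbind(score = center, sd = apply(scores, 2, sd))
}

aggregate_scores(per_tree, "mean")
aggregate_scores(per_tree, "median")
```

With only a handful of trees, as in the example with ntree=5, the standard deviations are noisy, so comparing the mean and median aggregates can be a useful sanity check.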

Author(s)

Markus Loecher <Markus.Loecher@gmail.com>

Examples






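data("titanic_train", package = "rfVarImpOOB",  envir = environment())
set.seed(123)
# draw a random subsample of 300 rows to keep the example fast
ranRows = sample(nrow(titanic_train), 300)
data = titanic_train[ranRows,]

# keep.inbag=TRUE is required so that in-bag and OOB rows can be identified per tree
RF = randomForest::randomForest(formula = Survived ~ Sex + Pclass + PassengerId,
                                data = data,
                                ntree = 5, importance = TRUE,
                                mtry = 3, keep.inbag = TRUE,
                                nodesize = 20)
# recode the response as numeric 0/1 (assumes Survived is a factor with two levels)
data$Survived = as.numeric(data$Survived) - 1
VI_Titanic = GiniImportanceForest(RF, data, ylabel = "Survived")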



rfVarImpOOB documentation built on July 1, 2022, 5:05 p.m.