GiniImportanceForest: computes inbag and OOB Gini importance averaged over all...

Description

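Workhorse function of this package: computes in-bag and out-of-bag (OOB) Gini importance for the trees of a random forest and aggregates the scores across trees.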

Usage

GiniImportanceForest(RF, data, ylabel = "Survived",
    zeroLeaf = TRUE, agg = c("mean", "median", "none")[1],
    score = c("PMDI21", "MDI", "MDA", "MIA")[1], Predictor = mean,
    correctBias = c(inbag = TRUE, outbag = TRUE), ImpTypes = 0:5,
    verbose = 0)

Arguments

RF

object returned by call to randomForest()

data

data that was used to train the RF. NOTE: assumes that keep.inbag=TRUE was set while training

ylabel

name of dependent variable

zeroLeaf

if TRUE, discard the information gain due to splits resulting in n=1

agg

method of aggregating importance scores across trees. If "none", the raw arrays are returned (for debugging)

score

scoring method: MDI = mean decrease impurity (Gini), MDA = mean decrease accuracy (permutation), MIA = mean increase accuracy

Predictor

function to estimate node prediction, such as Mode or mean or median. Alternatively, pass an array of numbers as replacement for the yHat column of tree

correctBias

named logical vector with entries inbag and outbag; where TRUE, multiply by n/(n-1) for sample variance correction

ImpTypes

which scores should be computed

verbose

level of verbosity
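
As a rough illustration of what correctBias and agg describe, the sketch below computes the Gini impurity of a single node with and without the n/(n-1) factor, and then aggregates a small matrix of per-tree scores. The helpers gini_impurity and aggregate_scores and the score values are hypothetical and made up for illustration; they are not part of rfVarImpOOB and may differ from what the package does internally.

```r
# Hypothetical helper (not part of rfVarImpOOB): Gini impurity of a node
# holding class labels y, optionally rescaled by n/(n-1).
gini_impurity <- function(y, correct = TRUE) {
  n <- length(y)
  p <- table(y) / n                # class proportions in the node
  g <- 1 - sum(p^2)                # plain Gini impurity
  if (correct && n > 1) g <- g * n / (n - 1)
  g
}

# Hypothetical helper: aggregate a trees-by-variables score matrix.
aggregate_scores <- function(scores, agg = c("mean", "median", "none")) {
  agg <- match.arg(agg)
  switch(agg,
         mean   = colMeans(scores),
         median = apply(scores, 2, median),
         none   = scores)          # raw per-tree scores, e.g. for debugging
}

y <- c(0, 0, 1, 1)
gini_impurity(y, correct = FALSE)  # 0.5
gini_impurity(y, correct = TRUE)   # 0.5 * 4/3

# Made-up per-tree scores, for illustration only.
scores <- rbind(tree1 = c(Sex = 0.10, Pclass = 0.04),
                tree2 = c(Sex = 0.14, Pclass = 0.02),
                tree3 = c(Sex = 0.12, Pclass = 0.09))
aggregate_scores(scores, "mean")
aggregate_scores(scores, "median")
```

The correction matters most for small nodes, where n/(n-1) is furthest from 1.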

Value

matrix with variable importance scores and their standard deviations

Author(s)

Markus Loecher <Markus.Loecher@gmail.com>

Examples

data("titanic_train", package = "rfVarImpOOB",  envir = environment())
set.seed(123)
# draw a random subset of 300 rows to keep the example fast
ranRows=sample(nrow(titanic_train), 300)
data=titanic_train[ranRows,]

# keep.inbag=TRUE is required by GiniImportanceForest
RF = randomForest::randomForest(formula = Survived ~ Sex + Pclass + PassengerId,
                                data=data,
                                ntree=5,importance=TRUE,
                                mtry=3,keep.inbag=TRUE,
                                nodesize = 20)
# recode the response as numeric 0/1
data$Survived = as.numeric(data$Survived)-1
VI_Titanic = GiniImportanceForest(RF, data,ylabel="Survived")
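
The example sets keep.inbag=TRUE so that the forest stores, for every tree, which training rows were drawn into its bootstrap sample. The sketch below shows how in-bag and OOB rows can be separated from such a matrix. The helper oob_rows and the toy matrix are hypothetical and not part of rfVarImpOOB; with the forest from the example, one would presumably pass RF$inbag instead.

```r
# Hypothetical helper (not part of rfVarImpOOB): for each tree, return the
# row indices that were never drawn into its bootstrap sample.
oob_rows <- function(inbag) {
  lapply(seq_len(ncol(inbag)), function(k) which(inbag[, k] == 0))
}

# Toy in-bag count matrix: 5 observations (rows) x 2 trees (columns).
# Each entry is the number of times a row was drawn for that tree.
inbag <- cbind(c(2, 0, 1, 0, 2),
               c(0, 1, 1, 3, 0))

oob <- oob_rows(inbag)
oob[[1]]  # rows 2 and 4 are OOB for tree 1
oob[[2]]  # rows 1 and 5 are OOB for tree 2
```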

markusloecher/rfVarImpOOB documentation built on July 5, 2020, 6:50 p.m.