| MDIoobTree | R Documentation |
Calculate the MDI-oob feature importance measure.
MDIoobTree(tidy.RF, tree, trainX, trainY)
MDIoob(tidy.RF, trainX, trainY)
| tidy.RF | A tidy random forest. The random forest to calculate MDI-oob from. |
| tree | An integer. The index of the tree to look at. |
| trainX | A data frame. Train set features, such that the |
| trainY | A data frame. Train set responses, such that the |
Mean decrease in impurity (MDI) has long been known to assign inflated importance to noisy features, which systematically biases feature selection. To address this, Li et al. proposed MDI-oob, a debiased MDI feature importance measure computed on out-of-bag samples. They report state-of-the-art feature selection performance on both simulated and real data.
See vignette('MDI', package='tree.interpreter') for more context.
A matrix. The content depends on the type of the response.
Regression: A P-by-1 matrix, where P is the number of features in
trainX. The pth row contains the MDI-oob of feature p.
Classification: A P-by-D matrix, where P is the number of features
in trainX and D is the number of response classes. The entry in row p,
column d is the MDI-oob of feature p with respect to class d. To get
a single MDI-oob value per feature, call rowSums on the result.
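The aggregation described for the classification case can be sketched as follows. This is a minimal illustration, not output from the package: the matrix `mdi.oob` and its values are made up to mimic the P-by-D shape that the text says `MDIoob` returns, and the row and column names are borrowed from the iris data used in the examples.

```r
# Hypothetical P-by-D matrix (P = 4 features, D = 3 classes).
# The numbers are invented for illustration; they are NOT real MDIoob() output.
mdi.oob <- matrix(
  c(0.02, 0.01, 0.03,
    0.00, 0.01, 0.01,
    0.15, 0.10, 0.12,
    0.14, 0.11, 0.13),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
    c("setosa", "versicolor", "virginica")
  )
)

# Sum across the class columns to get one MDI-oob value per feature.
per.feature <- rowSums(mdi.oob)

# Rank features from most to least important.
ranked <- sort(per.feature, decreasing = TRUE)
print(ranked)
```

With a real result, `mdi.oob` would be replaced by the matrix returned from `MDIoob`. For regression the result has a single column, so `rowSums` simply returns that column as a named vector.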
MDIoobTree: Debiased mean decrease in impurity within a single tree
MDIoob: Debiased mean decrease in impurity within the whole forest
Li et al. A Debiased MDI Feature Importance Measure for Random Forests. https://arxiv.org/abs/1906.10845
MDI
vignette('MDI', package='tree.interpreter')
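library(ranger)
library(tree.interpreter)
# keep.inbag = TRUE stores each tree's in-bag counts, which identify its out-of-bag samples
rfobj <- ranger(Species ~ ., iris, keep.inbag = TRUE)
tidy.RF <- tidyRF(rfobj, iris[, -5], iris[, 5])
# MDI-oob within the first tree
MDIoobTree(tidy.RF, 1, iris[, -5], iris[, 5])
# MDI-oob within the whole forest
MDIoob(tidy.RF, iris[, -5], iris[, 5])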