tree.interpreter: Random Forest Prediction Decomposition and Feature Importance...

Description

An R re-implementation of the 'treeinterpreter' package on PyPI <https://pypi.org/project/treeinterpreter/>. Each prediction can be decomposed as 'prediction = bias + feature_1_contribution + ... + feature_n_contribution'. This decomposition is then used to calculate two feature importance measures, Mean Decrease Impurity (MDI) and Mean Decrease Impurity using out-of-bag samples (MDI-oob), following the work of Li et al. (2019) <arXiv:1906.10845>.

tidyRF

The function tidyRF converts a randomForest or ranger object into a package-agnostic random forest object. All other functions in this package operate on such a tidyRF object.

The featureContrib and trainsetBias families

The featureContrib and trainsetBias families decompose the predictions of regression or classification trees and forests into a bias component and per-feature contribution components.
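To make the decomposition concrete, here is a minimal base-R sketch for a single hand-built regression tree. It does not use this package's functions: the toy data, the `splits` list, and the `decompose` helper are all hypothetical, and for simplicity the same split is applied to every node at a given depth. The bias is the mean response at the root, and each split credits the change in node mean to the feature it splits on.

```r
# Toy training set (made up for illustration): two features, numeric response.
train <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(0, 1, 0, 1, 0, 1),
  y  = c(10, 12, 11, 13, 20, 24)
)

# Hypothetical tree: split on x1 <= 4 first, then on x2 <= 0.5.
splits <- list(
  list(feature = "x1", threshold = 4),
  list(feature = "x2", threshold = 0.5)
)

# Walk the path of one observation and attribute each change in the
# node mean to the feature used for that split.
decompose <- function(obs, train, splits) {
  node <- train
  bias <- mean(node$y)                 # mean response at the root
  contrib <- c(x1 = 0, x2 = 0)
  for (s in splits) {
    go_left <- obs[[s$feature]] <= s$threshold
    in_left <- node[[s$feature]] <= s$threshold
    child <- node[in_left == go_left, ]  # rows that follow the same branch
    contrib[s$feature] <- contrib[s$feature] + mean(child$y) - mean(node$y)
    node <- child
  }
  list(bias = bias, contrib = contrib, prediction = mean(node$y))
}

res <- decompose(list(x1 = 2, x2 = 1), train, splits)
print(res)

# The identity prediction = bias + sum of feature contributions holds.
stopifnot(isTRUE(all.equal(res$prediction, res$bias + sum(res$contrib))))
```

For a forest, the same quantities would be averaged over trees; the package's own functions should be preferred for real models.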

The MDI and MDIoob families

The MDI family calculates the classic MDI feature importance measure, which is known to suffer from feature selection bias. MDI-oob is a debiased variant of MDI that uses out-of-bag samples; Li et al. (2019) report state-of-the-art feature selection performance for it on both simulated and real data. It can be calculated with functions from the MDIoob family.
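As a rough illustration of what MDI measures, the base-R sketch below computes it by hand for one small regression tree, using variance as the impurity. The toy data, the tree, and the `impurity` and `decrease` helpers are hypothetical and independent of this package, whose functions may scale or aggregate the result differently. Each split contributes its impurity decrease, weighted by the fraction of training samples reaching the node, to the feature it splits on.

```r
# Toy training set (made up for illustration).
train <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(0, 1, 0, 1, 0, 1),
  y  = c(10, 12, 11, 13, 20, 24)
)

# Impurity of a regression node: variance of the response around the node mean.
impurity <- function(y) mean((y - mean(y))^2)

# Weighted impurity decrease from splitting `node` on `feature` at `threshold`.
decrease <- function(node, feature, threshold, n_total) {
  left  <- node[node[[feature]] <= threshold, ]
  right <- node[node[[feature]] >  threshold, ]
  child <- (nrow(left) * impurity(left$y) +
            nrow(right) * impurity(right$y)) / nrow(node)
  nrow(node) / n_total * (impurity(node$y) - child)
}

# Hypothetical tree: the root splits on x1 <= 4, both children split on x2 <= 0.5.
n <- nrow(train)
left  <- train[train$x1 <= 4, ]
right <- train[train$x1 >  4, ]

mdi <- c(
  x1 = decrease(train, "x1", 4, n),
  x2 = decrease(left, "x2", 0.5, n) + decrease(right, "x2", 0.5, n)
)
print(mdi)
```

This sketch evaluates the decreases on the same samples that define the tree. It does not attempt MDI-oob, which relies on out-of-bag samples.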

Examples

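library(ranger)
library(tree.interpreter)
# Fit a forest; keep.inbag = TRUE stores how often each observation is in-bag in each tree
rfobj <- ranger(mpg ~ ., mtcars, keep.inbag = TRUE)
# Convert to a tidyRF object using the training features and response
tidy.RF <- tidyRF(rfobj, mtcars[, -1], mtcars[, 1])
# Compute the MDI-oob feature importance
MDIoob(tidy.RF, mtcars[, -1], mtcars[, 1])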

nalzok/tree.interpreter documentation built on Jan. 29, 2020, 5:48 p.m.