This method calculates feature contributions for a given dataset and an existing Random Forest model (randomForest). The feature contributions
are computed separately for each instance/record in the dataset and provide detailed information about the relationships
between variables and the predicted value.
This method was implemented based upon the approach of Kuz'min et al. for regression models and extended to classification models.
For a binary classification model the method returns the feature contributions towards class "one". For a multi-class model,
the feature contributions are calculated towards the class predicted by the randomForest model for a given instance.
The method does not work for unsupervised models.
The randomForest model must have a stored in-bag matrix, which keeps track of
which samples were used to build each tree in the forest, and the model must be generated by sampling without replacement.
Hence, all Random Forest models analyzed by this method must be generated as follows:
model <- randomForest(...,keep.inbag=TRUE,replace=FALSE)
This limitation exists because, in the randomForest
implementation of Random Forest provided by Liaw and Wiener, the inbag
matrix does not record how many times a sample was used to build a particular tree when sampling with replacement.
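The requirements above can be checked before computing contributions. The following is a minimal sketch, not part of the package: the helper name check_rf_preconditions is hypothetical, and it assumes that the fitted object exposes the components type, inbag and call, and that replace was passed as a literal value in the original call.

```r
# Hypothetical helper (not part of the package): collects violations of the
# documented preconditions for a fitted randomForest object.
check_rf_preconditions <- function(model) {
  if (!inherits(model, "randomForest")) {
    return("object is not of class 'randomForest'")
  }
  problems <- character(0)
  # Assumption: the model type is stored in the 'type' component
  if (identical(model$type, "unsupervised")) {
    problems <- c(problems, "unsupervised models are not supported")
  }
  # Assumption: keep.inbag=TRUE stores the in-bag matrix in the 'inbag' component
  if (is.null(model$inbag)) {
    problems <- c(problems, "no in-bag matrix stored; refit with keep.inbag=TRUE")
  }
  # Assumption: 'replace' was written as a literal FALSE in the original call;
  # a value passed through a variable is reported as a problem here
  if (!identical(model$call$replace, FALSE)) {
    problems <- c(problems, "model was not fitted with a literal replace=FALSE")
  }
  problems
}
```

An empty character vector means that none of the checked requirements is violated; otherwise each element describes one problem.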
featureContributions(object, lInc, dataT, mClass=NULL)
object | an object of the class randomForest
lInc | local increments of feature contributions calculated for this object using getLocalIncrements
dataT | a data frame containing the variables in the model (columns) for all instances (rows) for which feature contributions are desired
mClass | the name of the class towards which feature contributions are calculated. The class name must match one of the class names in the randomForest object
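A call that sets mClass explicitly might look like the sketch below. It is illustrative only: it assumes the functions are provided by a package installed under the name rfFC, it uses the iris data with an arbitrarily chosen target class, and it is skipped when either package is unavailable.

```r
# Class towards which contributions are requested (arbitrary choice for illustration)
target_class <- "versicolor"
# The class name must be one of the class labels the model is trained on
stopifnot(target_class %in% levels(iris$Species))

# Assumption: the package providing getLocalIncrements() and
# featureContributions() is installed under the name 'rfFC'
if (requireNamespace("randomForest", quietly = TRUE) &&
    requireNamespace("rfFC", quietly = TRUE)) {
  rF <- randomForest::randomForest(x = iris[, -5], y = iris[, 5],
                                   ntree = 25, keep.inbag = TRUE, replace = FALSE)
  li <- rfFC::getLocalIncrements(rF, iris[, -5])
  fc <- rfFC::featureContributions(rF, li, iris[, -5], mClass = target_class)
  str(fc)  # inspect the structure of the returned list
}
```

When mClass is left as NULL, the target class follows the default behaviour described for binary and multi-class models.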
A list with the following components:
contrib |
|
Anna Palczewska <annawojak@gmail.com> and
Richard Marchese Robinson <rmarcheserobinson@gmail.com>
V.E. Kuz'min et al. (2011), Interpretation of QSAR Models Based on Random Forest Methods, Molecular Informatics, 30, 593-603.
A. Palczewska et al. (2013), Interpreting random forest models using a feature contribution method, Proceedings of the 2013 IEEE 14th International Conference on Information Reuse and Integration (IEEE IRI 2013), August 14-16, 2013, San Francisco, California, USA, 112-119.
A. Palczewska et al. (2014), Interpreting random forest classification models using a feature contribution method, in Integration of Reusable Systems, ser. Advances in Intelligent and Soft Computing, T. Bouabana-Tebibel and S. H. Rubin, Eds., Springer International Publishing, 263, 193-218.
randomForest, getLocalIncrements
#Multi-class classification
library(randomForest)
data(iris)
rF <- randomForest(x=iris[,-5],y=as.factor(as.character(iris[,5])),
                   ntree=25,importance=TRUE, keep.inbag=TRUE,replace=FALSE)
#Get local feature increments
li<-getLocalIncrements(rF, iris[,-5])
#Calculate feature contributions
fc<-featureContributions(rF, li, iris[,-5])
## Not run:
#Binary classification
library(randomForest)
data(ames)
ames_train<-ames[ames$Type=="Train",-c(1,3, ncol(ames))]
rF_Model <- randomForest(x=ames_train[,-1],y=as.factor(as.character(ames_train[,1])),
ntree=500,importance=TRUE, keep.inbag=TRUE,replace=FALSE)
li <- getLocalIncrements(rF_Model,ames_train[,-1])
fc<-featureContributions(rF_Model, li, ames_train[,-1])
## End(Not run)