Internal C++ functions to compute feature contributions for a random forest.
recTree(vars, obs, ntree, calculate_node_pred, X, Y, majorityTerminal, leftDaughter,
        rightDaughter, nodestatus, xbestsplit, nodepred, bestvar,
        inbag, varLevels, OOBtimes, localIncrements)
multiTree(vars, obs, ntree, nClasses, X, Y, majorityTerminal, leftDaughter,
          rightDaughter, nodestatus, xbestsplit, nodepred, bestvar,
          inbag, varLevels, OOBtimes, localIncrements)
vars |
number of variables in X |
obs |
number of observations in X |
ntree |
number of trees, counting from the first, that the function should iterate over; cannot exceed the number of columns of inbag |
nClasses |
number of classes in classification forest |
calculate_node_pred |
should the node predictions be recalculated (true) or reused from the nodepred matrix (false, for regression) |
X |
X training matrix |
Y |
target vector, factor or regression |
majorityTerminal |
bool, majority vote in terminal nodes? Default is FALSE for regression. Set to TRUE only when binary_reg=TRUE. |
leftDaughter |
a matrix from the output of randomForest, rf$forest$leftDaughter: the node number (row number) of the left daughter of each node in a given tree, one tree per column |
rightDaughter |
a matrix from the output of randomForest, rf$forest$rightDaughter: the node number (row number) of the right daughter of each node in a given tree, one tree per column |
nodestatus |
a matrix from the output of randomForest, rf$forest$nodestatus: the status of a given node in a given tree |
xbestsplit |
a matrix from the output of randomForest, rf$forest$xbestsplit: the split point for numeric variables or the binary-encoded split for categorical variables. See the help file of randomForest::getTree for details of the binary expansion for categorical splits. |
nodepred |
a matrix from the output of randomForest, rf$forest$nodepred: the inbag target average for regression mode and the majority target class for classification |
bestvar |
a matrix from the output of randomForest, rf$forest$bestvar: the index of the variable used to split a given node in a given tree |
inbag |
a matrix from the output of randomForest, rf$inbag. Contains counts of how many times a sample was selected for a given tree. |
varLevels |
the number of levels of all variables, 1 for continuous or discrete, >1 for categorical variables. This is needed for categorical variables to interpret binary split from xbestsplit. |
OOBtimes |
number of times a given observation was out-of-bag in the forest. Needed to compute cross-validated feature contributions, which are the local increments of each feature summed over the trees where the observation was out-of-bag, divided by this number. In the previous implementation (rfFC) and the articles (see references), feature contributions are summed over all observations and divided by the number of trees. |
localIncrements |
an empty matrix to store local increments during computation. When the C++ function returns, the input localIncrements matrix contains the feature contributions. |
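The xbestsplit and varLevels arguments above refer to the binary expansion of categorical splits. A minimal C++ sketch of that decoding follows, using the convention described in the randomForest::getTree help, where bit k-1 of the integer split code is set when category k is sent to the left daughter. The function name goesLeft and its signature are hypothetical and are not part of this package; how recTree and multiTree apply the code internally is not shown here. The sketch restricts itself to 53 levels because a double holds integers exactly only up to 2^53.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the package.
// xbestsplit : split code stored as a double, as in rf$forest$xbestsplit
// level      : 1-based category of the observation for the split variable
// nLevels    : number of levels of that variable (the varLevels entry, > 1)
// Returns true when the category is sent to the left daughter, assuming
// the getTree convention that a set bit means "go left".
bool goesLeft(double xbestsplit, int level, int nLevels) {
    assert(nLevels > 1 && nLevels <= 53);
    assert(level >= 1 && level <= nLevels);
    const std::uint64_t code = static_cast<std::uint64_t>(xbestsplit);
    return ((code >> (level - 1)) & 1u) != 0;
}
```

For a variable with four levels and a split code of 13 = 1*2^0 + 0*2^1 + 1*2^2 + 1*2^3, categories 1, 3 and 4 go left and category 2 goes right.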
This function is executed by the function forestFloor. It is a C++/Rcpp implementation for computing feature contributions. The main difference between this implementation and the rfFC package (R-Forge) is that these feature contributions are summed only over out-of-bag samples, which yields a cross-validation. This implementation also supports sampling with replacement, and binary and multi-class classification.
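The out-of-bag averaging described above can be illustrated with a simplified regression sketch. It is not the package's implementation: the Node layout, the 0-based indexing, the rule that x[var] <= split goes left, and the names addLocalIncrements and oobContributions are all assumptions made for illustration. For one observation, each tree in which the observation is out-of-bag adds the change in node prediction at every split to the variable that made the split, and the totals are divided by the number of such trees (the role OOBtimes plays in the arguments).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified tree layout (not the randomForest matrices).
struct Node {
    int left;      // index of left daughter, -1 for a terminal node
    int right;     // index of right daughter, -1 for a terminal node
    int var;       // 0-based index of the split variable (unused if terminal)
    double split;  // numeric split point; x[var] <= split goes left (assumed)
    double pred;   // node prediction, e.g. mean target of the node
};
using Tree = std::vector<Node>;

// Walk one observation down one tree and credit each change in node
// prediction to the variable that was split on.
void addLocalIncrements(const Tree& tree, const std::vector<double>& x,
                        std::vector<double>& contrib) {
    int node = 0;
    while (tree[node].left >= 0) {
        const Node& parent = tree[node];
        const int child = (x[parent.var] <= parent.split) ? parent.left
                                                          : parent.right;
        contrib[parent.var] += tree[child].pred - parent.pred;
        node = child;
    }
}

// Feature contributions of one observation: local increments summed over
// the trees where the observation was out-of-bag (inbag count of zero),
// divided by the number of such trees.
std::vector<double> oobContributions(const std::vector<Tree>& forest,
                                     const std::vector<int>& inbagCounts,
                                     const std::vector<double>& x,
                                     std::size_t nVars) {
    assert(inbagCounts.size() == forest.size());
    std::vector<double> contrib(nVars, 0.0);
    int oobTimes = 0;
    for (std::size_t t = 0; t < forest.size(); ++t) {
        if (inbagCounts[t] != 0) continue;  // skip trees that trained on x
        addLocalIncrements(forest[t], x, contrib);
        ++oobTimes;
    }
    if (oobTimes > 0) {
        for (double& c : contrib) c /= oobTimes;
    }
    return contrib;
}
```

Dividing by the out-of-bag count rather than by the total number of trees is what distinguishes this scheme from summing over every tree, as the description above notes for the earlier rfFC approach.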
No output; the feature contributions are written directly to the localIncrements input.
Soren Havelund Welling
Interpretation of QSAR Models Based on Random Forest Methods, http://dx.doi.org/10.1002/minf.201000173
Interpreting random forest classification models using a feature contribution method, http://arxiv.org/abs/1312.1121