Internal C++ function to compute feature contributions for a random forest

```
recTree(vars, obs, ntree, calculate_node_pred, X, Y, leftDaughter,
        rightDaughter, nodestatus, xbestsplit, nodepred, bestvar,
        inbag, varLevels, OOBtimes, localIncrements)
```

| Argument | Description |
|---|---|
| `vars` | number of variables in X |
| `obs` | number of observations in X |
| `ntree` | number of trees, counted from the first, that the function should iterate over; cannot be higher than the number of columns of inbag |
| `calculate_node_pred` | should the node predictions be recalculated (true) or reused from the nodepred matrix (false, regression only) |
| `X` | X training matrix |
| `Y` | target vector, factor or regression |
| `leftDaughter` | a matrix from the output of randomForest, rfo$forest$leftDaughter: the node number (row number) of the left daughter of each node, one tree per column |
| `rightDaughter` | a matrix from the output of randomForest, rfo$forest$rightDaughter: the node number (row number) of the right daughter of each node, one tree per column |
| `nodestatus` | a matrix from the output of randomForest, rfo$forest$nodestatus: the status of a given node in a given tree |
| `xbestsplit` | a matrix from the output of randomForest, rfo$forest$xbestsplit |
| `nodepred` | a matrix from the output of randomForest, rfo$forest$nodepred: the inbag target average for regression mode and the majority target class for classification |
| `bestvar` | a matrix from the output of randomForest, rfo$forest$bestvar: the variable used to split a given node in a given tree |
| `inbag` | a matrix from the output of randomForest, rfo$inbag, for regression |
| `varLevels` | the number of levels of all variables: 1 for continuous and multinomial variables, >1 for categorical variables. This is needed to interpret the binary split in xbestsplit for categorical variables. |
| `OOBtimes` | number of times a given observation was out-of-bag in the forest. Needed to compute feature contributions, as they are the sum of local increments over out-of-bag observations, per feature, divided by OOBtimes. In the previous implementation, feature contributions were summed over all observations and divided by ntree. |
| `localIncrements` | an empty matrix to store local increments during computation. At the end, the localIncrements matrix holds the feature contributions. |
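
The way these arguments fit together can be sketched as follows. This is only an illustration, not the package source: the `Tree` struct, `goesLeft` and `addLocalIncrements` are hypothetical names, a single tree is stored as one column of each forest matrix, node numbers are 1-based as in R, a `nodestatus` of -1 is assumed to mark a terminal node, and a categorical split is assumed to be an integer whose bit `level - 1` is set when that level goes to the left daughter.

```cpp
#include <cassert>
#include <vector>

// Hypothetical single-tree layout: one column of each forest matrix,
// indexed by node, with 1-based node numbers as stored by R.
struct Tree {
    std::vector<int> leftDaughter, rightDaughter, nodestatus, bestvar;
    std::vector<double> xbestsplit, nodepred;
};

// Decide the branch taken at a node.
// varLevel == 1: numeric split, values <= split go left.
// varLevel  > 1: categorical split, assumed to be a bit mask over levels.
bool goesLeft(double x, double split, int varLevel) {
    if (varLevel == 1) return x <= split;
    const int level = static_cast<int>(x);  // levels are coded 1..varLevel
    assert(level >= 1 && level <= varLevel);
    const unsigned long mask = static_cast<unsigned long>(split);
    return ((mask >> (level - 1)) & 1UL) != 0;
}

// Walk one observation down one tree. At every split, the change in node
// prediction from parent to child is credited to the splitting variable.
// The result is accumulated in place, like localIncrements in recTree.
void addLocalIncrements(const Tree& tree,
                        const std::vector<double>& x,
                        const std::vector<int>& varLevels,
                        std::vector<double>& increments) {
    int node = 1;                                // root node
    while (tree.nodestatus[node - 1] != -1) {    // assumed: -1 is terminal
        const int var = tree.bestvar[node - 1];  // 1-based variable index
        const bool left = goesLeft(x[var - 1],
                                   tree.xbestsplit[node - 1],
                                   varLevels[var - 1]);
        const int child = left ? tree.leftDaughter[node - 1]
                               : tree.rightDaughter[node - 1];
        increments[var - 1] += tree.nodepred[child - 1] - tree.nodepred[node - 1];
        node = child;
    }
}
```

Calling `addLocalIncrements` once per tree for an observation, with the same `increments` vector, accumulates that observation's row of local increments without returning anything.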

This function is executed by the function forestFloor.

This is a C++/Rcpp implementation for computing feature contributions. The main difference between this implementation and the rfFC package is that these feature contributions are summed only over out-of-bag samples, which gives a form of cross-validation. This implementation allows sampling with replacement but, unlike rfFC, does not support more than binary classification.
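
The out-of-bag averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package code: `oobFeatureContributions`, `Matrix` and the `perTree` container are hypothetical, the per-tree local increments are assumed to be precomputed, and an observation that is never out-of-bag is simply left at zero.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // [observation][variable]

// perTree[t][i][j]: hypothetical local increment of variable j for
//                   observation i in tree t.
// inbag[i][t]:      number of times observation i was drawn for tree t;
//                   0 means the observation is out-of-bag for that tree.
Matrix oobFeatureContributions(const std::vector<Matrix>& perTree,
                               const std::vector<std::vector<int>>& inbag,
                               std::size_t nVars) {
    const std::size_t nObs = inbag.size();
    Matrix contributions(nObs, std::vector<double>(nVars, 0.0));
    for (std::size_t i = 0; i < nObs; ++i) {
        int oobTimes = 0;  // plays the role of OOBtimes for observation i
        for (std::size_t t = 0; t < perTree.size(); ++t) {
            if (inbag[i][t] != 0) continue;  // skip trees that trained on i
            ++oobTimes;
            for (std::size_t j = 0; j < nVars; ++j)
                contributions[i][j] += perTree[t][i][j];
        }
        // Divide by the number of out-of-bag trees, not by ntree.
        if (oobTimes > 0)
            for (std::size_t j = 0; j < nVars; ++j)
                contributions[i][j] /= oobTimes;
    }
    return contributions;
}
```

Dividing by the per-observation out-of-bag count rather than by the total number of trees is what distinguishes this averaging from summing over all trees.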

No output; the feature contributions are written directly to the localIncrements input.

Soren Havelund Welling

Interpretation of QSAR Models Based on Random Forest Methods, http://dx.doi.org/10.1002/minf.201000173

Interpreting random forest classification models using a feature contribution method, http://arxiv.org/abs/1312.1121

```
## Not run:
rm(list=ls())
library(randomForest)
library(forestFloor)
#simulate data
obs = 2500
vars = 6
X = data.frame(replicate(vars,rnorm(obs)))
Y = with(X, X1^2 + sin(X2*pi) + 2 * X3 * X4 + 1 * rnorm(obs))
#grow a forest, remember to include inbag
rfo = randomForest(X,Y,keep.inbag = TRUE,sampsize=1500,ntree=500)
#compute topology, recTree is executed within forestFloor.
#See the source code of the forestFloor function for more details.
ff = forestFloor(rfo,X)
#print forestFloor
print(ff)
#plot partial functions of most important variables first
plot(ff)
## End(Not run)
```
