Using repeated calls to iRF::randomForest, this function iteratively
grows weighted ensembles of decision trees. Optionally, it returns stable
feature interactions for selected iterations by analyzing feature usage
on the decision paths of large leaf nodes. For details on the iRF
algorithm, see https://arxiv.org/abs/1706.08457.
iRF(x, y, xtest=NULL, ytest=NULL,
    n.iter=5,
    ntree=500,
    n.core=1,
    mtry.select.prob = rep(1/ncol(x), ncol(x)),
    keep.impvar.quantile=NULL,
    interactions.return=NULL,
    wt.pred.accuracy=FALSE,
    cutoff.unimp.feature = 0,
    rit.param=list(depth=5, ntree=100, nchild=2,
                   class.id=1, class.cut=NULL),
    varnames.grp=NULL,
    n.bootstrap=30,
    bootstrap.forest=TRUE,
    verbose=TRUE,
    ...
)
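A minimal sketch of a call using the signature above may help. The simulated data, the variable names (`x`, `y`, `fit`), and the specific settings are illustrative assumptions rather than recommended values, and the call runs only if an iRF version matching this signature is installed.

```r
# Simulated data (illustrative only): the response depends on an
# interaction between the first two predictors.
set.seed(1)
n <- 200
p <- 10
x <- matrix(rnorm(n * p), nrow = n,
            dimnames = list(NULL, paste0("X", 1:p)))
y <- factor(as.integer(x[, 1] > 0 & x[, 2] > 0))

# Run only if the iRF package is available; argument names are taken
# from the usage shown above.
if (requireNamespace("iRF", quietly = TRUE)) {
  fit <- iRF::iRF(x, y,
                  n.iter = 3,
                  ntree = 100,
                  interactions.return = 3,  # evaluate interactions at iteration 3 only
                  n.bootstrap = 5,          # fewer replicates to keep the run short
                  verbose = FALSE)
  length(fit$rf.list)      # one randomForest object per iteration
  fit$interaction[[3]]     # stability scores for iteration 3
}
```

Restricting `interactions.return` to a single late iteration and lowering `n.bootstrap` keeps this toy run short; real analyses would likely use larger values.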
x, xtest |
numeric matrices of predictors |
y, ytest |
response vectors |
n.iter |
number of weighted random forest iterations |
ntree |
number of trees to grow in each iteration |
n.core |
number of cores across which tree growing and reading should be distributed |
mtry.select.prob |
initial weights specified for first random forest fit, defaults to equal weights |
keep.impvar.quantile |
a non-negative fraction q. If provided, all variables with Gini importance in the top 100*q percentile are retained during random splitting variable selection in the next iteration |
interactions.return |
a numeric vector specifying which iterations to
evaluate interactions for. Note: interaction computation is
computationally intensive particularly when |
wt.pred.accuracy |
Should leaf nodes be sampled proportional to both size
and accuracy ( |
cutoff.unimp.feature |
a non-negative fraction r. If provided, only features with Gini importance score in the top 100*(1-r) percentile are used to find feature interactions |
rit.param |
a named list, containing entries to specify
|
varnames.grp |
If features can be grouped based on their demographics or correlation patterns, use the groups of features, or “hyper-features”, to conduct random intersection trees |
n.bootstrap |
Number of bootstrap replicates used to calculate stability scores of interactions obtained by RIT |
bootstrap.forest |
Should a new Random Forest be constructed for each bootstrap sample to evaluate stability? Setting to FALSE results in faster runtime. |
verbose |
Display progress messages and intermediate outputs on screen? |
... |
additional arguments passed to iRF::randomForest |
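The two percentile arguments, keep.impvar.quantile and cutoff.unimp.feature, can be illustrated with a small base R sketch. This is one reading of the descriptions above, not the package's internal code: the importance values, the variable names, and the use of `quantile()` with its default interpolation are all assumptions made for illustration.

```r
# Hypothetical Gini importances for six features (illustrative values).
gini <- c(f1 = 0.40, f2 = 0.05, f3 = 0.25, f4 = 0.10, f5 = 0.15, f6 = 0.05)

# keep.impvar.quantile = q: retain features in the top 100*q percent.
q <- 0.5
keep <- gini >= quantile(gini, probs = 1 - q, names = FALSE)
kept.features <- names(gini)[keep]

# Assumed follow-up: zero out dropped features and renormalize so the
# weights could serve as selection probabilities in the next iteration.
weights <- ifelse(keep, gini, 0)
weights <- weights / sum(weights)

# cutoff.unimp.feature = r: use only features in the top 100*(1-r)
# percent when searching for interactions.
r <- 0.25
used <- gini >= quantile(gini, probs = r, names = FALSE)
interaction.features <- names(gini)[used]
```

Under this reading, q counts the fraction of features kept from the top, while r counts the fraction discarded from the bottom, so the default `cutoff.unimp.feature = 0` keeps every feature.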
A list containing the following items:
rf.list |
A list of n.iter objects of the class randomForest |
interaction |
A list of length n.iter. Each element of the list contains a named numeric vector of stability scores, where the names are candidate interactions (feature names separated by "_"), defined as frequently appearing features and feature combinations on the decision paths of large leaf nodes |
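Because interaction names join feature names with "_", they can be split back into their component features with base R. The sketch below uses a made-up stability vector (`stab`) and an arbitrary 0.5 threshold, both assumptions for illustration, and it presumes that no feature name itself contains an underscore.

```r
# Hypothetical stability scores shaped like one element of `interaction`.
stab <- c("X1_X2" = 0.9, "X3" = 0.6, "X1_X2_X5" = 0.4)

# Split each interaction name into its component feature names.
features <- strsplit(names(stab), "_", fixed = TRUE)
names(features) <- names(stab)

# Interaction order = number of features involved.
int.order <- lengths(features)

# Keep genuine interactions (order >= 2) above an arbitrary threshold.
stable <- names(stab)[stab >= 0.5 & int.order >= 2]
```

If feature names do contain underscores, the split becomes ambiguous, so renaming the columns of `x` before fitting is the safer route.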
Sumanta Basu sumbose@berkeley.edu, Karl Kumbier kkumbier@berkeley.edu