Using repeated calls to iRF::randomForest, this function
iteratively grows weighted ensembles of decision trees. Optionally,
for selected iterations, it returns stable feature interactions by analyzing feature
usage on decision paths of large leaf nodes. Implemented only for
binary classification with numeric predictors and a response taking values in {0, 1}.
iRF(x, y, xtest=NULL, ytest=NULL,
n.iter=5,
ntree=500,
n.core=1,
mtry.select.prob = rep(1/ncol(x), ncol(x)),
keep.impvar.quantile=NULL,
interactions.return=NULL,
wt.pred.accuracy=FALSE,
cutoff.unimp.feature = 0,
rit.param=list(depth=5, ntree=100, nchild=2,
class.cut=NULL, class.id=1),
varnames.grp=NULL,
n.bootstrap=30,
bootstrap.forest=TRUE,
verbose=TRUE, ...
)
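
A minimal sketch of a call, assuming the installed iRF version exposes the arguments shown in the usage above (the guard on `formals()` is there because argument names may differ between releases). The simulated data and the names `n`, `p`, `train`, `test`, and `fit` are illustrative only.

```r
# Simulated data: the response depends on an interaction between X1 and X2.
set.seed(1)
n <- 200
p <- 10
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(x) <- paste0("X", seq_len(p))
y <- factor(as.integer(x[, 1] > 0 & x[, 2] > 0), levels = c(0, 1))

# Hold out the last 50 rows as a test set.
train <- seq_len(150)
test <- setdiff(seq_len(n), train)

# Only attempt the fit when iRF is installed AND its signature contains the
# arguments documented here (an assumption about version differences).
has_irf <- requireNamespace("iRF", quietly = TRUE) &&
  all(c("interactions.return", "n.bootstrap") %in% names(formals(iRF::iRF)))

if (has_irf) {
  fit <- iRF::iRF(
    x = x[train, ], y = y[train],
    xtest = x[test, ], ytest = y[test],
    n.iter = 3, ntree = 100,
    interactions.return = 3,  # compute interactions for the last iteration only
    n.bootstrap = 10,
    verbose = FALSE
  )
  str(fit$interaction[[3]])
}
```

Restricting `interactions.return` to a single iteration and lowering `n.bootstrap` keeps the sketch quick to run; the defaults in the usage above are more expensive.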

x, xtest 
numeric matrices of predictors 
y, ytest 
factor with two levels: 0, 1 
n.iter 
number of weighted random forest fits 
ntree 
number of trees to grow in each iteration 
n.core 
number of cores across which tree growing should be distributed 
mtry.select.prob 
initial weights specified for first random forest fit, defaults to equal weights 
keep.impvar.quantile 
a nonnegative fraction q. If provided, all the variables with Gini importance in the top 100*q percentile are retained during random splitting variable selection in the next iteration 
interactions.return 
a numeric vector specifying which iterations to
calculate interactions for. Note: interaction computation is
computationally intensive particularly when 
wt.pred.accuracy 
Should leaf nodes be sampled proportional to both size and decrease in variability of responses? 
cutoff.unimp.feature 
a nonnegative fraction r. If provided, only features with Gini importance score in the top 100*(1-r) percentile are used to find feature interactions 
class.id 
Which class of observations will be used to find class-specific interactions? Choose 0 or 1. The default is 1. 
rit.param 
a named list, containing entries to specify

class.id
Which class of observations will be used to find
class-specific interactions? Choose 0 or 1. The default is 1.
Ignored if regression forest.
varnames.grp 
If features can be grouped based on their demographics or correlation patterns, use the groups of features, or “hyper-features”, to conduct random intersection trees 
n.bootstrap 
Number of bootstrap replicates used to calculate stability scores of interactions obtained by RIT 
bootstrap.forest 
Should a new Random Forest be constructed for each bootstrap sample to evaluate stability? Setting to FALSE results in faster runtime. 
verbose 
Display progress messages and intermediate outputs on screen? 
... 
additional arguments passed to iRF::randomForest 
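
The requirements on `y` and `mtry.select.prob` described above can be sketched in base R. The labels, the `prior` scores, and all variable names here are hypothetical. Whether iRF rescales the weights internally is not stated in this page, so the sketch normalizes them explicitly.

```r
# Hypothetical raw labels, recoded to a two-level factor with levels 0 and 1.
labels <- c("case", "control", "case", "control", "control")
y <- factor(as.integer(labels == "case"), levels = c(0, 1))

# Hypothetical numeric predictor matrix with 4 named columns.
x <- matrix(rnorm(5 * 4), nrow = 5, dimnames = list(NULL, paste0("X", 1:4)))

# Default behaviour: equal selection weight for every feature.
w.equal <- rep(1 / ncol(x), ncol(x))

# Alternative: up-weight features using hypothetical prior scores,
# reordered to match the columns of x and normalized to sum to 1.
prior <- c(X1 = 4, X2 = 2, X3 = 1, X4 = 1)
w.prior <- prior[colnames(x)] / sum(prior)
```

Either vector could then be supplied as `mtry.select.prob`; indexing `prior` by `colnames(x)` keeps the weights aligned with the column order of `x`.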
A list containing the following items:
rf.list 
A list of n.iter objects of the class randomForest 
interaction 
A list of length n.iter. Each element of the list contains a named numeric vector of stability scores, where the names are candidate interactions (feature names separated by "_"), defined as frequently appearing features and feature combinations on the decision paths of large leaf nodes 
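
As a sketch of how one element of `interaction` might be inspected, the following uses a made-up named vector in the shape described above rather than real iRF output. The object names and the 0.5 threshold are arbitrary choices for illustration.

```r
# Hypothetical stability scores: names are feature names joined by "_".
stab <- c(X1_X2 = 0.93, X3 = 0.40, X1_X2_X5 = 0.67, X4_X7 = 0.10)

# Keep candidates with a stability score of at least 0.5, highest first.
stable <- sort(stab[stab >= 0.5], decreasing = TRUE)

# Split each name back into its component features.
features <- strsplit(names(stable), "_", fixed = TRUE)
names(features) <- names(stable)

# Interaction order = number of features involved.
ord <- lengths(features)
print(data.frame(interaction = names(stable), order = ord, stability = stable,
                 row.names = NULL))
```

Splitting on "_" is only unambiguous when the feature names themselves contain no underscores, so it may be worth renaming columns of `x` before fitting.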
