rcRF: Constructs rcRF model
In kdoub5ha/rcITR: Risk Controlled ITR Discovery


Constructs a risk controlled random forest (rcRF) composed of rcDT predictors.

rcRF(data, split.var, efficacy = "y", risk = "r", col.trt = "trt",
  col.prtx = "prtx", risk.control = TRUE, risk.threshold = NA,
  lambda = 0, stabilize = TRUE, stabilize.type = c("linear", "rf"),
  test = NULL, ctg = NULL, N0 = 20, n0 = 5, max.depth = 10,
  ntree = 500, mtry = max(floor(length(split.var)/3), 1),
  avoid.nul.tree = FALSE, AIPWE = FALSE, verbose = FALSE,
  use.other.nodes = TRUE, extremeRandomized = FALSE, importance = FALSE,
  order.importances = TRUE)

`data`	data.frame. Data used to construct rcRF model. Must contain efficacy variable (y), risk variable (r), binary treatment indicator coded as 0 / 1 (trt), propensity score (prtx), and candidate splitting covariates.
`split.var`	numeric vector. Column indices of the splitting variables.
`efficacy`	char. Efficacy outcome column. Defaults to 'y'.
`risk`	char. Risk outcome column. Defaults to 'r'.
`col.trt`	char. Treatment column name. Defaults to 'trt'.
`risk.control`	logical. Should risk be controlled? Defaults to TRUE.
`risk.threshold`	numeric. Desired level of risk control.
`lambda`	numeric. Penalty parameter for risk scores. Defaults to 0, i.e. no constraint.
`stabilize`	logical indicating if efficacy should be modeled using residuals. Defaults to TRUE.
`stabilize.type`	character specifying method used for estimating residuals. Current options are 'linear' for linear model (default) and 'rf' for random forest.
`test`	data.frame of testing observations. Should be formatted the same as 'data'.
`ctg`	numeric vector corresponding to the categorical input columns. Defaults to NULL. Not available yet.
`N0`	numeric specifying minimum number of observations required to call a node terminal. Defaults to 20.
`n0`	numeric specifying minimum number of treatment/control observations needed in a split to declare a node terminal. Defaults to 5.
`max.depth`	numeric specifying maximum depth of the tree. Defaults to 10 levels.
`ntree`	numeric. Number of trees generated. Defaults to 500.
`mtry`	numeric specifying the number of randomly selected splitting variables to be included. Defaults to the larger of 1 and floor(length(split.var)/3).
`avoid.nul.tree`	logical. Should null trees be discarded?
`AIPWE`	logical. Should AIPWE (TRUE) or IPWE (FALSE) be used? Not available yet.
`verbose`	logical. Give updates about forest progression?
`use.other.nodes`	logical. Should the global estimator of the objective function be used? Defaults to TRUE.
`extremeRandomized`	logical. Experimental for randomly selecting cutpoints in a random forest model. Defaults to FALSE and users should change this at their own peril.
`importance`	logical. Indicates if variable importance measures should be estimated and returned. Defaults to FALSE.
`order.importances`	logical. Should importances be ordered (if requested)?
`col.prtx`	char. Propensity score column name. Defaults to 'prtx'.
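
The column requirements on `data` and the default for `mtry` can be illustrated with a small base R sketch. Everything in it is an assumption made for illustration: the simulated covariates `x1` to `x10`, the outcome models for `y` and `r`, and the use of logistic regression to fill in `prtx` are not taken from the package, which may expect the propensity score to be obtained differently.

```r
# Illustrative data layout only; the simulation model is hypothetical.
set.seed(1)
n <- 200
p <- 10

# Covariates occupy columns 1-10, so split.var = 1:10 points at them.
dat <- as.data.frame(matrix(rnorm(n * p), nrow = n))
names(dat) <- paste0("x", seq_len(p))

# Binary treatment indicator coded 0 / 1.
dat$trt <- rbinom(n, size = 1, prob = 0.5)

# Hypothetical efficacy (y) and risk (r) outcomes.
dat$y <- 1 + dat$x1 + dat$trt * dat$x2 + rnorm(n)
dat$r <- 2 + 0.5 * dat$trt + rnorm(n, sd = 0.5)

# Propensity score estimated by logistic regression (an assumption here).
ps_model <- glm(trt ~ x1 + x2, data = dat, family = binomial())
dat$prtx <- fitted(ps_model)

# Default mtry as written in the usage signature.
split.var <- 1:10
mtry_default <- max(floor(length(split.var) / 3), 1)
```

With ten splitting variables the signature's default evaluates to `floor(10 / 3)`, and the `max(..., 1)` guard only matters when fewer than three splitting variables are supplied.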

List of rcRF outputs

`ID.Boots.Samples`	list of bootstrap sample IDs
`TREES`	list of trees
`Model.Specification`	information about the input parameters of the forest
`...`	Summaries for in and out of bag samples
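
To make the relationship between bootstrap sample IDs and in-bag versus out-of-bag observations concrete, here is a minimal base R sketch. It assumes the usual random forest scheme of drawing n row IDs with replacement for each tree; how `rcRF` actually draws and stores `ID.Boots.Samples` is not stated above, and the names `boot_ids` and `oob_ids` are hypothetical.

```r
# Generic bootstrap / out-of-bag bookkeeping, not the package's internals.
set.seed(42)
n <- 20      # number of observations
ntree <- 3   # number of trees in this toy example

# One vector of in-bag row IDs per tree, sampled with replacement.
boot_ids <- lapply(seq_len(ntree), function(b) {
  sample.int(n, size = n, replace = TRUE)
})

# Out-of-bag rows for a tree are those never drawn into its bootstrap sample.
oob_ids <- lapply(boot_ids, function(ids) setdiff(seq_len(n), ids))

# Number of out-of-bag observations available to evaluate each tree.
oob_sizes <- vapply(oob_ids, length, integer(1))
```

Under this scheme, out-of-bag summaries for a tree are computed only on the rows in the corresponding element of `oob_ids`, which that tree never saw during fitting.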

set.seed(123)
dat <- generateData()
# Generates rcRF model using simulated data with splitting variables located in columns 1-10.
fit <- rcRF(data = dat, 
            split.var = 1:10, 
            ntree = 200,
            risk.threshold = 2.75, 
            lambda = 1)