rcDT: Constructs an rcDT model
In kdoub5ha/rcITR: Risk Controlled ITR Discovery


Constructs a risk controlled decision tree (rcDT) given an efficacy outcome and a risk outcome.

rcDT(data, split.var, test = NULL, ctg = NULL, efficacy = "y",
  risk = "r", col.trt = "trt", col.prtx = "prtx", risk.control = TRUE,
  risk.threshold = NA, lambda = 0, min.ndsz = 20, n0 = 5,
  stabilize = TRUE, stabilize.type = c("linear", "rf"),
  use.other.nodes = TRUE, mtry = length(split.var), max.depth = 15,
  AIPWE = FALSE, extremeRandomized = FALSE, print.summary = TRUE)

`data`	data.frame. Data used to construct rcDT model. Must contain efficacy variable (y), risk variable (r), binary treatment indicator coded as 0 / 1 (trt), propensity score (prtx), candidate splitting covariates (split.var).
`split.var`	numeric vector. Column indices of the splitting variables.
`test`	data.frame of testing observations. Should be formatted the same as 'data'.
`ctg`	numeric vector corresponding to the categorical input columns. Defaults to NULL. Not available yet.
`efficacy`	char. Efficacy outcome column. Assumes larger values are desirable. Defaults to 'y'.
`risk`	char. Risk outcome column. Assumes smaller values are desirable. Defaults to 'r'.
`col.trt`	char. Treatment indicator column name. Should be of form 0/1 or -1/+1.
`col.prtx`	char. Propensity score column name.
`risk.control`	logical. Should risk be controlled? Defaults to TRUE.
`risk.threshold`	numeric. Desired level of risk control.
`lambda`	numeric. Penalty parameter for risk scores. Defaults to 0, i.e. no constraint. The remaining arguments are optional.
`min.ndsz`	numeric specifying minimum number of observations required to call a node terminal. Defaults to 20.
`n0`	numeric specifying minimum number of treatment/control observations needed in a split to declare a node terminal. Defaults to 5.
`stabilize`	logical indicating if efficacy should be modeled using residuals. Defaults to TRUE.
`stabilize.type`	character specifying method used for estimating residuals. Current options are 'linear' for linear model (default) and 'rf' for random forest.
`use.other.nodes`	logical. Should the global estimator of the objective function be used? Defaults to TRUE.
`mtry`	numeric specifying the number of randomly selected splitting variables to be included. Defaults to number of splitting variables.
`max.depth`	numeric specifying maximum depth of the tree. Defaults to 15 levels.
`AIPWE`	logical. Should AIPWE (TRUE) or IPWE (FALSE) be used? Not available yet.
`extremeRandomized`	logical. Experimental for randomly selecting cutpoints in a random forest model. Defaults to FALSE and users should change this at their own peril.
`print.summary`	logical. Should a summary of the tree building be printed? Defaults to TRUE for single trees.
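
As a rough illustration of the input layout described in the argument list, the sketch below builds a data frame in base R that uses the default column names (`y`, `r`, `trt`, `prtx`) and places candidate splitting covariates in the leading columns. The simulated values, the covariate names `x1` to `x4`, and the helper `check_rcdt_input()` are hypothetical and not part of rcITR. The check that propensity scores lie strictly between 0 and 1 is an assumption, not a documented requirement.

```r
set.seed(1)
n <- 200

# Hypothetical covariates in columns 1-4 (candidate splitting variables).
dat <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n), x4 = runif(n))

# Binary treatment coded 0/1, randomized with known probability 0.5.
dat$trt  <- rbinom(n, size = 1, prob = 0.5)
dat$prtx <- rep(0.5, n)

# Simulated outcomes: larger y is desirable, smaller r is desirable.
dat$y <- 1 + dat$x1 + dat$trt * (dat$x2 - 0.5) + rnorm(n)
dat$r <- 2 + dat$trt * dat$x3 + rnorm(n, sd = 0.5)

# Hypothetical helper (not part of rcITR): verifies the documented layout.
check_rcdt_input <- function(data, split.var, efficacy = "y", risk = "r",
                             col.trt = "trt", col.prtx = "prtx") {
  needed <- c(efficacy, risk, col.trt, col.prtx)
  stopifnot(
    is.data.frame(data),
    all(needed %in% names(data)),
    # split.var holds column indices into data
    all(split.var %in% seq_len(ncol(data))),
    # treatment coded either 0/1 or -1/+1
    all(data[[col.trt]] %in% c(0, 1)) || all(data[[col.trt]] %in% c(-1, 1)),
    # assumption: propensity scores strictly inside (0, 1)
    all(data[[col.prtx]] > 0 & data[[col.prtx]] < 1)
  )
  invisible(TRUE)
}

ok <- check_rcdt_input(dat, split.var = 1:4)
```

A data frame that passes these checks could then be supplied as `data`, with `split.var = 1:4` pointing at the covariate columns.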

Summary of rcDT model

`tree`	data.frame describing the fitted tree. Each 'node' label begins with "0" for the root node, followed by a "1" or "2" for each subsequent split, indicating the less-than (left) or greater-than (right) child node. The number of observations 'size', number treated 'n.1', number on control 'n.0', and treatment effect 'trt.effect' are also provided. The splitting information includes the column of the chosen splitting variable 'var', the variable name 'vname', the direction the treatment is sent 'cut.1' ("r" for right child node, "l" for left), the chosen split value 'cut.2', and the estimated value function 'score'.
`y`	efficacy values used in modeling. Will likely differ from the original input 'y' if stabilization was used.
`risk.threshold`	value of risk control used
`data`	input dataset
`fit.y`	fitted model for residuals if 'stabilize' was used
`split.var`	splitting covariates used
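
The node labelling scheme described for `tree` can be decoded with a few lines of base R. The sketch below works on a hand-written vector of labels rather than on real rcDT output, so the labels themselves are illustrative. It relies only on the stated convention that the root is "0" and each further character is "1" (left) or "2" (right).

```r
# Mock node labels in the documented format (illustrative values only).
nodes <- c("0", "01", "02", "011", "012")

# Depth: the root "0" has depth 0; each extra character is one level deeper.
depth <- nchar(nodes) - 1

# Parent label: drop the last character (the root has no parent).
parent <- ifelse(nodes == "0", NA_character_,
                 substr(nodes, 1, nchar(nodes) - 1))

# Side of the split: trailing "1" = left (less than), "2" = right (greater than).
last <- substr(nodes, nchar(nodes), nchar(nodes))
side <- ifelse(nodes == "0", "root", ifelse(last == "1", "left", "right"))

# A node is terminal when no other node names it as its parent.
terminal <- !(nodes %in% parent)

decoded <- data.frame(node = nodes, depth = depth, parent = parent,
                      side = side, terminal = terminal)
print(decoded)
```

Applied to the `node` column of a fitted tree, the same logic would identify the terminal nodes without relying on how missing split information is encoded.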

set.seed(123)
dat <- generateData()
# Builds a tree from simulated EMR data, using columns 1-10 as candidate splitting variables.
tree <- rcDT(data = dat, 
             split.var = 1:10, 
             risk.threshold = 2.75, 
             lambda = 1)