Performs k-fold cross-validation for an rcDT model to select the best subtree from the set of optimally pruned subtrees generated by the 'prune' function.
treeCV(dat, split.var, N0 = 20, n0 = 5, efficacy = "y", risk = "r",
       col.trt = "trt", col.prtx = "prtx", lambda = 0, risk.control = FALSE,
       risk.threshold = NA, nfolds = 10, AIPWE = FALSE, sort = TRUE,
       ctgs = NA, stabilize.type = c("linear", "rf"), stabilize = TRUE,
       use.other.nodes = TRUE, use.bootstrap = FALSE,
       extremeRandomized = FALSE)
dat |
data.frame. Data used to construct the rcDT model. Must contain the efficacy variable (y), the risk variable (r), a binary treatment indicator coded as 0 / 1 (trt), the propensity score (prtx), and the candidate splitting covariates. |
split.var |
numeric vector. Column indices of the splitting variables. |
N0 |
numeric specifying minimum number of observations required to call a node terminal. Defaults to 20. |
n0 |
numeric specifying minimum number of treatment/control observations needed in a split to declare a node terminal. Defaults to 5. |
efficacy |
char. Efficacy outcome column. Defaults to 'y'. |
risk |
char. Risk outcome column. Defaults to 'r'. |
col.trt |
char. Treatment indicator column name. Values should be coded as 0/1 or -1/+1. Defaults to 'trt'. |
col.prtx |
char. Propensity score column name. Defaults to 'prtx'. |
lambda |
numeric. Penalty parameter for risk scores. Defaults to 0, i.e. no constraint. |
risk.control |
logical. Should risk be controlled? Defaults to FALSE. |
risk.threshold |
numeric. Desired level of risk control. Defaults to NA. |
AIPWE |
logical. Should AIPWE (TRUE) or IPWE (FALSE) be used. Not available yet. |
sort |
internal use. |
stabilize.type |
character specifying method used for estimating residuals. Current options are 'linear' for linear model (default) and 'rf' for random forest. |
stabilize |
logical indicating if efficacy should be modeled using residuals. Defaults to TRUE. |
use.other.nodes |
logical. Should global estimator of objective function be used. Defaults to TRUE. |
use.bootstrap |
logical. Should a bootstrap resampling be done? Defaults to FALSE. |
extremeRandomized |
logical. Experimental option for randomly selecting cutpoints in a random forest model. Defaults to FALSE; change it with caution. |
test |
data.frame of testing observations. Should be formatted the same as 'data'. |
max.depth |
numeric specifying maximum depth of the tree. Defaults to 15 levels. |
mtry |
numeric specifying the number of randomly selected splitting variables to be included. Defaults to number of splitting variables. |
ctg |
numeric vector corresponding to the categorical input columns. Defaults to NULL. Not available yet. |
A summary of the cross-validation, including the optimal penalty parameter and the optimal model, with the following components:
best.tree.size |
optimal rcDT model based on size |
best.tree.alpha |
optimal rcDT model based on alpha parameter selection |
best.alpha |
optimal lambda parameter selected from the cross validation procedure |
full.tree |
unpruned tree |
pruned.tree |
output from pruning of 'full.tree' |
data |
input data |
details |
summary of model performance |
subtrees |
sequence of optimally pruned subtrees of 'full.tree' |
in.train |
training samples from splits |
in.test |
testing samples from splits |
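The 'in.train' and 'in.test' components describe which observations fall in the training and testing part of each fold. The sketch below shows, in base R only, one common way such index sets can be formed for k-fold cross-validation. It is an illustration under assumptions, not the package's implementation: `make_folds` is a hypothetical helper, and the list-of-index-vectors layout is assumed rather than taken from the package.

```r
# Hypothetical helper (not part of the package): assign n rows to nfolds folds
# and return, for each fold, the training and testing row indices.
make_folds <- function(n, nfolds = 10) {
  # Balanced fold labels 1..nfolds, shuffled across the n rows
  fold_id <- sample(rep(seq_len(nfolds), length.out = n))
  in.test  <- lapply(seq_len(nfolds), function(k) which(fold_id == k))
  in.train <- lapply(seq_len(nfolds), function(k) which(fold_id != k))
  list(in.train = in.train, in.test = in.test)
}

set.seed(1)
folds <- make_folds(n = 100, nfolds = 5)

# Every row is held out exactly once across the folds
all_test <- sort(unlist(folds$in.test))
length(all_test)
```

In a cross-validation loop, a model would be fitted on the rows in `folds$in.train[[k]]` and evaluated on the rows in `folds$in.test[[k]]` for each fold `k`.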
# Grow large tree
set.seed(1)
dat <- generateData()
fit <- treeCV(dat, split.var = 1:10, nfolds = 5, lambda = 1,
              risk.control = TRUE, risk.threshold = 2.75)
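The example relies on the package's own data generator. When assembling 'dat' by hand, the layout described under Arguments can be sketched as follows. This is a synthetic illustration only: the covariate names `x1` to `x10`, the outcome models, and the constant propensity of 0.5 are assumptions made for the sketch, not properties of the package or of `generateData()`.

```r
set.seed(1)
n <- 200

# Ten candidate splitting covariates, placed in columns 1:10 (assumed layout)
X <- as.data.frame(matrix(rnorm(n * 10), nrow = n))
names(X) <- paste0("x", 1:10)

prtx <- rep(0.5, n)                       # known propensity, as in a 1:1 randomized trial
trt  <- rbinom(n, size = 1, prob = prtx)  # binary treatment indicator coded 0/1

# Synthetic outcomes, invented purely for illustration
y <- 1 + X$x1 + trt * (X$x2 > 0) + rnorm(n)   # efficacy
r <- 2 + 0.5 * trt + rnorm(n, sd = 0.5)       # risk

# Column names match the defaults efficacy = "y", risk = "r",
# col.trt = "trt", col.prtx = "prtx"
my_dat <- data.frame(X, y = y, r = r, trt = trt, prtx = prtx)
str(my_dat)
```

A frame built this way uses the default column names, so it could presumably be passed as `treeCV(my_dat, split.var = 1:10)` without setting `efficacy`, `risk`, `col.trt`, or `col.prtx`.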