rcRF: Constructs rcRF model


View source: R/rcRF.R

Description

Constructs a risk-controlled random forest (rcRF) composed of rcDT predictors.

Usage

rcRF(data, split.var, efficacy = "y", risk = "r", col.trt = "trt",
  col.prtx = "prtx", risk.control = TRUE, risk.threshold = NA,
  lambda = 0, stabilize = TRUE, stabilize.type = c("linear", "rf"),
  test = NULL, ctg = NULL, N0 = 20, n0 = 5, max.depth = 10,
  ntree = 500, mtry = max(floor(length(split.var)/3), 1),
  avoid.nul.tree = FALSE, AIPWE = FALSE, verbose = FALSE,
  use.other.nodes = TRUE, extremeRandomized = FALSE, importance = FALSE,
  order.importances = TRUE)
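
The default for mtry in the signature above depends on the number of splitting variables. A quick check in base R, using hypothetical split.var vectors, shows how the expression evaluates:

```r
# Default mtry from the usage signature: max(floor(length(split.var)/3), 1).
# The split.var vectors passed below are hypothetical examples.
default_mtry <- function(split.var) {
  max(floor(length(split.var) / 3), 1)
}

default_mtry(1:10)  # 10 variables: floor(10/3) = 3
default_mtry(1:2)   # 2 variables: floor(2/3) = 0, so the lower bound of 1 applies
```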

Arguments

data

data.frame. Data used to construct the rcRF model. Must contain an efficacy variable (y), a risk variable (r), a binary treatment indicator coded as 0 / 1 (trt), a propensity score (prtx), and the candidate splitting covariates.

split.var

numeric vector. Column indices of the splitting variables.

efficacy

char. Efficacy outcome column. Defaults to 'y'.

risk

char. Risk outcome column. Defaults to 'r'.

col.trt

char. Treatment column name. Defaults to 'trt'.

risk.control

logical. Should risk be controlled? Defaults to TRUE.

risk.threshold

numeric. Desired level of risk control. Defaults to NA.

lambda

numeric. Penalty parameter for risk scores. Defaults to 0, i.e. no constraint.
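
As an illustration of the layout described for data, the sketch below builds a toy data frame in base R using the default column names. The simulation model, sample size, and covariate names (x1 to x3) are assumptions made for illustration; they are not taken from generateData() or from the package.

```r
set.seed(1)
n <- 100

# Hypothetical covariates; in a call to rcRF these would be selected via split.var.
toy <- data.frame(
  x1 = runif(n),
  x2 = runif(n),
  x3 = runif(n)
)

# Propensity score P(trt = 1) and a binary treatment indicator coded 0 / 1.
toy$prtx <- rep(0.5, n)
toy$trt  <- rbinom(n, size = 1, prob = toy$prtx)

# Toy efficacy (y) and risk (r) outcomes; these models are arbitrary.
toy$y <- 1 + toy$x1 + toy$trt * (toy$x2 - 0.5) + rnorm(n)
toy$r <- 2 + toy$trt * toy$x3 + rnorm(n, sd = 0.5)

# Column positions of the splitting variables, in the form split.var expects.
split.var <- which(names(toy) %in% c("x1", "x2", "x3"))
str(toy)
```

A data frame shaped like this could then be passed as data, with split.var giving the covariate columns.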

Optional arguments

stabilize

logical indicating if efficacy should be modeled using residuals. Defaults to TRUE.

stabilize.type

character specifying method used for estimating residuals. Current options are 'linear' for linear model (default) and 'rf' for random forest.

test

data.frame of testing observations. Should be formatted the same as 'data'.

ctg

numeric vector corresponding to the categorical input columns. Defaults to NULL. Not available yet.

N0

numeric specifying minimum number of observations required to call a node terminal. Defaults to 20.

n0

numeric specifying minimum number of treatment/control observations needed in a split to declare a node terminal. Defaults to 5.

max.depth

numeric specifying maximum depth of the tree. Defaults to 10 levels.

ntree

numeric. Number of trees generated. Defaults to 500.

mtry

numeric specifying the number of randomly selected splitting variables considered at each split. Defaults to the larger of 1 and floor(length(split.var)/3).

avoid.nul.tree

logical. Should null trees be discarded? Defaults to FALSE.

AIPWE

logical. Should AIPWE (TRUE) or IPWE (FALSE) be used? Defaults to FALSE. Not available yet.

verbose

logical. Should progress updates be printed while the forest is built? Defaults to FALSE.

use.other.nodes

logical. Should the global estimator of the objective function be used? Defaults to TRUE.

extremeRandomized

logical. Experimental option for randomly selecting cutpoints in a random forest model. Defaults to FALSE; users should change this at their own peril.

importance

logical. Indicates whether variable importance measures should be estimated and returned. Defaults to FALSE.

order.importances

logical. Should importances be ordered (if requested)? Defaults to TRUE.

col.prtx

char. Propensity score column name. Defaults to 'prtx'.
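
To show how a propensity score column typically enters an inverse probability weighted estimator (IPWE), here is a minimal sketch of the textbook IPWE of the mean outcome under a treatment rule. It is a generic form written in base R, not code from the package; the function name ipwe_value, the rule, and the toy data are hypothetical.

```r
# Textbook IPWE of the mean outcome under a rule d:
#   mean( y * I(trt == d) / P(observed treatment) )
# where P(observed treatment) is prtx when trt == 1 and 1 - prtx when trt == 0.
ipwe_value <- function(y, trt, prtx, rule) {
  p_obs <- ifelse(trt == 1, prtx, 1 - prtx)
  mean(y * (trt == rule) / p_obs)
}

# Tiny worked example with a constant propensity of 0.5.
y    <- c(2, 4, 6, 8)
trt  <- c(1, 0, 1, 0)
prtx <- rep(0.5, 4)
rule <- c(1, 1, 1, 1)   # hypothetical rule: treat everyone

ipwe_value(y, trt, prtx, rule)  # (2/0.5 + 0 + 6/0.5 + 0) / 4 = 4
```

Only observations whose received treatment agrees with the rule contribute, each weighted by the inverse probability of the treatment it received.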

Value

List of rcRF outputs

ID.Boots.Samples

list of bootstrap sample IDs

TREES

list of trees

Model.Specification

information about the input parameters of the forest

...

Summaries for in and out of bag samples
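
The in-bag and out-of-bag summaries rest on the usual bootstrap bookkeeping of a random forest. The following base R sketch shows that bookkeeping for a single tree; the object names are hypothetical, and the actual structure of ID.Boots.Samples in the package may differ.

```r
set.seed(42)
n <- 10
ids <- seq_len(n)

# Bootstrap sample: draw n row IDs with replacement (in-bag observations).
in_bag <- sample(ids, size = n, replace = TRUE)

# Out-of-bag observations are the rows never drawn for this tree.
out_of_bag <- setdiff(ids, in_bag)

sort(unique(in_bag))
out_of_bag
```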

Examples

set.seed(123)
dat <- generateData()
# Generates an rcRF model using simulated data with splitting variables located in columns 1-10.
fit <- rcRF(data = dat, 
            split.var = 1:10, 
            ntree = 200,
            risk.threshold = 2.75, 
            lambda = 1)
            

kdoub5ha/rcITR documentation built on Aug. 5, 2020, 9:05 p.m.