s_RFSRC: Random Forest for Classification, Regression, and Survival...

View source: R/s_RFSRC.R

s_RFSRC R Documentation

Random Forest for Classification, Regression, and Survival [C, R, S]

Description

Train a Random Forest for Regression, Classification, or Survival Regression using randomForestSRC.

Usage

s_RFSRC(
  x,
  y = NULL,
  x.test = NULL,
  y.test = NULL,
  x.name = NULL,
  y.name = NULL,
  n.trees = 1000,
  weights = NULL,
  ifw = TRUE,
  ifw.type = 2,
  upsample = FALSE,
  downsample = FALSE,
  resample.seed = NULL,
  bootstrap = "by.root",
  mtry = NULL,
  importance = TRUE,
  proximity = TRUE,
  nodesize = if (!is.null(y) && !is.factor(y)) 5 else 1,
  nodedepth = NULL,
  na.action = "na.impute",
  trace = FALSE,
  print.plot = FALSE,
  plot.fitted = NULL,
  plot.predicted = NULL,
  plot.theme = rtTheme,
  question = NULL,
  verbose = TRUE,
  outdir = NULL,
  save.mod = ifelse(!is.null(outdir), TRUE, FALSE),
  ...
)

Arguments

x

Numeric vector or matrix of features, i.e. independent variables

y

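Outcome, i.e. dependent variable: a numeric vector for Regression, a factor for Classification, or a Surv object for Survival Regression (see Details)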

x.test

(Optional) Numeric vector or matrix of validation set features; must have the same set of columns as x

y.test

(Optional) Numeric vector of validation set outcomes

x.name

Character: Name for feature set

y.name

Character: Name for outcome

n.trees

Integer: Number of trees to grow. The more the merrier.

weights

Numeric vector: Weights for cases. For classification, weights takes precedence over ifw: if weights are provided, ifw is not used. Leave weights = NULL if setting ifw = TRUE.

ifw

Logical: If TRUE, apply inverse frequency weighting (for Classification only). Note: If weights are provided, ifw is not used.

ifw.type

Integer: 0, 1, or 2.
1: class.weights as in 0, divided by min(class.weights)
2: class.weights as in 0, divided by max(class.weights)
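To make the two scaling options concrete, here is a minimal base-R sketch. It rests on an assumption: that type 0 means the raw inverse of each class's relative frequency, which this page does not define. The helper name ifw_sketch is hypothetical and is not the rtemis implementation.

```r
# Hypothetical helper, not part of rtemis.
# ASSUMPTION: type 0 = 1 / relative class frequency (not stated on this page).
ifw_sketch <- function(y, type = 2) {
  stopifnot(is.factor(y), type %in% 0:2)
  y <- droplevels(y)                       # ignore empty classes
  freq <- table(y) / length(y)             # relative class frequencies
  class.weights <- setNames(as.numeric(1 / freq), names(freq))
  if (type == 1) class.weights <- class.weights / min(class.weights)
  if (type == 2) class.weights <- class.weights / max(class.weights)
  class.weights
}

y <- factor(c(rep("a", 8), rep("b", 2)))
w1 <- ifw_sketch(y, type = 1)              # most common class gets weight 1
w2 <- ifw_sketch(y, type = 2)              # rarest class gets weight 1
case.weights <- w2[as.character(y)]        # expand to one weight per case
```

Under this assumption, type 1 gives weights of at least 1 and type 2 gives weights of at most 1; the ratio between classes is the same in both.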

upsample

Logical: If TRUE, upsample cases to balance outcome classes (for Classification only). Note: upsample will randomly sample with replacement if the length of the majority class is more than double the length of the class you are upsampling, thereby introducing randomness.

downsample

Logical: If TRUE, downsample majority class to match size of minority class

resample.seed

Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed)
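The general idea behind upsample, downsample, and resample.seed can be sketched in base R as follows. This is an illustration only: balance_sketch is a hypothetical helper, it always draws the extra upsampled cases at random, and it does not reproduce the rule rtemis uses to decide when sampling with replacement is needed.

```r
# Hypothetical helper, not the rtemis implementation.
# Returns case indices that give every class the same number of cases.
balance_sketch <- function(y, method = c("upsample", "downsample"), seed = NULL) {
  method <- match.arg(method)
  stopifnot(is.factor(y))
  y <- droplevels(y)
  if (!is.null(seed)) set.seed(seed)       # makes the resampling reproducible
  idx <- split(seq_along(y), y)            # case indices per class
  n <- lengths(idx)
  target <- if (method == "upsample") max(n) else min(n)
  out <- lapply(idx, function(i) {
    if (length(i) == target) {
      i
    } else if (length(i) < target) {
      # upsample: keep all original cases, add random draws with replacement
      c(i, i[sample.int(length(i), target - length(i), replace = TRUE)])
    } else {
      # downsample: random subset without replacement
      i[sample.int(length(i), target)]
    }
  })
  unlist(out, use.names = FALSE)
}

y <- factor(c(rep("a", 8), rep("b", 2)))
up <- balance_sketch(y, "upsample", seed = 2024)
down <- balance_sketch(y, "downsample", seed = 2024)
table(y[up])
table(y[down])
```

Passing the same seed twice returns the same indices, which is the purpose of a resampling seed. Note that this sketch sets the global RNG state when a seed is given.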

bootstrap

Character: Bootstrap protocol. Default = "by.root"

mtry

Integer: Number of features sampled randomly at each split

importance

Logical: If TRUE, calculate variable importance.

proximity

Character or Logical: "inbag", "oob", "all", TRUE, or FALSE; passed to randomForestSRC::rfsrc

nodesize

Integer: Minimum size of terminal nodes.
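The default for nodesize shown in Usage depends on the type of y. The expression can be evaluated on its own to see which value applies; the wrapper name default_nodesize_sketch is hypothetical and only restates that expression.

```r
# Restates the default expression from Usage; the function name is hypothetical.
default_nodesize_sketch <- function(y = NULL) {
  if (!is.null(y) && !is.factor(y)) 5 else 1
}

default_nodesize_sketch(c(1.2, 3.4))          # non-factor outcome
default_nodesize_sketch(factor(c("a", "b")))  # factor outcome (classification)
default_nodesize_sketch(NULL)                 # y not supplied
```

In other words, any non-NULL outcome that is not a factor gets a default of 5, and a factor or missing outcome gets 1.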

nodedepth

Integer: Maximum tree depth.

trace

Integer: Number of seconds between messages to the console.

print.plot

Logical: if TRUE, produce plot using mplot3. Takes precedence over plot.fitted and plot.predicted.

plot.fitted

Logical: if TRUE, plot True (y) vs Fitted

plot.predicted

Logical: if TRUE, plot True (y.test) vs Predicted. Requires x.test and y.test.

plot.theme

Character: "zero", "dark", "box", "darkbox"

question

Character: the question you are attempting to answer with this model, in plain language.

verbose

Logical: If TRUE, print summary to screen.

outdir

Optional. Path to directory to save output

save.mod

Logical: If TRUE, save all output to an RDS file in outdir. save.mod is TRUE by default if an outdir is defined. If set to TRUE and no outdir is defined, outdir defaults to paste0("./s.", mod.name).
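The interaction between outdir and save.mod can be sketched as below. The helper resolve_outdir_sketch is hypothetical, and the value "RFSRC" for mod.name is an assumption made for illustration; neither is taken from the rtemis source.

```r
# Hypothetical helper mirroring the documented defaults.
# ASSUMPTION: mod.name is "RFSRC" for this learner.
resolve_outdir_sketch <- function(outdir = NULL,
                                  save.mod = !is.null(outdir),
                                  mod.name = "RFSRC") {
  # Fall back to ./s.<mod.name> only when saving was requested without outdir
  if (save.mod && is.null(outdir)) outdir <- paste0("./s.", mod.name)
  list(save.mod = save.mod, outdir = outdir)
}

a <- resolve_outdir_sketch()                    # nothing is saved
b <- resolve_outdir_sketch(save.mod = TRUE)     # saved to the fallback directory
d <- resolve_outdir_sketch(outdir = "results")  # save.mod switches on by default
```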

...

Additional arguments to be passed to randomForestSRC::rfsrc

Details

For Survival Regression, y must be an object of type Surv, created using survival::Surv(time, status).

mtry is the only tunable parameter, but it usually makes only a small difference and is often not tuned.
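As an illustration of preparing a survival outcome, the sketch below builds a Surv object from simulated data. The data, variable names, and the choice of n.trees = 500 are made up for the example. The final call is left commented out because it requires rtemis and randomForestSRC to be installed; its shape follows the Usage section and has not been verified against a particular rtemis version.

```r
# survival is a recommended package distributed with standard R installations
library(survival)

set.seed(1)                                  # simulated data, illustration only
n <- 100
x <- cbind(age = rnorm(n, 60, 10), marker = rnorm(n))
time <- rexp(n, rate = 0.1)                  # follow-up times
status <- rbinom(n, 1, 0.7)                  # 1 = event observed, 0 = censored
y <- Surv(time, status)                      # outcome for Survival Regression

# Not run: needs rtemis and randomForestSRC
# mod <- s_RFSRC(x, y, n.trees = 500)
```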

Value

Object of class rtMod

Author(s)

E.D. Gennatas

See Also

train_cv for external cross-validation

Other Supervised Learning: s_AdaBoost(), s_AddTree(), s_BART(), s_BRUTO(), s_BayesGLM(), s_C50(), s_CART(), s_CTree(), s_EVTree(), s_GAM(), s_GAM.default(), s_GAM.formula(), s_GBM(), s_GLM(), s_GLMNET(), s_GLMTree(), s_GLS(), s_H2ODL(), s_H2OGBM(), s_H2ORF(), s_HAL(), s_KNN(), s_LDA(), s_LM(), s_LMTree(), s_LightCART(), s_LightGBM(), s_MARS(), s_MLRF(), s_NBayes(), s_NLA(), s_NLS(), s_NW(), s_PPR(), s_PolyMARS(), s_QDA(), s_QRNN(), s_RF(), s_Ranger(), s_SDA(), s_SGD(), s_SPLS(), s_SVM(), s_TFN(), s_XGBoost(), s_XRF()

Other Tree-based methods: s_AdaBoost(), s_AddTree(), s_BART(), s_C50(), s_CART(), s_CTree(), s_EVTree(), s_GBM(), s_GLMTree(), s_H2OGBM(), s_H2ORF(), s_LMTree(), s_LightCART(), s_LightGBM(), s_MLRF(), s_RF(), s_Ranger(), s_XGBoost(), s_XRF()


egenn/rtemis documentation built on May 4, 2024, 7:40 p.m.