s_LINAD: Linear Additive Tree (C, R)
In egenn/rtemis: Machine Learning and Visualization

Description

Train a Linear Additive Tree for Regression or Binary Classification

Usage

s_LINAD(
  x,
  y = NULL,
  x.test = NULL,
  y.test = NULL,
  weights = NULL,
  max.leaves = 20,
  lookback = TRUE,
  force.max.leaves = NULL,
  learning.rate = 0.5,
  ifw = TRUE,
  ifw.type = 1,
  upsample = FALSE,
  downsample = FALSE,
  resample.seed = NULL,
  leaf.model = c("line", "spline"),
  gamlearner = "gamsel",
  gam.params = list(),
  nvmax = 3,
  gamma = 0.5,
  gamma.on.lin = FALSE,
  lin.type = c("glmnet", "forwardStepwise", "cv.glmnet", "lm.ridge", "allSubsets",
    "backwardStepwise", "glm", "solve", "none"),
  first.lin.type = "cv.glmnet",
  first.lin.learning.rate = 1,
  first.lin.alpha = 1,
  first.lin.lambda = NULL,
  cv.glmnet.nfolds = 5,
  which.cv.glmnet.lambda = "lambda.min",
  alpha = 1,
  lambda = 0.05,
  lambda.seq = NULL,
  minobsinnode.lin = 10,
  part.minsplit = 2,
  part.xval = 0,
  part.max.depth = 1,
  part.cp = 0,
  part.minbucket = 1,
  .rho = TRUE,
  rho.max = 1000,
  init = NULL,
  metric = "auto",
  maximize = NULL,
  grid.resample.params = setup.resample("kfold", 5),
  gridsearch.type = "exhaustive",
  save.gridrun = FALSE,
  select.leaves.smooth = FALSE,
  cluster = FALSE,
  keep.x = FALSE,
  simplify = TRUE,
  cxrcoef = FALSE,
  n.cores = rtCores,
  .preprocess = NULL,
  verbose = TRUE,
  grid.verbose = FALSE,
  plot.tuning = FALSE,
  verbose.predict = FALSE,
  trace = 1,
  x.name = NULL,
  y.name = NULL,
  question = NULL,
  outdir = NULL,
  print.plot = FALSE,
  plot.fitted = NULL,
  plot.predicted = NULL,
  plot.theme = rtTheme,
  save.mod = FALSE,
  .gs = FALSE
)

Arguments

`x`	Numeric vector or matrix / data frame of features, i.e. independent variables
`y`	Numeric vector of outcome, i.e. dependent variable
`x.test`	Numeric vector or matrix / data frame of testing set features. Columns must correspond to columns in `x`
`y.test`	Numeric vector of testing set outcome
`weights`	Numeric vector: Weights for cases. For classification, `weights` takes precedence over `ifw`: if `weights` are provided, `ifw` is not used. Leave `weights = NULL` when setting `ifw = TRUE`.
`max.leaves`	Integer: Maximum number of terminal nodes to grow. Setting this to a value > 1 triggers cross-validation to find the best number of leaves. To force a given number of leaves without cross-validation, set `force.max.leaves` to an integer value.
`lookback`	Logical: If TRUE, use validation error to choose the best number of leaves.
`force.max.leaves`	Integer: If set, `max.leaves` is ignored and the tree will attempt to reach this number of leaves, without tuning the number of leaves.
`learning.rate`	[gS] Numeric: learning rate for steps after initial linear model
`ifw`	Logical: If TRUE, apply inverse frequency weighting (for Classification only). Note: If `weights` are provided, `ifw` is not used.
`ifw.type`	Integer: one of 0, 1, 2. 1: class.weights as in 0, divided by min(class.weights). 2: class.weights as in 0, divided by max(class.weights).
`upsample`	Logical: If TRUE, upsample cases to balance outcome classes (for Classification only). Note: upsampling samples randomly with replacement if the majority class is more than twice the size of the class being upsampled, which introduces randomness.
`downsample`	Logical: If TRUE, downsample majority class to match size of minority class
`resample.seed`	Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed)
`nvmax`	[gS] Integer: Maximum number of features to use for lin.type "allSubsets", "forwardStepwise", or "backwardStepwise". Values greater than the number of features in `x` are excluded.
`gamma`	[gS] Numeric: Soft weighting parameter. Weights of cases that do not belong to node get multiplied by this amount
`lin.type`	Character: One of "glmnet", "forwardStepwise", "cv.glmnet", "lm.ridge", "allSubsets", "backwardStepwise", "glm", "solve", or "none" to not fit linear models. See lincoef for more.
`first.lin.type`	Character: same options as `lin.type`, the first linear model to fit on the root node.
`first.lin.alpha`	Numeric: alpha for the first linear model, if `first.lin.type` is "glmnet" or "cv.glmnet"
`lambda`	[gS] Numeric: lambda value for lin.type `glmnet`, `cv.glmnet`, `lm.ridge`
`minobsinnode.lin`	[gS] Integer: Minimum number of observations needed to fit a linear model
`part.minsplit`	[gS] Integer: Minimum number of observations in node to consider splitting
`part.max.depth`	Integer: Max depth for each tree model within the additive tree
`part.cp`	[gS] Numeric: Split must decrease complexity by at least this much to be considered
`part.minbucket`	[gS] Integer: Minimum number of observations required in a child node for a split to be allowed
`init`	Initial value. Default = `mean(y)`
`verbose`	Logical: If TRUE, print summary to screen.
`plot.tuning`	Logical: If TRUE, plot validation error during gridsearch
`trace`	Integer: If higher than 0, will print more information to the console.
`x.name`	Character: Name for feature set
`y.name`	Character: Name for outcome
`question`	Character: the question you are attempting to answer with this model, in plain language.
`outdir`	Path to output directory. If defined, will save Predicted vs. True plot, if available, as well as full model output, if `save.mod` is TRUE
`print.plot`	Logical: If TRUE, produce plot using `mplot3`. Takes precedence over `plot.fitted` and `plot.predicted`.
`plot.fitted`	Logical: if TRUE, plot True (y) vs Fitted
`plot.predicted`	Logical: if TRUE, plot True (y.test) vs Predicted. Requires `x.test` and `y.test`
`plot.theme`	Character: "zero", "dark", "box", "darkbox"
`save.mod`	Logical: If TRUE, save all output to an RDS file in `outdir`. `save.mod` is TRUE by default if an `outdir` is defined. If set to TRUE and no `outdir` is defined, `outdir` defaults to `paste0("./s.", mod.name)`
`.gs`	internal use only
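
The soft weighting described for `gamma` can be illustrated with a small base R sketch. This is not the package's internal code: the split rule, the uniform starting weights, and the use of `lm` in place of `lincoef` are all assumptions made for illustration.

```r
# Toy data: one feature and a hypothetical candidate split at x1 > 0
set.seed(1)
n <- 20
x1 <- rnorm(n)
y <- 2 * x1 + rnorm(n, sd = 0.1)

gamma <- 0.5                 # soft weighting parameter
case_weights <- rep(1, n)    # assumed uniform starting case weights
in_node <- x1 > 0            # hypothetical node membership

# Cases outside the node have their weight multiplied by gamma
node_weights <- ifelse(in_node, case_weights, case_weights * gamma)

# Weighted linear fit for this node (plain lm, standing in for lincoef)
node_fit <- lm(y ~ x1, weights = node_weights)
coef(node_fit)
```

With `gamma = 0`, cases outside the node would be ignored entirely; with `gamma = 1`, every node would be fit on the full, equally weighted data.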

Details

The Linear Additive Tree trains a tree using a sequence of regularized linear models and splits. `max.leaves` sets an upper limit on the number of leaves rather than an exact number because, depending on the other parameters and the data, splitting may stop early.
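
A minimal sketch of the two ways to control tree size, using only arguments from the Usage section. The synthetic data are made up for illustration, and the calls assume an installed rtemis version that exports `s_LINAD` with the signature shown above; they are skipped otherwise.

```r
# Synthetic regression data: the slope on x2 changes with the sign of x1
set.seed(2024)
n <- 200
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- ifelse(x$x1 > 0, 1 + 2 * x$x2, -1 - x$x2) + rnorm(n, sd = 0.2)

# Only run the fits if rtemis is installed and exports s_LINAD
has_linad <- requireNamespace("rtemis", quietly = TRUE) &&
  "s_LINAD" %in% getNamespaceExports("rtemis")

if (has_linad) {
  # Upper limit of 8 leaves; the number actually used is tuned
  mod_tuned <- rtemis::s_LINAD(x, y, max.leaves = 8)

  # No tuning: attempt to reach 4 leaves (splitting may still stop early)
  mod_forced <- rtemis::s_LINAD(x, y, force.max.leaves = 4)
}
```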

[gS] indicates tunable hyperparameters, which accept a vector of candidate values.
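
To give a sense of what passing vectors to [gS] arguments implies, the base R sketch below enumerates the combinations that an exhaustive search over two such arguments would cover. It illustrates the general idea of a full grid and is not rtemis's grid search code; the candidate values are arbitrary.

```r
# Candidate values for two [gS] hyperparameters (arbitrary choices)
gamma_values <- c(0.1, 0.5)
lambda_values <- c(0.01, 0.05)

# Every combination a full (exhaustive) grid would evaluate
grid <- expand.grid(gamma = gamma_values, lambda = lambda_values)
print(grid)
nrow(grid)  # number of hyperparameter combinations
```

Under this reading, a call such as `s_LINAD(x, y, gamma = c(0.1, 0.5), lambda = c(0.01, 0.05))` would compare four settings, each evaluated with the resampling scheme given in `grid.resample.params`.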