veb_boost_stumps: Wrapper for using VEB-Boost with the SER prior, using stumps and/or linear terms

View source: R/veb_boost.R

veb_boost_stumps (R Documentation)

Wrapper for using VEB-Boost with the SER prior, using stumps and/or linear terms

Description

Wrapper for using VEB-Boost with the SER prior, using stumps and/or linear terms.

Usage

veb_boost_stumps(
  X,
  Y,
  X_test = NULL,
  learners = NULL,
  include_linear = NULL,
  include_stumps = NULL,
  num_cuts = ceiling(min(NROW(Y)/5, max(100, sqrt(NROW(Y))))),
  k = 1,
  d = 1,
  use_quants = TRUE,
  scale_X = c("sd", "max", "NA"),
  growMode = c("+*", "+", "*", "NA"),
  changeToConstant = FALSE,
  max_log_prior_var = 0,
  use_optim = TRUE,
  lin_prior_prob = 0.5,
  reverse_learners = FALSE,
  nthreads = ceiling(parallel::detectCores(logical = TRUE)/2),
  ...
)

Arguments

X

An (n x p) numeric matrix to be used as the predictors (currently, this wrapper forces all nodes to use the same X).

Y

is a numeric response vector.

X_test

is an optional (m x p) numeric matrix to be used as the test data. The posterior mean response for X_test is saved in the output's field $pred_mu1.

learners

is a list of other learners to be used in veb_boost.

include_linear

is a logical of length 1 or p specifying which columns of X we should include as linear terms. If the length is 1, this value gets recycled for all columns of X. If NULL is supplied, then all valid linear terms are used.

include_stumps

is a logical of length 1 or p specifying which columns of X we should include as stump terms. If the length is 1, this value gets recycled for all columns of X. If NULL is supplied, then all valid stump terms are used.

num_cuts

is a whole number of length 1 or p specifying how many cuts to make when making the stump terms. If the length is 1, this value gets recycled for all columns of X. For entries corresponding to the indices where include_stumps is FALSE, these values are ignored. We use the quantiles of each predictor when making the stump splits, using num_cuts of them. If num_cuts = Inf, then all values of the variables are used as split points.

k

is an integer, or a vector of integers of length length(learners), for how many terms are in the sum of nodes (for each learner).

d

is either an integer, or an integer vector of length k, or a list of integer vectors of length length(learners) (each element either an integer, or a vector of length k), giving the multiplicative depth of each of the k terms. NOTE: This can be dangerous. For example, if the fit starts out too large, then entire branches will be fit to be exactly zero. When this happens, we end up dividing by 0 in places, and this results in NAs, -Inf, etc. USE AT YOUR OWN RISK.

use_quants

is a logical for whether the cut-points should be based on the quantiles of the variable ('use_quants = TRUE'), or evenly spaced over the range of the variable ('use_quants = FALSE').

scale_X

is a string for if/how the columns of X should be scaled. 'sd' scales by the standard deviations of the variables. 'max' scales by the maximum absolute value (so variables are on the [-1, +1] scale). 'NA' performs no scaling.

growMode

is a string for whether and how the learner should be grown. If "+*", we grow mu_0 -> (mu_0 * mu_2) + mu_1. If "+", we grow mu_0 -> (mu_0 + mu_1). If "*", we grow mu_0 -> (mu_0 * mu_1) (NOTE: not recommended if we start with k = 1). If "NA", we do not grow this learner.

changeToConstant

is a logical for if fits that turn out to be essentially constant should be changed to be exactly constant.

max_log_prior_var

is a scalar for the maximum that the estimated log-prior variance for each weak learner can be. The idea is that setting this to be small limits the "size" of each weak learner, similar-ish to the learning rate in boosting. The maximum allowed value is 35, which essentially allows the unrestricted MLE to be estimated. The minimum allowed value is -5.

use_optim

is a logical. If TRUE, then the prior variance is optimized using the Brent method. If FALSE, then a single EM step is taken to optimize over the prior variance V.

lin_prior_prob

is a number between 0 and 1 that gives the prior probability that the effect variable is a linear term. This means that (1 - lin_prior_prob) is the probability that the effect variable is a stump term. Within linear terms and stump terms, all variables have the same prior variance.

reverse_learners

is a logical for if the order of learners should be reversed. If FALSE, all additional learners in 'learners' will come first. If TRUE, the stumps learners will be first.

...

Other arguments to be passed to veb_boost.

Details

This function performs VEB-Boosting, where the prior to be used is the SER prior, and our predictors are either: i) the linear terms of X; ii) the stumps made from the columns of X; or iii) both (i) and (ii).
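As a rough illustration of what "stumps made from the columns of X" means (a sketch only, not the package's internal implementation), each predictor can be expanded into binary step-function features at quantile-based cut points, mirroring the num_cuts and use_quants arguments described above; the helper name make_stumps is hypothetical:

```r
# Illustrative sketch only -- NOT the package's internal code.
# Expand one predictor into stump (step-function) indicator features.
make_stumps <- function(x, num_cuts = 10, use_quants = TRUE) {
  cuts <- if (use_quants) {
    # quantile-based cut points (the use_quants = TRUE behavior)
    unique(quantile(x, probs = seq_len(num_cuts) / (num_cuts + 1)))
  } else {
    # evenly spaced cut points over the range of x
    seq(min(x), max(x), length.out = num_cuts)
  }
  # one binary column per cut point: the indicator 1(x >= cut)
  sapply(cuts, function(cc) as.numeric(x >= cc))
}

x <- runif(100)
S <- make_stumps(x, num_cuts = 5)
dim(S)  # 100 rows, one column per cut point
```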

Value

A VEB_Boost_Node object with the fit

Examples

set.seed(1)
n = 1000
p = 1000
X = matrix(runif(n * p), nrow = n, ncol = p)  # predictors
# response: nonlinear signal in a few columns, plus Gaussian noise
Y = rnorm(n, 5*sin(3*X[, 1]) + 2*(X[, 2]^2) + 3*X[, 3]*X[, 4])
veb.stumps.fit = veb_boost_stumps(X, Y, include_linear = TRUE, family = "gaussian")
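A possible extension of the example (a sketch assuming the simulated X, Y, and p from above, and requiring the VEB.Boost package) passes held-out data via X_test and reads the posterior mean predictions from the documented $pred_mu1 field:

```r
# Sketch: supplying test data; posterior mean predictions for X_test
# are stored in the fit's $pred_mu1 field (see the X_test argument above).
X_test = matrix(runif(500 * p), nrow = 500, ncol = p)
fit = veb_boost_stumps(X, Y, X_test = X_test,
                       include_linear = TRUE, family = "gaussian")
pred = fit$pred_mu1  # posterior mean response on X_test
```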



stephenslab/VEB.Boost documentation built on July 2, 2023, 1 p.m.