veb_boost: Performs VEB-Boosting

View source: R/veb_boost.R

Performs VEB-Boosting

Description

Solves the VEB-Boost regression problem using the supplied inputs

Usage

veb_boost(
  learners,
  Y,
  k = 1,
  d = 1,
  sigma2 = NULL,
  family = c("gaussian", "binomial", "multinomial.bouchard", "multinomial.titsias",
    "negative.binomial", "poisson.log1pexp", "aft.loglogistic", "ordinal.logistic"),
  weights = 1,
  scaleWeights = TRUE,
  exposure = NULL,
  tol = NROW(Y)/10000,
  verbose = TRUE,
  maxit = Inf,
  backfit = FALSE
)

Arguments

learners

is either a single "learner" object, or a list of k "learner" objects. A learner object consists of:

1. a fit function $fitFunction: (X, Y, sigma2, currentFit) -> newFit, where a fit is a list that must contain $mu1, $mu2, and $KL_div

2. a prediction function $predFunction: (X, fit, moment) -> posterior moment (1 or 2)

3. a constant check function $constCheckFunction: (fit) -> TRUE/FALSE, used to check whether a fit is essentially constant

4. a current fit $currentFit, which must contain $mu1 (first posterior moments), $mu2 (second posterior moments), and $KL_div (KL divergence from q to the prior); this can be NULL, at least to start

5. a predictor object $X (whatever $fitFunction and $predFunction take in), used for training (can be NULL, e.g. if using constLearner)

6. a predictor object $X_test (whatever $fitFunction and $predFunction take in), used for testing (can be NULL)

7. a string $growMode specifying whether (and how) the learner should be grown:
If "+*", we grow mu_0 -> (mu_0 * mu_2) + mu_1
If "+", we grow mu_0 -> (mu_0 + mu_1)
If "*", we grow mu_0 -> (mu_0 * mu_1) (NOTE: not recommended if we start with k = 1)
If "NA", we do not grow this learner

8. a logical $changeToConstant for whether the learner should be changed to a constant if $constCheckFunction evaluates to TRUE

A minimal sketch of such an object is given after this list.
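For concreteness, here is a minimal sketch of a hand-rolled learner: a Bayesian intercept-only model with prior beta ~ N(0, tau2) under Gaussian noise. The helper makeInterceptLearner and the prior variance tau2 are illustrative assumptions, not part of the package API; only the list fields follow the structure described above.

makeInterceptLearner <- function(tau2 = 1) {
  fitFunction <- function(X, Y, sigma2, currentFit) {
    # Conjugate normal update for a single intercept beta ~ N(0, tau2)
    # under Gaussian noise with scalar or n-vector variance sigma2
    sigma2    <- rep(sigma2, length.out = length(Y))
    post_var  <- 1 / (sum(1 / sigma2) + 1 / tau2)
    post_mean <- post_var * sum(Y / sigma2)
    list(
      mu1    = rep(post_mean, length(Y)),               # first posterior moments E[beta]
      mu2    = rep(post_mean^2 + post_var, length(Y)),  # second posterior moments E[beta^2]
      KL_div = 0.5 * (log(tau2 / post_var) +
                        (post_var + post_mean^2) / tau2 - 1),
      post_mean = post_mean, post_var = post_var, n = length(Y)
    )
  }
  predFunction <- function(X, fit, moment = 1) {
    # An intercept predicts the same value everywhere; with X = NULL we
    # fall back to the training length stored in the fit
    n <- if (is.null(X)) fit$n else NROW(X)
    if (moment == 1) rep(fit$post_mean, n) else rep(fit$post_mean^2 + fit$post_var, n)
  }
  list(
    fitFunction        = fitFunction,
    predFunction       = predFunction,
    constCheckFunction = function(fit) TRUE,  # an intercept is always constant
    currentFit         = NULL,
    X                  = NULL,
    X_test             = NULL,
    growMode           = "NA",
    changeToConstant   = FALSE
  )
}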

Y

is a numeric response. For all but the 'aft.loglogistic' family, this should be an n-vector. For the 'aft.loglogistic' family, this should be an n x 2 matrix, with the first column being the left end-point of the survival time and the second column being the right end-point (used for interval-censored observations). In the case of left-censored data, the left end-point should be 'NA', and in the case of right-censored data, the right end-point should be 'NA'. If the observation is uncensored, both end-points should be equal to the observed survival time.
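As an illustration, the following response matrix for family = "aft.loglogistic" (with made-up survival times) covers each censoring case:

Y <- rbind(
  c(5.2, 5.2),  # uncensored: both end-points equal the observed time
  c(NA,  3.0),  # left-censored: event occurred before time 3.0
  c(7.5, NA),   # right-censored: still event-free at time 7.5
  c(2.0, 4.0)   # interval-censored: event occurred between times 2.0 and 4.0
)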

k

is an integer, or a vector of integers of length length(learners), specifying how many terms are in the sum of nodes for each learner

d

is either an integer, an integer vector of length k, or a list of length length(learners) (each element being either an integer or a vector of length k), specifying the multiplicative depth of each of the k terms. NOTE: This can be dangerous. For example, if the fit starts out too large, then entire branches will be fit to be exactly zero. When this happens, we end up dividing by 0 in places, which results in NAs, -Inf, etc. USE AT YOUR OWN RISK. Accepted shapes for k and d are illustrated below.
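For example, assuming two learners, these hypothetical calls show the accepted shapes for k and d (lrns and Y are placeholders):

# same k for both learners, scalar depth for every term:
# veb_boost(lrns, Y, k = 3, d = 1)
# per-learner k, with a per-term depth vector for each learner:
# veb_boost(lrns, Y, k = c(3, 2), d = list(c(1, 2, 1), c(2, 2)))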

sigma2

is a scalar or an n-vector specifying a fixed residual variance. If not NULL, then the residual variance will be fixed to this value/vector. If NULL, then it will be initialized and updated automatically. This should be left as NULL unless you really know what you're doing. For safety, this can be non-NULL only if family is "gaussian"

family

specifies the distribution family of the response; see the Usage section for the supported families

weights

is a vector of the same length as Y weighting the observations in the log-likelihood

scaleWeights

is a logical for whether the weights should be scaled to have mean 1 (recommended). If you choose not to scale the weights, then the relative importance of the KL-divergence term will change (possibly desirable, i.e. to increase or decrease shrinkage towards the prior)

exposure

is a scalar or a vector used for the Poisson regression, negative binomial, and AFT cases. For Poisson, we assume that the response satisfies

Y_i \sim Pois(c_i \lambda_i)

for a given exposure c_i, and we model \lambda_i. For AFT, exposure is 1 for non-censored observations, and 0 for right-censored observations.
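For instance, a hypothetical Poisson call with per-observation exposures (counts, person_years, and lrns are placeholders):

# fit <- veb_boost(lrns, Y = counts, family = "poisson.log1pexp",
#                  exposure = person_years)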

tol

is a positive scalar specifying the convergence tolerance

verbose

is a logical flag specifying whether we should report convergence information as we go

maxit

is the maximum number of iterations for each version of the VEB-Boost tree

backfit

is a logical. If TRUE, then after the algorithm is done, it will run through once more with the current tree and fit to convergence. This is useful when, e.g., maxit = 1.

Details

Given a pre-specified arithmetic tree structure T(\mu_1, \dots, \mu_L), node functions \mu_l := h_l(\beta_l), priors \beta_l \sim g_l(\cdot), and inputs for the response, VEB-Boosting is performed.

A cyclic CAVI (coordinate ascent variational inference) scheme is used, where we cycle over the leaf nodes and update the approximation to the posterior distribution at each node in turn.

We start with the arithmetic tree structure

T(\mu_1, \dots, \mu_L) = \sum_{i=1}^k \prod_{j=1}^{d_i} \mu_{i, j}

For example, with k = 2 and d = (2, 1), the initial tree is T = \mu_{1, 1} \mu_{1, 2} + \mu_{2, 1}.

Value

A VEB_Boost_Node object containing the fit
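Putting the pieces together, a hypothetical end-to-end call might look like the following; makeInterceptLearner is the illustrative helper sketched under the learners argument, and the data are simulated.

set.seed(1)
Y   <- rnorm(100, mean = 2)            # simulated Gaussian response
lrn <- makeInterceptLearner(tau2 = 1)  # illustrative learner from above
fit <- veb_boost(lrn, Y, family = "gaussian", verbose = FALSE)
# fit is a VEB_Boost_Node object containing the fitted model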

