veb_boost_stumps | R Documentation |
Wrapper for using VEB-Boost with the SER prior with stumps and/or linear terms
veb_boost_stumps(
X,
Y,
X_test = NULL,
learners = NULL,
include_linear = NULL,
include_stumps = NULL,
num_cuts = ceiling(min(NROW(Y)/5, max(100, sqrt(NROW(Y))))),
k = 1,
d = 1,
use_quants = TRUE,
scale_X = c("sd", "max", "NA"),
growMode = c("+*", "+", "*", "NA"),
changeToConstant = FALSE,
max_log_prior_var = 0,
use_optim = TRUE,
lin_prior_prob = 0.5,
reverse_learners = FALSE,
nthreads = ceiling(parallel::detectCores(logical = TRUE)/2),
...
)
X |
An (n x p) numeric matrix to be used as the predictors (currently, this wrapper forces all nodes to use the same X) |
Y |
is a numeric vector response |
X_test |
is an optional (m x p) matrix to be used as the testing data. Posterior mean response is saved in the output's field |
learners |
is a list of other learners to be used in |
include_linear |
is a logical of length 1 or p specifying which columns of X we should include as linear terms. If the length is 1, this value gets recycled for all columns of X. If NULL is supplied, then all valid linear terms are used. |
include_stumps |
is a logical of length 1 or p specifying which columns of X we should include as stump terms. If the length is 1, this value gets recycled for all columns of X. If NULL is supplied, then all valid stump terms are used. |
num_cuts |
is a whole number of length 1 or p specifying how many cuts to make when making the stump terms.
If the length is 1, this value gets recycled for all columns of X.
For entries corresponding to the indices where |
k |
is an integer, or a vector of integers of length |
d |
is either an integer, or an integer vector of length |
use_quants |
is a logical indicating whether the cut points should be based on the quantiles of the variable ('use_quants = TRUE') or evenly spaced over the range of the variable ('use_quants = FALSE'). |
scale_X |
is a string specifying whether and how the columns of X should be scaled. 'sd' scales by the standard deviations of the variables. 'max' scales by the maximum absolute value (so variables are on the [-1, +1] scale). 'NA' performs no scaling. |
growMode |
is a string for if the learner should be grown (or not)
If |
changeToConstant |
is a logical for if constant fits should be changed to be constant |
max_log_prior_var |
is a scalar for the maximum that the estimated log-prior variance for each weak learner can be. The idea is that setting this to be small limits the "size" of each weak learner, similar-ish to the learning rate in boosting. The maximum allowed value is 35, which essentially allows the unrestricted MLE to be estimated. The minimum allowed value is -5. |
use_optim |
is a logical. If TRUE, then the prior variance is optimized using the Brent method. If FALSE, then a single EM step is taken to optimize over V |
lin_prior_prob |
is a number between 0 and 1 that gives the prior probability that the effect variable is a linear term. This means that (1 - lin_prior_prob) is the probability that the effect variable is a stump term. Within linear terms and stump terms, all variables have the same prior variance. |
reverse_learners |
is a logical indicating whether the order of learners should be reversed. If FALSE, all additional learners in 'learners' will come first. If TRUE, the stump learners will be first. |
... |
Other arguments to be passed to |
This function performs VEB-Boosting, where the prior to be used is the SER prior, and our predictors are either i) the linear terms of X; ii) the stumps made from the columns of X; or iii) both (i) and (ii).
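As a rough illustration of what "linear terms" and "stumps" mean here, the base-R sketch below builds both kinds of predictor for a single column, with cut points chosen either at quantiles or evenly spaced over the range, mirroring the use_quants argument. The helpers make_cuts and make_stumps are hypothetical names introduced for this sketch, and the stump direction (an indicator of x > cut) is an assumption; the package may construct and store these terms differently.

```r
# Hypothetical helper (not part of the package): choose interior cut points
# for one predictor, either at quantiles or evenly spaced over its range.
make_cuts <- function(x, num_cuts, use_quants = TRUE) {
  interior <- seq_len(num_cuts) + 1  # drop the two end points
  if (use_quants) {
    probs <- seq(0, 1, length.out = num_cuts + 2)[interior]
    cuts <- quantile(x, probs = probs, names = FALSE)
  } else {
    cuts <- seq(min(x), max(x), length.out = num_cuts + 2)[interior]
  }
  unique(cuts)
}

# Hypothetical helper: one indicator column per cut, equal to 1 where x > cut.
# The direction of the inequality is an assumption made for illustration.
make_stumps <- function(x, cuts) {
  outer(x, cuts, FUN = ">") * 1
}

x <- c(0.1, 0.2, 0.3, 0.4, 10)  # one skewed predictor
cuts_q <- make_cuts(x, num_cuts = 3, use_quants = TRUE)   # follows the data density
cuts_e <- make_cuts(x, num_cuts = 3, use_quants = FALSE)  # follows the range

# Linear term plus stump terms for this predictor
features <- cbind(linear = x, make_stumps(x, cuts_q))
```

With a skewed predictor like this one, the quantile cuts fall among the small values where most observations lie, while the evenly spaced cuts are pulled toward the single large value.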
A VEB_Boost_Node object with the fit.
set.seed(1)
n = 1000
p = 1000
X = matrix(runif(n * p), nrow = n, ncol = p)
Y = rnorm(n, 5*sin(3*X[, 1]) + 2*(X[, 2]^2) + 3*X[, 3]*X[, 4])
veb.stumps.fit = veb_boost_stumps(X, Y, include_linear = TRUE, family = "gaussian")
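To see what the defaults imply for data of the size used in the example above, the following base-R sketch evaluates the default num_cuts expression from the usage and reproduces the 'sd' and 'max' options of scale_X as they are described in the arguments. It does not call the package. It uses a smaller p to stay light, and it assumes that scaling divides each column without centering, which may not match the package's internal implementation.

```r
set.seed(1)
n <- 1000
p <- 5  # smaller than in the example above, to keep the sketch light
X <- matrix(runif(n * p), nrow = n, ncol = p)

# Default num_cuts expression from the usage, evaluated for n observations
default_num_cuts <- ceiling(min(n / 5, max(100, sqrt(n))))  # 100 for n = 1000

# scale_X = "sd": divide each column by its standard deviation
# (assumption: no centering is applied)
X_sd <- sweep(X, 2, apply(X, 2, sd), "/")

# scale_X = "max": divide each column by its maximum absolute value,
# so every entry lies in [-1, +1]
X_max <- sweep(X, 2, apply(abs(X), 2, max), "/")
```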