Evaluates a data stack by fitting a regularized model on the assessment predictions from each candidate member to predict the true outcome.
This process determines the "stacking coefficients" of the model stack. The stacking coefficients are used to weight the predictions from each candidate (represented by a unique column in the data stack), and are given by the betas of a LASSO model fitting the true outcome with the predictions given in the remaining columns of the data stack.
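As a rough illustration of the idea, the sketch below fits a non-negative LASSO by coordinate descent in base R on a simulated data stack. It is not the package's implementation: the helper `nonneg_lasso()`, the simulated columns, and the penalty values are all hypothetical and exist only to show how a penalty can push the weights of weak candidates to zero.

```r
# Simulate a "data stack": the true outcome plus one column of
# assessment-set predictions per candidate (all values are made up).
set.seed(1)
n <- 200
truth <- rnorm(n)
data_stack <- cbind(
  cand_good  = truth + rnorm(n, sd = 0.3),  # accurate candidate
  cand_fair  = truth + rnorm(n, sd = 1.0),  # noisier candidate
  cand_noise = rnorm(n)                     # uninformative candidate
)

# Hypothetical helper: minimizes
#   (1 / (2n)) * sum((y - b0 - x %*% beta)^2) + penalty * sum(beta)
# subject to beta >= 0, using cyclic coordinate descent.
nonneg_lasso <- function(x, y, penalty, n_iter = 200) {
  beta <- setNames(numeric(ncol(x)), colnames(x))
  intercept <- mean(y)
  for (iter in seq_len(n_iter)) {
    for (j in seq_len(ncol(x))) {
      # residual with candidate j's own contribution left out
      partial_resid <- y - intercept - drop(x[, -j, drop = FALSE] %*% beta[-j])
      z <- mean(x[, j] * partial_resid)
      # soft-threshold by the penalty, then clamp at zero
      beta[j] <- max(0, z - penalty) / mean(x[, j]^2)
    }
    intercept <- mean(y - drop(x %*% beta))
  }
  list(intercept = intercept, coefs = beta)
}

fit_low  <- nonneg_lasso(data_stack, truth, penalty = 0.01)
fit_high <- nonneg_lasso(data_stack, truth, penalty = 0.5)

round(fit_low$coefs, 3)
round(fit_high$coefs, 3)

# candidates that keep a non-zero weight at the lower penalty
names(which(fit_low$coefs > 0))
```

Comparing the two fits shows the trade-off: a larger penalty shrinks every weight and can drop candidates entirely, while the clamp at zero keeps all remaining weights non-negative.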
Candidates with non-zero stacking coefficients are model stack
members, and need to be trained on the full training set (rather
than just the assessment set) with
fit_members(). This function
is typically used after a number of calls to
A numeric vector of proposed values for the total amount of
regularization used in member weighting. Higher penalties will generally
result in fewer members being included in the resulting model stack, and
vice versa. The package will tune over a grid formed from the cross
product of the
A number between zero and one (inclusive) giving the
proportion of L1 regularization (i.e. lasso) in the model.
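To make the tuning grid concrete, here is a small base R sketch. The sentence describing the cross product is cut off in this copy, so pairing every penalty value with every mixture value is an assumption, and the values below are illustrative only. The helper `elastic_net_penalty()` is hypothetical; it writes out the usual elastic-net penalty term (the parameterisation used by glmnet) to show how the two arguments interact.

```r
# Illustrative values only
penalty <- 10^(-6:-1)   # total amount of regularization
mixture <- c(0, 0.5, 1) # proportion of L1 (lasso) regularization

# Every penalty value paired with every mixture value
blend_grid <- expand.grid(penalty = penalty, mixture = mixture)
nrow(blend_grid)
head(blend_grid)

# Hypothetical helper: the elastic-net penalty term for coefficients `beta`.
# mixture = 1 gives a pure lasso (L1) penalty, mixture = 0 a pure ridge (L2) one.
elastic_net_penalty <- function(beta, penalty, mixture) {
  penalty * (mixture * sum(abs(beta)) + (1 - mixture) / 2 * sum(beta^2))
}

elastic_net_penalty(c(1, -2), penalty = 1, mixture = 1)
elastic_net_penalty(c(1, -2), penalty = 1, mixture = 0)
```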
A logical giving whether to restrict stacking
coefficients to non-negative values. If
A call to
An object inheriting from
Additional arguments. Currently ignored.
Note that a regularized linear model is one of many possible
learning algorithms that could be used to fit a stacked ensemble
model. For implementations of additional ensemble learning algorithms, see
While model_stacks largely contain the
same elements as
data_stacks, the primary data objects shift from the
assessment set predictions to the member models.
This package provides some resampling objects and datasets for use in examples and vignettes derived from a study on 1212 red-eyed tree frog embryos!
Red-eyed tree frog (RETF) embryos can hatch earlier than their normal 7ish days if they detect a potential predator threat. Researchers wanted to determine how, and when, these tree frog embryos were able to detect stimulus from their environment. To do so, they subjected the embryos at varying developmental stages to "predator stimulus" by jiggling the embryos with a blunt probe. Beforehand, though, some of the embryos were treated with gentamicin, a compound that knocks out their lateral line (a sensory organ). Researcher Julie Jung and her crew found that these factors inform whether an embryo hatches prematurely or not!
Note that the data included with the stacks package is not necessarily a representative or unbiased subset of the complete dataset, and is only for demonstrative purposes.
rset cross-fold validation objects from
rsample, splitting the training data into folds for the regression
and classification model objects, respectively.
tree_frogs_class_test are the analogous testing sets.
reg_res_lr, reg_res_svm, and reg_res_sp contain regression tuning results
for a linear regression, support vector machine, and spline model, respectively,
fitting latency (i.e. how long the embryos took to hatch in response
to the jiggle) in the
tree_frogs data, using almost all of the other
variables as predictors. Note that the data underlying these models is
filtered to include data only from embryos that hatched in response to
the stimulus.
class_res_rf and class_res_nn contain multiclass classification tuning
results for a random forest and neural network classification model,
respectively, fitting reflex (a measure of ear function) in the
data using almost all of the other variables as predictors.
log_res_rf and log_res_nn contain binary classification tuning results
for a random forest and neural network classification model, respectively,
fitting hatched (whether or not the embryos hatched in response
to the stimulus) using almost all of the other variables as predictors.
See ?example_data to learn more about these objects, as well as browse
the source code that generated them.
# see the "Example Data" section above for
# clarification on the objects used in these examples!

# put together a data stack
reg_st <-
  stacks() %>%
  add_candidates(reg_res_lr) %>%
  add_candidates(reg_res_svm) %>%
  add_candidates(reg_res_sp)

reg_st

# evaluate the data stack
reg_st %>%
  blend_predictions()

# include fewer models by proposing higher penalties
reg_st %>%
  blend_predictions(penalty = c(.5, 1))

# allow for negative stacking coefficients
# with the non_negative argument
reg_st %>%
  blend_predictions(non_negative = FALSE)

# use a custom metric in tuning the lasso penalty
library(yardstick)
reg_st %>%
  blend_predictions(metric = metric_set(rmse))

# pass control options for stack blending
reg_st %>%
  blend_predictions(
    control = tune::control_grid(allow_par = TRUE)
  )

# the process looks the same with
# multinomial classification models
class_st <-
  stacks() %>%
  add_candidates(class_res_nn) %>%
  add_candidates(class_res_rf) %>%
  blend_predictions()

class_st

# ...or binomial classification models
log_st <-
  stacks() %>%
  add_candidates(log_res_nn) %>%
  add_candidates(log_res_rf) %>%
  blend_predictions()

log_st
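Because blending only chooses the stacking coefficients, the selected members still have to be fitted before the stack can predict. The sketch below continues the regression example through fit_members() and predict(). It assumes the stacks package and the engines behind each candidate (for example, kernlab for the support vector machine) are installed, and that tree_frogs_reg_test is the regression counterpart of tree_frogs_class_test.

```r
library(stacks)

# blend the candidates, then fit the members that received
# non-zero stacking coefficients on the full training set
reg_model_st <-
  stacks() %>%
  add_candidates(reg_res_lr) %>%
  add_candidates(reg_res_svm) %>%
  add_candidates(reg_res_sp) %>%
  blend_predictions() %>%
  fit_members()

reg_model_st

# predict on the held-out regression test set
# (tree_frogs_reg_test is assumed to ship with the package)
reg_preds <- predict(reg_model_st, new_data = tree_frogs_reg_test)
head(reg_preds)
```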