| shortstacking | R Documentation |
Predictions using short-stacking.
shortstacking(
  y,
  X,
  Z = NULL,
  learners,
  sample_folds = 2,
  ensemble_type = "average",
  custom_ensemble_weights = NULL,
  compute_insample_predictions = FALSE,
  subsamples = NULL,
  silent = FALSE,
  progress = NULL,
  auxiliary_X = NULL,
  shortstack_y = y
)
y |
The outcome variable. |
X |
A (sparse) matrix of predictive variables. |
Z |
Optional additional (sparse) matrix of predictive variables. |
learners |
May take one of two forms, depending on whether a single
learner or stacking with multiple learners is used for estimation of the
predictor.
If a single learner is used,
If stacking with multiple learners is used,
Omission of the |
sample_folds |
Number of cross-fitting folds. |
ensemble_type |
Ensemble method to combine base learners into final estimate of the conditional expectation functions. Possible values are:
Multiple ensemble types may be passed as a vector of strings. |
custom_ensemble_weights |
A numerical matrix with user-specified
ensemble weights. Each column corresponds to a custom ensemble
specification, each row corresponds to a base learner in |
compute_insample_predictions |
Indicator equal to 1 if in-sample predictions should also be computed. |
subsamples |
List of vectors with sample indices for cross-fitting. |
silent |
Boolean to silence estimation updates. |
progress |
String to print before learner and cv fold progress. |
auxiliary_X |
An optional list of matrices of length
|
shortstack_y |
Optional vector of the outcome variable to form
short-stacking predictions for. Base learners are always trained on
|
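The two arguments that take structured inputs, subsamples and custom_ensemble_weights, can be built with base R. The following is a minimal sketch based only on the argument descriptions above: it assumes each list element of subsamples holds the row indices of one cross-fitting fold, and that each column of custom_ensemble_weights is one weighting scheme over the base learners. The names n_obs, fold_id, mostly_first and only_second are hypothetical, and making each column sum to one is an illustrative choice rather than a documented requirement.

```r
set.seed(1)

n_obs <- 10          # hypothetical sample size
sample_folds <- 2    # number of cross-fitting folds

# Randomly assign each observation to one fold, then collect the
# row indices of every fold in a list (one vector per fold).
fold_id <- sample(rep(seq_len(sample_folds), length.out = n_obs))
subsamples <- split(seq_len(n_obs), fold_id)

# Two custom ensemble specifications for two base learners:
# rows correspond to learners, columns to specifications.
custom_ensemble_weights <- cbind(
  mostly_first = c(0.75, 0.25),
  only_second  = c(0, 1)
)
```

Objects built this way would then be supplied through the subsamples and custom_ensemble_weights arguments, with the number of rows of the weight matrix matching length(learners).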
shortstacking returns a list containing the following components:
oos_fitted |
A matrix of out-of-sample predictions, each column corresponding to an ensemble type (in chronological order). |
weights |
An array, providing the weight assigned to each base learner (in chronological order) by the ensemble procedures. |
is_fitted |
When compute_insample_predictions = T, a list of matrices with in-sample predictions by sample fold. |
auxiliary_fitted |
When auxiliary_X is not NULL, a list of matrices with additional predictions. |
oos_fitted_bylearner |
A matrix of out-of-sample predictions, each column corresponding to a base learner (in chronological order). |
is_fitted_bylearner |
When compute_insample_predictions = T, a list of matrices with in-sample predictions by sample fold. |
auxiliary_fitted_bylearner |
When auxiliary_X is not NULL, a list of matrices with additional predictions for each learner. |
Note that unlike crosspred, shortstacking always computes
out-of-sample predictions for each base learner (at no additional
computational cost).
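The idea behind short-stacking can be illustrated in base R without the package. The sketch below is not the package's implementation: it uses two hypothetical base learners (OLS on different covariate sets), simulated data, and only two simple weighting rules. Interpreting "singlebest" as "put all weight on the learner with the lowest out-of-sample MSPE" is an assumption made for illustration. It shows why per-learner out-of-sample predictions come at no extra cost: they are the inputs from which the ensemble weights are computed.

```r
set.seed(42)

# Simulated data (hypothetical): the outcome depends on x1 and x2.
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 - x2 + rnorm(n)
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Two illustrative base learners: OLS with different covariate sets.
learner_formulas <- list(y ~ x1, y ~ x1 + x2)

# Assign observations to two cross-fitting folds.
fold_id <- sample(rep(1:2, length.out = n))

# Cross-fitting: train each learner on the other fold and predict
# on the held-out fold, so every prediction is out-of-sample.
oos_bylearner <- matrix(NA_real_, nrow = n, ncol = length(learner_formulas))
for (k in 1:2) {
  held_out <- fold_id == k
  for (j in seq_along(learner_formulas)) {
    fit <- lm(learner_formulas[[j]], data = dat[!held_out, ])
    oos_bylearner[held_out, j] <- predict(fit, newdata = dat[held_out, ])
  }
}

# Short-stacking step: compute the ensemble weights once, on the
# full-sample matrix of cross-fitted predictions.
mspe <- colMeans((y - oos_bylearner)^2)
w_average <- rep(1 / ncol(oos_bylearner), ncol(oos_bylearner))
w_singlebest <- as.numeric(seq_along(mspe) == which.min(mspe))
weights <- cbind(average = w_average, singlebest = w_singlebest)

# One column of ensemble predictions per weighting rule.
oos_fitted <- oos_bylearner %*% weights
```

Here oos_bylearner plays the role of oos_fitted_bylearner and oos_fitted the role of the component of the same name, with one column per weighting rule.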
Ahrens A, Hansen C B, Schaffer M E, Wiemann T (2023). "ddml: Double/debiased machine learning in Stata." https://arxiv.org/abs/2301.09397
Wolpert D H (1992). "Stacked generalization." Neural Networks, 5(2), 241-259.
Other utilities:
crosspred(),
crossval()
# Construct variables from the included Angrist & Evans (1998) data
y = AE98[, "worked"]
X = AE98[, c("morekids", "age","agefst","black","hisp","othrace","educ")]
# Compute predictions using short-stacking with base learners ols and lasso.
# Three ensemble types are computed simultaneously: equally weighted
# (ensemble_type = "average"), MSPE-minimizing with weights in the unit
# simplex (ensemble_type = "nnls1"), and "singlebest". Predictions for each
# base learner are also calculated.
shortstack_res <- shortstacking(y, X,
                                learners = list(list(fun = ols),
                                                list(fun = mdl_glmnet)),
                                ensemble_type = c("average",
                                                  "nnls1",
                                                  "singlebest"),
                                sample_folds = 2,
                                silent = TRUE)
dim(shortstack_res$oos_fitted) # = length(y) by length(ensemble_type)
dim(shortstack_res$oos_fitted_bylearner) # = length(y) by length(learners)
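
A natural follow-up to the dimension checks above is to compare the out-of-sample mean squared prediction error across columns. The helper below is a hypothetical convenience function, not part of the package, and is shown on small stand-in values (y_demo, fitted_demo) so that it runs on its own. Applying it to shortstack_res$oos_fitted or shortstack_res$oos_fitted_bylearner assumes those matrices have one row per element of y, as the comments above indicate.

```r
# Hypothetical helper: mean squared prediction error for each column
# of a prediction matrix, given the outcome vector y.
mspe_by_column <- function(y, fitted) {
  fitted <- as.matrix(fitted)
  stopifnot(length(y) == nrow(fitted))
  colMeans((y - fitted)^2)
}

# Stand-in values for illustration only.
y_demo <- c(1, 0, 1, 1)
fitted_demo <- cbind(
  average = c(0.50, 0.50, 0.50, 0.50),
  nnls1   = c(0.75, 0.25, 0.75, 0.75)
)
mspe_demo <- mspe_by_column(y_demo, fitted_demo)
```

With the fitted object from the example, the analogous call would be mspe_by_column(y, shortstack_res$oos_fitted).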