shortstacking | R Documentation |
Predictions using short-stacking.
shortstacking(
  y,
  X,
  Z = NULL,
  learners,
  sample_folds = 2,
  ensemble_type = "average",
  custom_ensemble_weights = NULL,
  compute_insample_predictions = FALSE,
  subsamples = NULL,
  silent = FALSE,
  progress = NULL,
  auxiliary_X = NULL,
  shortstack_y = y
)
y |
The outcome variable. |
X |
A (sparse) matrix of predictive variables. |
Z |
Optional additional (sparse) matrix of predictive variables. |
learners |
May take one of two forms, depending on whether a single
learner or stacking with multiple learners is used for estimation of the
predictor.
If a single learner is used,
If stacking with multiple learners is used,
Omission of the |
sample_folds |
Number of cross-fitting folds. |
ensemble_type |
Ensemble method to combine base learners into final estimate of the conditional expectation functions. Possible values are:
Multiple ensemble types may be passed as a vector of strings. |
custom_ensemble_weights |
A numerical matrix with user-specified
ensemble weights. Each column corresponds to a custom ensemble
specification, each row corresponds to a base learner in |
compute_insample_predictions |
Indicator equal to 1 if in-sample predictions should also be computed. |
subsamples |
List of vectors with sample indices for cross-fitting. |
silent |
Boolean to silence estimation updates. |
progress |
String to print before learner and cv fold progress. |
auxiliary_X |
An optional list of matrices of length
|
shortstack_y |
Optional vector of the outcome variable to form
short-stacking predictions for. Base learners are always trained on
|
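The shapes expected by subsamples and custom_ensemble_weights can be sketched in base R. This is a minimal illustration, not package code: the sample size, the number of learners, the weight values, and the column names are hypothetical, and it assumes that each element of subsamples holds the indices of one cross-fitting fold.

```r
# Hypothetical problem dimensions (not taken from the package).
n <- 100            # number of observations
n_learners <- 2     # number of base learners
sample_folds <- 2   # number of cross-fitting folds

# subsamples: a list with one vector of sample indices per fold.
# Assumption: the folds partition 1:n.
set.seed(42)
fold_id <- sample(rep(seq_len(sample_folds), length.out = n))
subsamples <- unname(split(seq_len(n), fold_id))

# custom_ensemble_weights: one row per base learner, one column per
# custom ensemble specification. Values here are illustrative only.
custom_ensemble_weights <- cbind(
  equal        = c(0.5, 0.5),
  mostly_first = c(0.9, 0.1)
)

str(subsamples)
dim(custom_ensemble_weights)  # n_learners by number of specifications
```

Objects built this way would then be passed through the subsamples and custom_ensemble_weights arguments.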
shortstacking returns a list containing the following components:
oos_fitted
A matrix of out-of-sample predictions, each column corresponding to an ensemble type (in chronological order).
weights
An array, providing the weight assigned to each base learner (in chronological order) by the ensemble procedures.
is_fitted
When compute_insample_predictions = T, a list of matrices with in-sample predictions by sample fold.
auxiliary_fitted
When auxiliary_X is not NULL, a list of matrices with additional predictions.
oos_fitted_bylearner
A matrix of out-of-sample predictions, each column corresponding to a base learner (in chronological order).
is_fitted_bylearner
When compute_insample_predictions = T, a list of matrices with in-sample predictions by sample fold.
auxiliary_fitted_bylearner
When auxiliary_X is not NULL, a list of matrices with additional predictions for each learner.
Note that unlike crosspred, shortstacking always computes out-of-sample predictions for each base learner (at no additional computational cost).
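The idea behind the return values can be sketched in base R without the package. This is a simplified illustration under stated assumptions, not the package's implementation: the two learner functions, the simulated data, and the no-intercept least-squares weights are hypothetical choices made for the example. Each base learner is cross-fitted once, which yields out-of-sample predictions per learner, and the ensemble weights are then estimated a single time on those predictions.

```r
set.seed(1)
n <- 200
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y <- 1 + X[, "x1"] - 0.5 * X[, "x2"] + rnorm(n)

# Hypothetical base learners: fit on training data, predict for new data.
learner_ols <- function(y_tr, X_tr, X_new) {
  fit <- lm.fit(cbind(1, X_tr), y_tr)
  drop(cbind(1, X_new) %*% fit$coefficients)
}
learner_mean <- function(y_tr, X_tr, X_new) rep(mean(y_tr), nrow(X_new))
learners <- list(ols = learner_ols, mean = learner_mean)

# Cross-fitting folds: one vector of hold-out indices per fold.
K <- 2
fold_id <- sample(rep(seq_len(K), length.out = n))
subsamples <- split(seq_len(n), fold_id)

# Out-of-sample predictions, one column per base learner.
oos_bylearner <- matrix(NA_real_, n, length(learners),
                        dimnames = list(NULL, names(learners)))
for (k in seq_len(K)) {
  test <- subsamples[[k]]
  for (j in seq_along(learners)) {
    oos_bylearner[test, j] <- learners[[j]](
      y[-test], X[-test, , drop = FALSE], X[test, , drop = FALSE]
    )
  }
}

# Ensemble weights estimated once on the cross-fitted predictions.
w_average <- rep(1 / length(learners), length(learners))
# Assumption: least squares of y on the predictions, without intercept.
w_ls <- unname(lm.fit(oos_bylearner, y)$coefficients)

# Ensemble predictions, one column per ensemble type.
oos_fitted <- cbind(average = drop(oos_bylearner %*% w_average),
                    ls      = drop(oos_bylearner %*% w_ls))
dim(oos_fitted)     # n by number of ensemble types
dim(oos_bylearner)  # n by number of base learners
```

Because the per-learner predictions are an intermediate product of this procedure, returning them adds no extra model fits.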
Ahrens A, Hansen C B, Schaffer M E, Wiemann T (2023). "ddml: Double/debiased machine learning in Stata." https://arxiv.org/abs/2301.09397
Wolpert D H (1992). "Stacked generalization." Neural Networks, 5(2), 241-259.
Other utilities: crosspred(), crossval()
# Construct variables from the included Angrist & Evans (1998) data
y = AE98[, "worked"]
X = AE98[, c("morekids", "age","agefst","black","hisp","othrace","educ")]
# Compute predictions using shortstacking with base learners ols and lasso.
# Three stacking approaches are simultaneously computed: Equally
# weighted (ensemble_type = "average"), MSPE-minimizing with weights
# in the unit simplex (ensemble_type = "nnls1"), and selection of the
# single learner with the lowest MSPE (ensemble_type = "singlebest").
# Predictions for each learner are also calculated.
shortstack_res <- shortstacking(y, X,
                                learners = list(list(fun = ols),
                                                list(fun = mdl_glmnet)),
                                ensemble_type = c("average",
                                                  "nnls1",
                                                  "singlebest"),
                                sample_folds = 2,
                                silent = TRUE)
dim(shortstack_res$oos_fitted) # = length(y) by length(ensemble_type)
dim(shortstack_res$oos_fitted_bylearner) # = length(y) by length(learners)