getStackingWeights: Get Ensemble Weights via Stacking

Description

Computes, via stacking, the weights used to combine the individual models into the final ensemble.

Usage

getStackingWeights(data, fits, imputationParameters)

Arguments

data

The data object containing the observations to impute.

fits

A list of the fitted values. These may be estimated via leave-one-out cross-validation or computed directly.

imputationParameters

A list of the parameters for the imputation algorithms. See defaultImputationParameters() for a starting point.

Details

Traditional stacking proceeds as follows: for each individual byKey (usually country), a set of weights is chosen for the ensemble. The weights are constrained to be positive and to sum to 1, and the final ensemble is constructed as sum(w_i*model_i). The weights should be chosen so that better models are given more importance. Thus, the following criterion is minimized:

sum(errorFunction(|y - sum(w_i*model_i)|))

i.e. the errorFunction applied to the difference between the observed values and the ensemble estimate. The errorFunction is typically just x^2, but could be a more complex function.
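The criterion above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data, the names toWeights and stackingLoss, and the softmax reparameterization (used here to keep the weights positive and summing to 1) are all hypothetical choices made for the example.

```r
# Hypothetical toy data: observed values y and fitted values from three models.
set.seed(1)
y <- c(10, 12, 11, 13, 15, 14, 16, 18)
fits <- cbind(
  modelA = y + rnorm(length(y), sd = 0.2),  # accurate model
  modelB = y + rnorm(length(y), sd = 2),    # noisy model
  modelC = rep(mean(y), length(y))          # constant (mean) model
)

# Typical choice of error function named in the text.
errorFunction <- function(x) x^2

# Assumption: map unconstrained parameters to weights that are
# positive and sum to 1 (softmax transform).
toWeights <- function(theta) {
  w <- exp(theta - max(theta))
  w / sum(w)
}

# Criterion: sum over observations of errorFunction(|y - ensemble estimate|).
stackingLoss <- function(theta) {
  w <- toWeights(theta)
  sum(errorFunction(abs(y - fits %*% w)))
}

# Minimize the criterion, starting from equal weights.
opt <- optim(rep(0, ncol(fits)), stackingLoss, method = "BFGS")
weights <- setNames(toWeights(opt$par), colnames(fits))

# Final ensemble: sum(w_i * model_i) for each observation.
ensemble <- as.vector(fits %*% weights)
```

With eight observations and three models the problem is well determined; the sparse cases discussed next are where this direct approach breaks down.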

However, this optimization is roughly equivalent to a regression with a constraint added. In some cases our datasets will be so sparse that it cannot be performed: with only four observations and seven valid models, for example, there is no unique solution. Thus, we instead use a LASSO regression to compute the stacking weights.
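To make the LASSO idea concrete, here is a minimal base R sketch for the sparse case mentioned above (four observations, seven models). It is not the package's code: the function nonNegLasso, the fixed lambda, the non-negativity restriction, the absence of an intercept, and the final rescaling of coefficients to sum to 1 are all assumptions made for illustration.

```r
# Hypothetical sparse case: 4 observations, 7 candidate models.
set.seed(42)
y <- c(100, 104, 109, 115)
fits <- sapply(1:7, function(j) y + rnorm(length(y), sd = j))
colnames(fits) <- paste0("model", 1:7)

# Non-negative LASSO via cyclic coordinate descent (no intercept).
# Objective: (1 / (2n)) * sum((y - X %*% b)^2) + lambda * sum(b), with b >= 0.
nonNegLasso <- function(X, y, lambda, nIter = 500) {
  n <- nrow(X)
  b <- rep(0, ncol(X))
  for (iter in seq_len(nIter)) {
    for (j in seq_len(ncol(X))) {
      # Residual with model j's contribution left out.
      partialResid <- y - X[, -j, drop = FALSE] %*% b[-j]
      rho <- sum(X[, j] * partialResid) / n
      # One-sided soft threshold keeps the coefficient non-negative.
      b[j] <- max(0, rho - lambda) / (sum(X[, j]^2) / n)
    }
  }
  b
}

coefs <- nonNegLasso(fits, y, lambda = 1)
if (sum(coefs) == 0) stop("lambda is too large: all coefficients are zero")

# Assumption: rescale the coefficients so the weights sum to 1.
weights <- setNames(coefs / sum(coefs), colnames(fits))
```

The L1 penalty gives the underdetermined problem a well-defined objective and tends to set some weights exactly to zero. In practice the penalty strength would normally be tuned rather than fixed as it is here.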

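Note: if errorType is not "loocv", then stacking will be problematic. Fitted values computed directly on the full data favor flexible models (such as loess or splines), which track the observations closely whether or not they predict well, so these models will receive much higher weights without valid reason.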
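The effect described in this note can be seen in a small base R comparison. The data and the two candidate models (a straight line and a degree-6 polynomial standing in for a flexible smoother) are hypothetical and unrelated to the package's own model set.

```r
# Hypothetical series with a linear trend plus noise.
set.seed(7)
x <- 1:12
y <- 50 + 2 * x + rnorm(length(x), sd = 3)
dat <- data.frame(x = x, y = y)

# Two illustrative candidate models: a straight line and a flexible polynomial.
fitLinear   <- function(d) lm(y ~ x, data = d)
fitFlexible <- function(d) lm(y ~ poly(x, 6), data = d)

# Fitted values computed directly on the full data.
directFit <- function(fitFun) as.vector(predict(fitFun(dat), newdata = dat))

# Fitted values computed by leave-one-out cross-validation:
# each observation is predicted from a model fitted without it.
loocvFit <- function(fitFun) {
  sapply(seq_len(nrow(dat)), function(i) {
    predict(fitFun(dat[-i, ]), newdata = dat[i, , drop = FALSE])
  })
}

mse <- function(fitted) mean((dat$y - fitted)^2)

errors <- rbind(
  direct = c(linear   = mse(directFit(fitLinear)),
             flexible = mse(directFit(fitFlexible))),
  loocv  = c(linear   = mse(loocvFit(fitLinear)),
             flexible = mse(loocvFit(fitFlexible)))
)
print(errors)
```

In the direct row the flexible model can never look worse than the linear one, because the linear fit is a special case of it. The loocv row removes that built-in advantage, which is why weights derived from leave-one-out errors are more trustworthy.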

Value

A data.table containing the weight for each model within each byKey group, as well as a few other (currently unused) statistics.


SWS-Methodology/faoswsImputation documentation built on May 9, 2019, 11:48 a.m.