predict.mgm: Compute predictions from mgm model objects


predict.mgm R Documentation

Compute predictions from mgm model objects

Description

Computes predictions and prediction errors from an mgm model object (mgm, mvar, tvmgm or tvmvar).

Usage

## S3 method for class 'mgm'
predict(object, data, errorCon, errorCat, 
        tvMethod, consec, beepvar, dayvar, errordecimals=3, 
        ...)

Arguments

object

An mgm model object (the output of one of the functions mgm(), mvar(), tvmgm() or tvmvar())

data

An n x p data matrix with the same structure (number of variables p and types of variables) as the data used to fit the model.

errorCon

Either a character vector specifying the types of nodewise errors that should be computed, or a list of error functions (see below). The two provided error functions for continuous variables are errorCon = "RMSE", the Root Mean Squared Error, and errorCon = "R2", the proportion of explained variance. The default is errorCon = c("RMSE", "R2").

Alternatively, errorCon can be a list, where each list entry is a custom error function of the form foo(true, pred), where true and pred are the arguments for the vectors of true and predicted values, respectively. If predictions are made for a time-varying model and tvMethod = "weighted", the weighted R2 or RMSE are computed. If a custom function is used in that case, it must take an additional argument for the weights: foo(true, pred, weights). Note that custom error functions can also be combined with the built-in functions, e.g. errorCon = list("RMSE", "CustomError"=foo).
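As an illustration of the foo(true, pred) and foo(true, pred, weights) forms described above, here is a minimal base-R sketch of a custom error function. The mean absolute error, the names mae and mae_weighted, and the toy vectors are hypothetical choices for this example and are not part of the mgm package.

```r
# Hypothetical custom error: mean absolute error (MAE), in the foo(true, pred) form.
mae <- function(true, pred) {
  mean(abs(true - pred))
}

# Weighted variant with the extra `weights` argument, as needed when
# tvMethod = "weighted" is used with a custom function.
mae_weighted <- function(true, pred, weights) {
  sum(weights * abs(true - pred)) / sum(weights)
}

# Toy data (made up for illustration)
true <- c(1, 2, 3, 4)
pred <- c(1, 2, 5, 4)

mae(true, pred)                                    # 0.5
mae_weighted(true, pred, weights = c(1, 1, 0, 1))  # 0: the only wrong case has weight 0

# Following the form documented above, such a function would then be passed as
# errorCon = list("RMSE", "MAE" = mae)
```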

errorCat

Either a character vector specifying the types of nodewise errors that should be computed, or a list of error functions (see below). The provided error functions for categorical variables are errorCat = "CC", the proportion of correct classification (accuracy), and errorCat = "nCC", the proportion of correct classification normalized by the marginal distribution of the variable at hand. Specifically, nCC = (CC - norm_constant) / (1 - norm_constant), where norm_constant is the highest relative frequency across categories. Another provided error is "CCmarg", which returns the accuracy of the intercept/marginal model. The default is to return all types of errors, errorCat = c("CC", "nCC", "CCmarg").

Alternatively, errorCat can be a list, where each list entry is a custom error function of the form foo(true, pred), where true and pred are the arguments for the vectors of true and predicted values, respectively. If predictions are made for a time-varying model and tvMethod = "weighted", weighted versions of the errors are computed. If a custom function is used in that case, it must take an additional argument for the weights: foo(true, pred, weights). Note that custom error functions can also be combined with the built-in functions, e.g. errorCat = list("nCC", "CustomError"=foo).
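The nCC formula given for this argument can be worked through by hand. The following base-R sketch is an illustrative reimplementation, not the package's internal code; it assumes that norm_constant is taken from the observed values of the variable, and the function names cc and ncc and the toy vectors are hypothetical.

```r
# Proportion of correct classification (accuracy)
cc <- function(true, pred) {
  mean(true == pred)
}

# Normalized accuracy: (CC - norm_constant) / (1 - norm_constant),
# with norm_constant the highest relative frequency across categories
# (assumed here to be computed from the observed values).
ncc <- function(true, pred) {
  norm_constant <- max(table(true)) / length(true)
  (cc(true, pred) - norm_constant) / (1 - norm_constant)
}

# Toy data: category 1 has the highest relative frequency, 3/6 = 0.5
true      <- c(1, 1, 1, 2, 2, 3)
pred_good <- c(1, 1, 1, 2, 2, 1)  # CC = 5/6
pred_bad  <- c(2, 2, 1, 1, 2, 1)  # CC = 2/6, worse than always predicting 1

ncc(true, pred_good)  # (5/6 - 1/2) / (1/2) = 2/3
ncc(true, pred_bad)   # (2/6 - 1/2) / (1/2) = -1/3
```

The second call shows how nCC becomes negative when predictions are less accurate than always predicting the most frequent category.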

tvMethod

Specifies how predictions and errors are computed for time-varying models: tvMethod = "weighted" computes errors by computing a weighted error over all cases in the time series at each estimation point specified in estpoints in tvmgm() or tvmvar(). The weighting corresponds to the weighting used for estimation (see ?tvmgm or ?tvmvar). tvMethod = "closestModel" determines for each time point the closest model and uses that model for prediction. See Details below for a more detailed explanation.

consec

Only relevant for (time-varying) mVAR models fitted to time series with unequal time intervals. An integer vector of length nrow(data), indicating the sequence of measurement points in a time series. Defaults to consec = NULL, which assumes equal time intervals. consec is ignored if an mgm or tvmgm object is provided to predict.mgm(). For details see ?mvar.

beepvar

Together with the argument dayvar, this argument is an alternative to the consec argument (see above) for specifying the consecutiveness of measurements. It is tailored to ecological momentary assessment (EMA) studies, where consecutiveness is defined by the number of the notification on a given day (beepvar) and the given day (dayvar).

dayvar

See beepvar.
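To make the idea of consecutiveness concrete, the sketch below marks which rows of a toy EMA series have a directly preceding measurement. It assumes that two rows count as consecutive when they fall on the same day and their beep numbers differ by exactly one; this is one plausible reading of the description above, not the package's internal logic, and the toy vectors and the name has_lag1 are made up.

```r
# Toy EMA design: day index and beep (notification) number for each row.
# On day 1 the third beep was missed, so beeps 2 and 4 are not consecutive.
dayvar  <- c(1, 1, 1, 1, 2, 2, 2)
beepvar <- c(1, 2, 4, 5, 1, 2, 3)

# A row has a usable lag-1 predecessor if the previous row is on the same
# day and exactly one beep earlier (assumed rule). The first row never does.
has_lag1 <- c(FALSE, diff(dayvar) == 0 & diff(beepvar) == 1)

which(has_lag1)  # rows 2, 4, 6, 7
```

Under this rule, only the marked rows could be predicted by a lag-1 VAR model; rows that follow a gap or start a new day would be excluded.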

errordecimals

Number of decimals to which predictability / prediction error values are rounded. Defaults to errordecimals = 3.

...

Additional arguments.

Details

Nodewise errors in time-varying models can be computed in two different ways. In the first (tvMethod = "weighted"), one computes the predicted value for each of the N cases in the time series for all models (estimated at different estimation points, see ?tvmgm or ?tvmvar). Then the error of each of the N cases for each of the models is weighted by the weight that has been used to estimate a given model at its estimation point. This means that the error of a data point close to the end of a time series gets a high weight for models estimated at the end of the time series and a low weight for models estimated at the beginning of the time series.
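The weighting scheme just described can be sketched in base R as follows. This is an illustration under stated assumptions: time is scaled to [0, 1], the weights come from a Gaussian kernel centred on each estimation point, and the values of estpoints and bandwidth are made up. For simplicity the same predictions are used for both estimation points, whereas in practice each model yields its own predictions.

```r
# Weighted RMSE in the foo(true, pred, weights) form
weighted_rmse <- function(true, pred, weights) {
  sqrt(sum(weights * (true - pred)^2) / sum(weights))
}

n          <- 10
timepoints <- seq(0, 1, length.out = n)  # assumed: time scaled to [0, 1]
estpoints  <- c(0.1, 0.9)                # hypothetical estimation points
bandwidth  <- 0.2                        # hypothetical kernel bandwidth

# Toy data: every case is predicted perfectly except the last one
true <- rep(0, n)
pred <- c(rep(0, n - 1), 2)

# One weighted error per estimation point; weights are an assumed Gaussian kernel
errs <- sapply(estpoints, function(ep) {
  w <- dnorm(timepoints, mean = ep, sd = bandwidth)
  weighted_rmse(true, pred, w)
})

errs  # the late estimation point (0.9) has the larger error
```

Because the only mispredicted case lies at the end of the series, it contributes far more to the error at the late estimation point than at the early one.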

In the second (tvMethod = "closestModel"), we determine for each case in the time series the closest estimation point, and use the model estimated at that estimation point to make predictions for that case.
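Finding the closest estimation point for each case can be sketched in a few lines of base R. The example assumes that both the time points of the cases and the estimation points are expressed on a common [0, 1] scale; the specific values are made up and the code is not taken from the package.

```r
# Toy setup: 7 equally spaced cases and 3 estimation points on [0, 1]
timepoints <- seq(0, 1, length.out = 7)
estpoints  <- c(0, 0.5, 1)  # hypothetical estimation points

# For each case, the index of the estimation point with the smallest
# absolute distance in time
closest <- sapply(timepoints, function(t) which.min(abs(t - estpoints)))

closest  # 1 1 2 2 2 3 3
```

Each case would then be predicted with the model whose index appears in closest at that position.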

Note that the error function normalized accuracy (nCC) is negative if the full model performs worse than the intercept model. This can happen if the model overfits the data.

Value

A list with the following entries:

call

Contains all provided input arguments.

predicted

An n x p matrix with predicted values, matching the dimension of the true values in true.

probabilities

A list with p entries corresponding to p nodes in the data. If a variable is categorical, the corresponding entry contains a n x k matrix with predicted probabilities, where k is the number of categories of the categorical variable. If a variable is continuous, the corresponding entry is empty.

true

Contains the true values. For mgm and tvmgm objects these are equal to the data provided via data. For mvar and tvmvar objects, these are equal to the rows that can be predicted in a VAR model, depending on the largest specified lag and (if specified) the consec argument.

errors

A matrix containing all types of errors specified via errorCon and errorCat, for each variable. If tvMethod = "weighted", the matrix becomes an array, with an additional dimension for the estimation point.

tverrors

If tvMethod = "weighted", this list entry contains a list of errors in the same format as errors, separately for each estimation point. The errors are computed from predictions of the model at the given estimation point and weighted by the weight vector at that estimation point. If tvMethod = "closestModel", this entry is empty.

Author(s)

Jonas Haslbeck <jonashaslbeck@gmail.com>

References

Haslbeck, J. M. B., & Waldorp, L. J. (2020). mgm: Estimating time-varying Mixed Graphical Models in high-dimensional Data. Journal of Statistical Software, 93(8), pp. 1-46. DOI: 10.18637/jss.v093.i08

Examples


## Not run: 
# See examples in ?mgm, ?tvmgm, ?mvar and ?tvmvar.

## End(Not run)


jmbh/mgm documentation built on Nov. 17, 2023, 9:20 a.m.