model_error: Get modeling error from a data frame
In caldwellst/augury: Provides Streamlined Methods for Data Imputation and Forecasting for WHO DDI Statistics

model_error

R Documentation

Get modeling error from a data frame

Description

model_error() calculates modeling error using observed and fitted values from the data frame. If test_col is provided, the error is calculated only on observations that were excluded from modeling for testing purposes. Otherwise, the error is calculated for all non-missing values.
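
The row selection described above can be sketched in base R. This is a minimal illustration, not augury's implementation: the helpers `error_rows()` and `point_errors()` and the column names `value` and `test` are hypothetical, and only the three simplest metrics (RMSE, MAE, MdAE) are computed.

```r
# Toy data: `value` is observed, `pred` is fitted, `test` flags held-out rows.
df <- data.frame(
  value = c(10, 12, 15, NA, 20, 24),
  pred  = c(11, 12, 14, 17, 22, 21),
  test  = c(FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
)

# Hypothetical helper (not part of augury): select the rows the error uses.
# Without test_col, every row with a non-missing observation and fit is kept;
# with test_col, only rows flagged TRUE in that column are kept.
error_rows <- function(df, response, test_col = NULL, pred_col = "pred") {
  keep <- !is.na(df[[response]]) & !is.na(df[[pred_col]])
  if (!is.null(test_col)) {
    keep <- keep & df[[test_col]] %in% TRUE
  }
  df[keep, , drop = FALSE]
}

# Hypothetical helper: point error metrics from observed and fitted values.
point_errors <- function(obs, fit) {
  err <- obs - fit
  c(RMSE = sqrt(mean(err^2)), MAE = mean(abs(err)), MdAE = median(abs(err)))
}

all_rows  <- error_rows(df, "value")
test_rows <- error_rows(df, "value", test_col = "test")

errors_all  <- point_errors(all_rows$value, all_rows$pred)   # 5 rows used
errors_test <- point_errors(test_rows$value, test_rows$pred) # 2 rows used
```

In this toy data the test-only errors are larger than the all-rows errors, which is the typical reason to report held-out error separately.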

Usage

model_error(
  df,
  response,
  test_col = NULL,
  test_period = NULL,
  test_period_flex = FALSE,
  group_col = NULL,
  sort_col = NULL,
  sort_descending = FALSE,
  pred_col = "pred",
  pred_upper_col = "pred_upper",
  pred_lower_col = "pred_lower"
)

Arguments

`df`	Data frame of model data.
`response`	Column name of response variable.
`test_col`	Name of logical column specifying which response values to remove for testing the model's predictive accuracy. If `NULL`, ignored. See Details for the metrics returned.
`test_period`	Length of period to test for RMChE. If `NULL`, beginning and end points of each group in `group_col` are compared. Otherwise, `test_period` must be set to an integer `n` and for each group, comparisons are made between the end point and `n` periods prior.
`test_period_flex`	Logical value indicating whether change error should still be calculated for a group when the full length of its series is less than `test_period`. Defaults to `FALSE`.
`group_col`	Column name(s) of group(s) to use in `dplyr::group_by()` when supplying type, calculating mean absolute scaled error on data involving time series, and if `group_models`, then fitting and predicting models too. If `NULL`, not used. Defaults to `NULL`.
`sort_col`	Column name(s) to use to `dplyr::arrange()` the data prior to supplying type and calculating mean absolute scaled error on data involving time series. If `NULL`, not used. Defaults to `NULL`.
`sort_descending`	Logical value on whether the sorted values from `sort_col` should be sorted in descending order. Defaults to `FALSE`.
`pred_col`	Column name to store predicted value.
`pred_upper_col`	Column name to store upper bound of confidence interval generated by the `predict_...` function. This stores the full set of generated values for the upper bound.
`pred_lower_col`	Column name to store lower bound of confidence interval generated by the `predict_...` function. This stores the full set of generated values for the lower bound.

Details

The error metrics generated from model_error() are the following:

RMSE: root mean squared error
MAE: mean absolute error
MdAE: median absolute error
MASE: mean absolute scaled error. Only calculated if test_col is provided, as it is test error scaled by in-sample error.
CBA: confidence bound accuracy, % of observations lying within the confidence bounds. Should be very near to 95%. Only calculated if both pred_upper_col and pred_lower_col are provided.
R2: R-squared or coefficient of determination. Calculated only on test values if test_col is provided. Due to the variety of models available within augury, as well as the predict_..._avg_trend() functions, adjusted R-squared is not currently available.
COR: Pearson correlation coefficient of fitted values to observations. Useful as a measure of general trend matching beyond the point error measurements used above. If group_col is provided, correlation coefficients are calculated within each group and the average across all groups is returned. Calculated on all data, but be careful in interpreting when applied to non-time series data.
RMChE: root mean change error. Since the GPW13 infilling and projections are designed to estimate change over time, RMChE measures the accuracy of this change. It is calculated as the difference between observed change between two time periods and predicted change across those same time periods. If test_period is NULL, the two periods are the beginning and end of each group from group_col, sorted by sort_col. If test_period is provided as an integer n, the change is instead calculated between the end point and n periods prior. test_period_flex determines whether to calculate the change when the full length of the series is less than test_period. If TRUE, the change is again compared between the beginning and end of the series for that group.
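
Three of the metrics above (CBA, COR and RMChE) can be sketched in base R as follows. This is an illustration under stated assumptions rather than augury's actual code: the helpers `change_error()` and `rmche()` are hypothetical, the columns `iso3`, `year` and `value` are example names, and RMChE is assumed to be the square root of the mean squared per-group change error. A group with `test_period` or fewer observations is treated here as too short.

```r
# Toy time series for two groups; all column names are illustrative.
df <- data.frame(
  iso3  = rep(c("AAA", "BBB"), each = 4),
  year  = rep(2000:2003, times = 2),
  value = c(1, 2, 4, 7, 10, 9, 8, 6),
  pred  = c(1, 3, 4, 6, 10, 10, 7, 7),
  pred_lower = c(0, 2.5, 3, 5, 9, 9.5, 6, 5),
  pred_upper = c(2, 4, 5, 8, 11, 11, 8, 8)
)

# CBA: share of observations lying within the confidence bounds.
cba <- mean(df$value >= df$pred_lower & df$value <= df$pred_upper)  # 0.75

# COR: Pearson correlation within each group, then averaged across groups.
groups <- split(df, df$iso3)
cor_by_group <- vapply(groups, function(g) cor(g$value, g$pred), numeric(1))
cor_avg <- mean(cor_by_group)

# Hypothetical helper: observed change minus predicted change for one group.
change_error <- function(g, test_period = NULL, test_period_flex = FALSE) {
  g <- g[order(g$year), ]
  n <- nrow(g)
  if (is.null(test_period)) {
    start <- 1                 # compare beginning and end of the series
  } else if (n > test_period) {
    start <- n - test_period   # compare end point with n periods prior
  } else if (test_period_flex) {
    start <- 1                 # series too short: fall back to full range
  } else {
    return(NA_real_)           # series too short and no fallback requested
  }
  obs_change  <- g$value[n] - g$value[start]
  pred_change <- g$pred[n] - g$pred[start]
  obs_change - pred_change
}

# Hypothetical helper: root mean of squared change errors across groups.
rmche <- function(df, ...) {
  ce <- vapply(split(df, df$iso3), change_error, numeric(1), ...)
  sqrt(mean(ce^2, na.rm = TRUE))
}

rmche_full <- rmche(df)                   # beginning vs end of each group
rmche_two  <- rmche(df, test_period = 2)  # end vs two periods prior
rmche_flex <- rmche(df, test_period = 10, test_period_flex = TRUE)
```

With `test_period = 10` and `test_period_flex = FALSE`, every group in this toy data is too short, so the sketch has no change errors to average and returns `NaN`.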

Value

A named vector of errors: RMSE, MAE, MdAE, MASE, CBA, R2, COR and RMChE.

caldwellst/augury documentation built on Oct. 10, 2024, 8:20 a.m.
