predict_aarr: Use annual average rate of reduction (AARR) to predict...
In caldwellst/augury: Provides Streamlined Methods for Data Imputation and Forecasting for WHO DDI Statistics

predict_aarr

R Documentation

Use annual average rate of reduction (AARR) to predict prevalence

Description

predict_aarr() uses the annual average rate of reduction (AARR) of prevalence data to forecast future prevalence. This is particularly useful when a full time series is not available and each series has only a few data points.
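
As a rough illustration of the idea, the sketch below estimates AARR for a single series from three observations. It assumes the conventional log-linear definition, in which log prevalence is regressed on year and AARR = 1 - exp(slope). The text here does not spell out the estimator that predict_aarr() uses, so treat this as an assumption; the data frame and its values are made up.

```r
# Hypothetical series with only three observed points (illustrative values).
obs <- data.frame(
  year = c(2010, 2014, 2018),
  prevalence = c(40, 34, 29)
)

# Assumed estimator: regress log(prevalence) on year, then
# AARR = 1 - exp(slope). This is the conventional log-linear definition,
# not necessarily the exact code inside predict_aarr().
fit <- lm(log(prevalence) ~ year, data = obs)
aarr <- 1 - exp(unname(coef(fit)["year"]))

# A positive AARR means prevalence is falling on average each year.
print(aarr)
```

With so few points the estimate is sensitive to each observation, which is one reason to restrict the years used via `sort_col_min`.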

Usage

predict_aarr(
  df,
  response,
  sort_col_min = NULL,
  interpolate = FALSE,
  ret = c("df", "all", "error", "model"),
  scale = NULL,
  probit = FALSE,
  test_col = NULL,
  test_period = NULL,
  test_period_flex = NULL,
  group_col = "iso3",
  group_models = TRUE,
  obs_filter = NULL,
  sort_col = "year",
  sort_descending = FALSE,
  pred_col = "pred",
  type_col = NULL,
  types = c("imputed", "projected"),
  source_col = NULL,
  source = NULL,
  scenario_detail_col = NULL,
  scenario_detail = NULL,
  replace_obs = c("missing", "all", "none")
)

Arguments

`df`	Data frame of model data.
`response`	Column name of prevalence variable to be used to calculate AARR.
`sort_col_min`	If provided, a numeric value that sets a minimum value needed to be met in the `sort_col` for an observation to be used in calculating AARR. If `sort_col = "year"` and `sort_col_min = 2008`, then only observations from 2008 onward will be used in calculating AARR.
`interpolate`	Logical value, whether or not to interpolate values based on estimated AARR between observations. Defaults to `FALSE`.
`ret`	Character vector specifying what values the function returns. Defaults to returning a data frame, but can return a vector of model error, the model itself or a list with all 3 as components.
`scale`	Either `NULL` or a numeric value. If a numeric value is provided, the response variable is scaled by that value prior to model fitting and prior to any probit transformation, so it can be used to put the response onto a 0 to 1 scale. Scaling is done by dividing the response by the scale, using the `scale_transform()` function. The response, as well as the fitted values and confidence bounds, are unscaled prior to error calculation and returning to the user.
`probit`	Logical value on whether or not to probit transform the response prior to model fitting. Probit transformation is performed after any scaling determined by `scale` but prior to model fitting. The response, as well as the fitted values and confidence bounds, are untransformed prior to error calculation and returning to the user.
`test_col`	Name of logical column specifying which response values to remove for testing the model's predictive accuracy. If `NULL`, ignored. See `model_error()` for details on the methods and metrics returned.
`test_period`	Length of period to test for RMChE. If `NULL`, beginning and end points of each group in `group_col` are compared. Otherwise, `test_period` must be set to an integer `n` and for each group, comparisons are made between the end point and `n` periods prior.
`test_period_flex`	Logical value indicating whether change error should still be calculated for a point when `test_period` is less than the full length of the series. Defaults to `FALSE`.
`group_col`	Column name(s) of group(s) to use in `dplyr::group_by()` when supplying type, calculating mean absolute scaled error on data involving time series, and if `group_models`, then fitting and predicting models too. If `NULL`, not used. Defaults to `"iso3"`.
`group_models`	Logical, if `TRUE`, fits and predicts models individually onto each `group_col`. If `FALSE`, a general model is fit across the entire data frame.
`obs_filter`	String value of the form "`logical operator` `integer`" that specifies the number of observations required to fit the model and replace observations with predicted values. This is done in conjunction with `group_col`. So, if `group_col = "iso3"` and `obs_filter = ">= 5"`, then for this model, predictions will only be used for `iso3` values that have 5 or more observations. Possible logical operators to use are `>`, `>=`, `<`, `<=`, `==`, and `!=`. If `group_models = FALSE`, then `obs_filter` is only used to determine when predicted values replace observed values but is not used to restrict values from being used in model fitting. If `group_models = TRUE`, then a model is only fit for a group if it meets the `obs_filter` requirements. This provides speed benefits, particularly when running INLA time series using `predict_inla()`.
`sort_col`	Column name of column to arrange data by in `dplyr::arrange()`, prior to filtering for latest contiguous time series and producing the forecast. Not used if `NULL`, defaults to `"year"`.
`sort_descending`	Logical value on whether the sorted values from `sort_col` should be sorted in descending order. Defaults to `FALSE`.
`pred_col`	Column name to store predicted value.
`type_col`	Column name specifying data type.
`types`	Types to add to missing values. The first value is for imputed values and the second is for extrapolated values.
`source_col`	Column name containing source information for the data frame. If provided, the argument in `source` is used to fill in where predictions have filled in missing data.
`source`	Source to add to missing values.
`scenario_detail_col`	Column name containing scenario_detail information for the data frame. If provided, the argument in `scenario_detail` is used to fill in where predictions have filled in missing data.
`scenario_detail`	Scenario details to add to missing values (usually the name of the model being used to generate the projection, optionally with relevant parameters).
`replace_obs`	Character value specifying how, if at all, observations should be replaced by fitted values. Defaults to replacing only missing values, but can be used to replace all values or none.
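
The order of operations described for `scale` and `probit` can be sketched in base R. This sketch assumes the probit transform is the standard normal quantile function (`qnorm()`), with `pnorm()` as its inverse. It does not call augury's `scale_transform()`, whose internals are not shown on this page, and the response values are made up.

```r
# Illustrative percentages; values are made up.
response <- c(12.5, 40, 87.5)
scale <- 100

# Step 1: scale onto 0 to 1 by dividing the response by `scale`.
scaled <- response / scale

# Step 2: probit transform (assumed here to be the standard normal quantile).
transformed <- qnorm(scaled)

# After modelling, reverse the steps in the opposite order:
# inverse probit first, then unscale.
recovered <- pnorm(transformed) * scale

print(all.equal(recovered, response))
```

Scaling has to come first because `qnorm()` is only defined for inputs between 0 and 1.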

Details

This function, in its current form, only forecasts data from the last observed data point, as AARR is not ideal for interpolation. The model returned by the function is a dataset of AARR values for each group (or a single value if there are no grouping variables). No confidence bounds are generated by predict_aarr().
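
To make forecasting from the last observed point concrete, here is a minimal base R sketch. It assumes the standard compounding rule, prevalence at year t = last observed prevalence * (1 - AARR)^(t - last observed year), and a hypothetical AARR of 0.05. The helper `project_from_last()` is invented for illustration and is not part of augury.

```r
# Hypothetical helper, not part of augury: carry a series forward from its
# last observed value using an assumed compounding rule.
project_from_last <- function(year, value, aarr) {
  observed <- !is.na(value)
  last_year <- max(year[observed])
  last_value <- value[year == last_year]
  pred <- value
  # Only years after the last observation are filled; gaps between
  # observations stay NA, since no interpolation is done here.
  future <- year > last_year
  pred[future] <- last_value * (1 - aarr)^(year[future] - last_year)
  pred
}

year <- 2016:2021
value <- c(30, NA, 28, NA, NA, NA)
pred <- project_from_last(year, value, aarr = 0.05)
print(data.frame(year, value, pred))
```

In this sketch the gap at 2017 is left missing while 2019 to 2021 are projected from the 2018 value, and no uncertainty interval is attached to any projected value.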

Value

Depending on the value passed to ret, either a data frame with predicted data, a vector of errors from model_error(), a fitted model, or a list with all 3.

caldwellst/augury documentation built on Oct. 10, 2024, 8:20 a.m.
