merge_average_df: Merge average df with predictions with original data frame

View source: R/utils_average_trend.R

merge_average_df    R Documentation

Description

Merges the data frame of average trends and predictions back into the original data frame.

Usage

merge_average_df(
  avg_df,
  df,
  response,
  average_cols,
  group_col,
  obs_filter,
  sort_col,
  pred_col,
  pred_upper_col,
  pred_lower_col,
  test_col
)

Arguments

avg_df

Data frame with average trends.

df

Data frame of model data.

response

Column name of response variable.

average_cols

Column name(s) of column(s) for use in grouping data for averaging, such as regions. If missing, uses global average of the data for infilling.

group_col

Column name(s) of group(s) to use in dplyr::group_by() when supplying type, when calculating mean absolute scaled error on data involving time series, and, if group_models is TRUE, when fitting and predicting models. If NULL, not used. Defaults to "iso3".

obs_filter

String value of the form "logical operator integer" that specifies the number of observations required to fit the model and replace observations with predicted values. This is done in conjunction with group_col. So, if group_col = "iso3" and obs_filter = ">= 5", then for this model, predictions will only be used for iso3 values that have 5 or more observations. Possible logical operators to use are >, >=, <, <=, ==, and !=.

If group_models = FALSE, then obs_filter is only used to determine when predicted values replace observed values; it is not used to restrict which values are used in model fitting. If group_models = TRUE, then a model is only fit for a group if that group meets the obs_filter requirements. This provides speed benefits, particularly when running INLA time series using predict_inla().
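
The obs_filter rule described above can be sketched in a few lines. This is only an illustration and not the augury implementation: the helper flag_obs_filter(), the toy data, and the meets_filter column are hypothetical, and it assumes an "observation" means a non-missing value of the response column.

```r
library(dplyr)

# Toy data: "AAA" has 5 observed values, "BBB" has only 2.
df <- data.frame(
  iso3  = c(rep("AAA", 6), rep("BBB", 6)),
  year  = rep(2015:2020, times = 2),
  value = c(1, 2, 3, 4, 5, NA, 1, NA, NA, 4, NA, NA)
)

# Hypothetical helper (not part of augury): parse an "operator integer"
# string and flag the groups whose non-missing response count satisfies it.
flag_obs_filter <- function(df, response, group_col, obs_filter) {
  pattern <- "^\\s*(>=|<=|==|!=|>|<)\\s*([0-9]+)\\s*$"
  parts <- regmatches(obs_filter, regexec(pattern, obs_filter))[[1]]
  if (length(parts) != 3) {
    stop("obs_filter must look like '>= 5'")
  }
  op <- match.fun(parts[2])   # turn ">=" into the comparison function
  n <- as.integer(parts[3])   # required number of observations

  df %>%
    group_by(across(all_of(group_col))) %>%
    mutate(meets_filter = op(sum(!is.na(.data[[response]])), n)) %>%
    ungroup()
}

flagged <- flag_obs_filter(df, "value", "iso3", ">= 5")
```

In this sketch, rows for "AAA" are flagged TRUE and rows for "BBB" are flagged FALSE, so only "AAA" would be eligible to have observations replaced by predictions.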

sort_col

Column name(s) to use to dplyr::arrange() the data prior to supplying type and calculating mean absolute scaled error on data involving time series. If NULL, not used. Defaults to "year".

pred_col

Column name to store predicted value.

pred_upper_col

Column name to store upper bound of confidence interval generated by the predict_... function. This stores the full set of generated values for the upper bound.

pred_lower_col

Column name to store lower bound of confidence interval generated by the predict_... function. This stores the full set of generated values for the lower bound.

test_col

Name of logical column specifying which response values to remove for testing the model's predictive accuracy. If NULL, ignored. See model_error() for details on the methods and metrics returned.

Value

The original data frame with the new average trend joined to it.
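
As a rough picture of what "joined" could mean here, the sketch below attaches a regional average trend to every row of the original data with dplyr::left_join(). It is an assumption about the general shape of the operation, not the function's actual code: the column names (region, year, pred), the join keys, and the values are purely illustrative.

```r
library(dplyr)

# Original model data (illustrative values).
df <- data.frame(
  iso3   = c("AAA", "AAA", "BBB", "BBB"),
  region = c("R1", "R1", "R1", "R1"),
  year   = c(2019, 2020, 2019, 2020),
  value  = c(10, NA, 20, 24)
)

# Average trend per region and year, with an illustrative prediction column.
avg_df <- data.frame(
  region = c("R1", "R1"),
  year   = c(2019, 2020),
  pred   = c(15, 24)
)

# Assumed join keys: the averaging column(s) plus the sort column.
merged <- left_join(df, avg_df, by = c("region", "year"))
```

A left join keeps every row of the original data frame and its row order, so each country in a region receives that region's trend for the matching year.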


caldwellst/augury documentation built on Oct. 10, 2024, 8:20 a.m.