knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Running Model Validation and Assessment

Validation and assessment are built into the epidemiar package through the run_validation() function, which allows on-demand evaluation over any historical period.

Evaluation can be run for one- through n-week-ahead predictions, and can include comparisons with two naive models: persistence of the last known value, and the average number of cases for that week of the year.

Building validation into the early warning system provides more opportunities to learn about the model from the validation results. Geographic grouping-level results identify locations where the models perform well and where they do not.

With on-demand implementation and time-range flexibility, one can also investigate how accuracy changes over time. This is of particular interest in places like Ethiopia, where transmission patterns are changing and trends are declining due to anti-malarial programs.

Specific Arguments

The run_validation() function takes five validation-specific arguments, plus all of the run_epidemia() arguments.

Other Arguments & Adjustments

The run_validation() function calls run_epidemia(), so it also takes all of the arguments for that function. The user does not need to modify any of these arguments (e.g. event detection settings, fc_future_period); run_validation() automatically adjusts any settings that are irrelevant for validation runs.

The intent is that users can take their usual script for running EPIDEMIA forecasts and simply substitute the validation function, with the validation settings, to perform model assessments (see the sketch below).
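As a rough illustration, the sketch below swaps run_validation() into an existing run_epidemia() call. The validation-specific argument names shown (date_start, total_timesteps, timesteps_ahead, reporting_lag, skill_test) and the example values are illustrative only; consult ?run_validation and your existing forecasting script for the exact signature and data arguments.

```r
# Illustrative sketch only: validation-specific argument names and values
# should be checked against the run_validation() documentation.
library(epidemiar)

validation_results <- run_validation(
  date_start      = as.Date("2017-01-01"), # first week of the evaluation period (assumed name)
  total_timesteps = 26,                    # number of weeks to evaluate (assumed name)
  timesteps_ahead = 8,                     # evaluate 1- through 8-week-ahead forecasts (assumed name)
  reporting_lag   = 0,                     # assumed weeks of reporting delay (assumed name)
  skill_test      = TRUE,                  # also run the two naive comparison models
  # ...followed by all of the arguments from your usual run_epidemia() call,
  # e.g. the epidemiological and environmental data sets and model settings:
  epi_data = epi_data,
  env_data = env_data
)
```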

Validation Output

Statistics

Validation statistics include Mean Squared Error (MSE), Mean Absolute Error (MAE), and R^2^ (R2, variance explained). In these statistics, ‘obs’ is the actual observed value and ‘pred’ is the predicted (forecast) value.
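For reference, the standard definitions of these statistics over n observation-prediction pairs are given below; the package's implementation may differ in details such as the handling of missing values.

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{obs}_i - \mathrm{pred}_i\right)^2$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\mathrm{obs}_i - \mathrm{pred}_i\right|$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(\mathrm{obs}_i - \mathrm{pred}_i\right)^2}{\sum_{i=1}^{n}\left(\mathrm{obs}_i - \overline{\mathrm{obs}}\right)^2}$$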

Skill scores are calculated per statistic. The forecast accuracy statistic value (score~fc~) is compared against the naive model statistic (score~naive~, per naive model) with respect to a perfect (no-error) value (score~perfect~).
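The exact computation is handled inside the package; a standard skill-score formulation with the properties described in the next paragraph is:

$$\mathrm{skill} = \frac{\mathrm{score}_{fc} - \mathrm{score}_{naive}}{\mathrm{score}_{perfect} - \mathrm{score}_{naive}}$$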

The skill metric has an upper bound of 1 (a perfect forecast). A skill of 0 means no improvement of the forecast model over that naive model, and skill between 0 and 1 shows the relative improvement of the forecast model over that naive model. The lower bound of the skill metric depends on the statistic.

Results will be returned summarized at the model level and also at the geographic grouping level.

Format

Results are returned in a list.

  1. skill_scores: The skill score results of the forecast model compared against the naive models (present if skill_test = TRUE was selected).
     - skill_overall: Skill scores at the overall model level.
     - skill_grouping: Skill score results per geographic grouping.
  2. validations: The validation accuracy statistics per model (named for the base model, plus the naive models if run with the skill test comparison). Each model entry contains the following items:
     - validation_overall: Overall model accuracy statistics per timestep_ahead (weeks into the future).
     - validation_grouping: Accuracy statistics per geographic grouping per timestep_ahead.
     - validation_timeseries: In beta testing, an early version of rolling validation results over time.
     - validation_perweek: Validation results per week entry (per geographic group per timestep_ahead).
  3. metadata: Metadata on the parameters used to run the validation and the date it was run.
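As a rough illustration (assuming the validation_results object from the earlier sketch), the returned list can be inspected like any other R list. The element names below follow the structure described above; the names of the per-model entries under validations depend on which models were run, so list them first.

```r
# Top-level elements of the returned list (per the structure described above)
names(validation_results)

# Overall skill scores of the forecast model relative to the naive models
validation_results$skill_scores$skill_overall

# Model entries present in the validations element, then the per-grouping
# accuracy statistics for the first entry
names(validation_results$validations)
validation_results$validations[[1]]$validation_grouping

# Parameters and date of the validation run
validation_results$metadata
```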

Results Display

For a formatted validation report, see the accompanying R project epidemiar-demo: the run_validation_amhara.R script in the validation folder, which uses the epidemia_validation.Rnw formatting script.


