extend_env_future: Extend environmental data into the future.


View source: R/forecasting_helpers.R

Description

Extend environmental data into the future.

Usage

extend_env_future(
  env_data,
  quo_groupfield,
  quo_obsfield,
  quo_valuefield,
  env_ref_data,
  env_info,
  fc_model_family,
  epi_date_type,
  valid_run,
  groupings,
  env_variables_used,
  report_dates
)

Arguments

env_data

Daily environmental data for the same groupfields and date range as the epidemiological data. It may contain extra data (other districts or date ranges). The data must be in long format (one row for each date and environmental variable combination) and, for forecasting, must start at least report_settings$env_lag_length days (default 180) before epi_data.
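As an illustration of the long format and the lag requirement described above, here is a minimal base R sketch. The column names (obs_date, woreda, obsfield, val), the epi_start date, and the placeholder values are hypothetical and are not taken from the package; only the 180-day default comes from the text above.

```r
# Hypothetical long-format environmental data: one row per
# date, grouping, and environmental variable combination.
env_data <- expand.grid(
  obs_date = seq(as.Date("2019-01-01"), as.Date("2019-12-31"), by = "day"),
  woreda   = c("A", "B"),         # assumed grouping field values
  obsfield = c("rain", "temp"),   # assumed environmental variable names
  stringsAsFactors = FALSE
)
env_data$val <- 0  # placeholder observation values

env_lag_length <- 180                 # documented default
epi_start <- as.Date("2019-08-01")    # assumed first date of epi_data

# TRUE when the environmental data reach back at least
# env_lag_length days before the epidemiological data start.
has_enough_lag <- min(env_data$obs_date) <= epi_start - env_lag_length
```

A check like this could be run before forecasting to confirm that enough lagged environmental data are available.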

quo_groupfield

Quosure of the user-given geographic grouping field passed to run_epidemia().

quo_obsfield

Quosure of the user-given field name of the environmental data variables.

quo_valuefield

Quosure of the user-given field name of the values of the environmental data variable observations.

env_ref_data

Historical averages by week of year for environmental variables. Used to extend environmental data into the future for long forecasts, to calculate anomalies in the early detection period, and to display on time series in reports.
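A rough sketch of how week-of-year reference values of this kind might be derived from daily history, using base R only. The data frame hist_env, its columns, the random values, and the choice of ISO week numbering and of the mean as the summary are all assumptions made for illustration; the package's own reference-creation method may differ.

```r
# Hypothetical daily history for one environmental variable.
set.seed(1)
dates <- seq(as.Date("2015-01-01"), as.Date("2018-12-31"), by = "day")
hist_env <- data.frame(obs_date = dates, val = runif(length(dates)))

# Assumed week definition: ISO 8601 week of year.
hist_env$week <- as.integer(format(hist_env$obs_date, "%V"))

# Historical average per week of year (one row per week).
ref <- aggregate(val ~ week, data = hist_env, FUN = mean)
```

For a variable whose reference is a sum rather than a mean, the summary function would need to change accordingly.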

env_info

Lookup table for environmental data - reference creation method (e.g. sum or mean), report labels, etc.

fc_model_family

The family parameter passed to mgcv::bam; the extended families in family.mgcv can also be used. This sets the type of generalized additive model (GAM) to run: it specifies the distribution and link to use in model fitting. E.g. for a Poisson regression, the user would input "poisson()". If a cached model is being used, set the parameter to '"cached"'.

epi_date_type

Extracted from 'report_settings$epi_date_type'.

valid_run

Internal TRUE/FALSE for whether this is part of a validation run.

groupings

A unique list of the geographic groupings (from groupfield).

env_variables_used

List of environmental variables that were used in the modeling (listed in 'report_settings$env_var' and found in env_data and env_info).

report_dates

Internally generated set of report date information: min, max, list of dates for full report, known epidemiological data period, forecast period, and early detection period.

Value

Environmental dataset, with data extended into the future forecast period. Runs of unknown environmental data shorter than 2 weeks are filled in with the last known data (i.e. the "persistence" method, using the mean of the previous week of known data). Runs of missing data longer than 2 weeks are filled in using a progressive blend of the mean of the last known week and the historical means.
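The two fill strategies can be sketched in base R as follows. This is not the package's implementation: fill_future is a hypothetical helper, the linear blending weights are an assumption (the documentation only says "progressive blend"), and treating a run of exactly 14 days as a long run is also an assumption, since the text above does not cover that case.

```r
# Hypothetical helper illustrating the two fill strategies.
# known:      vector of known daily values (most recent last)
# hist_means: historical means for each missing day to fill
fill_future <- function(known, hist_means) {
  n_missing <- length(hist_means)
  last_week_mean <- mean(tail(known, 7))  # mean of the last known week

  if (n_missing < 14) {
    # Persistence: repeat the mean of the last known week.
    return(rep(last_week_mean, n_missing))
  }

  # Assumed linear weights: shift gradually from the last-week
  # mean toward the historical means over the missing run.
  w <- seq_len(n_missing) / n_missing
  (1 - w) * last_week_mean + w * hist_means
}

known <- c(rep(10, 20), rep(20, 7))   # last known week averages 20
short_fill <- fill_future(known, hist_means = rep(30, 5))
long_fill  <- fill_future(known, hist_means = rep(30, 20))
```

In this sketch the short run is filled with a constant, while the long run moves steadily from the last-week mean toward the historical means.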


EcoGRAPH/epidemiar documentation built on Nov. 13, 2020, 5:31 p.m.