input_check: Functions to check input to epidemiar


View source: R/input_checks.R


Performs basic existence checks and a variety of logic checks on the input data passed to run_epidemia().


input_check(epi_data, quo_casefield, quo_popfield, inc_per, quo_groupfield,
  week_type, report_period, ed_summary_period, ed_method, ed_control,
  env_data, quo_obsfield, quo_valuefield, forecast_future, fc_control,
  env_ref_data, env_info, model_obj, model_cached, model_choice)



epi_data: Epidemiological data with case numbers per week, with date field "obs_date".


quo_casefield: Quosure of the user-given field containing the disease case counts.


quo_popfield: Quosure of the user-given field containing population values.


inc_per: The unit of population in which incidence should be reported, e.g. 1000 for an incidence rate of 3 per 1000 people.
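
As a quick illustration of the scaling, the sketch below assumes incidence is computed as cases divided by population, multiplied by inc_per. The data values are made up, and the exact calculation inside epidemiar is not shown in this page.

```r
# Hypothetical weekly counts; the values are illustrative only.
cases <- c(3, 12, 0)
population <- c(1000, 8000, 2500)
inc_per <- 1000  # report incidence per 1000 people

# Assumed calculation: cases per person, scaled to the reporting unit.
incidence <- cases / population * inc_per
incidence
```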


quo_groupfield: Quosure of the user-given geographic grouping field passed to run_epidemia().


week_type: String indicating the week-of-year standard used by the epidemiological and environmental reference data: "ISO" (WHO ISO-8601 weeks) or "CDC" (CDC epi weeks). Required: each epidemiological observation date must be the LAST day of its week.
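
The last-day-of-week requirement can be checked with base R before calling run_epidemia(). The sketch below is not part of epidemiar; the helper name is hypothetical, and it relies on ISO-8601 weeks running Monday to Sunday and CDC epi weeks running Sunday to Saturday.

```r
# Hypothetical helper (not part of epidemiar): TRUE where a date falls on the
# last day of its week under the chosen standard.
is_last_day_of_week <- function(dates, week_type = c("ISO", "CDC")) {
  week_type <- match.arg(week_type)
  # as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday.
  wday <- as.POSIXlt(as.Date(dates))$wday
  # ISO-8601 weeks end on Sunday; CDC epi weeks end on Saturday.
  last_day <- if (week_type == "ISO") 0L else 6L
  wday == last_day
}

obs_date <- as.Date(c("2019-08-17", "2019-08-18"))  # a Saturday and a Sunday
is_last_day_of_week(obs_date, "ISO")
is_last_day_of_week(obs_date, "CDC")
```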


report_period: The number of weeks that the entire report will cover. The report_period minus forecast_future is the number of weeks of past (known) data that will be included.


ed_summary_period: The number of weeks that will be considered the "early detection period", counted back from the week of the last known epidemiological data.
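
To make the week arithmetic concrete, here is a small sketch with illustrative values. It assumes the early detection period includes the last known week itself; the date and parameter values are made up.

```r
report_period <- 26      # total weeks in the report (illustrative value)
forecast_future <- 8     # weeks forecast beyond the last known data
ed_summary_period <- 4   # weeks in the early detection period

# Weeks of past (known) data included in the report.
known_weeks <- report_period - forecast_future

# Assumed end-of-week date of the last known epidemiological data.
last_known <- as.Date("2019-08-18")

# The early detection period counts back from the last known week
# (assumption: the last known week is included in the count).
ed_weeks <- seq(last_known, by = "-1 week", length.out = ed_summary_period)
rev(ed_weeks)
```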


ed_method: Which early detection method to use: "Farrington" (currently the only algorithm available) or "None".


ed_control: All parameters for the early detection algorithm, passed through to that subroutine.


env_data: Daily environmental data for the same groupfields and date range as the epidemiological data. It may contain extra data (other districts or date ranges). The data must be in long format (one row for each date and environmental variable combination), and for forecasting it must start, at an absolute minimum, laglen (in fc_control) days before the start of epi_data.
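
The long-format layout and the laglen requirement can be sketched as follows. All column names other than obs_date, and all data values, are illustrative assumptions rather than names required by epidemiar.

```r
# Hypothetical long-format environmental data: one row per date and
# environmental variable for a single district.
dates <- seq(as.Date("2019-01-01"), as.Date("2019-06-30"), by = "day")
env_data <- data.frame(
  district = "A",
  obs_date = rep(dates, times = 2),
  obs_var = rep(c("rainfall", "temperature"), each = length(dates)),
  obs_value = 0  # placeholder values
)

# Weekly epidemiological data that starts later than the environmental data.
epi_data <- data.frame(
  district = "A",
  obs_date = seq(as.Date("2019-05-05"), by = "week", length.out = 8)
)

fc_control <- list(laglen = 100)  # lag length in days (illustrative value)

# Environmental data must begin at least laglen days before the epi data.
required_start <- min(epi_data$obs_date) - fc_control$laglen
env_starts_early_enough <- min(env_data$obs_date) <= required_start
env_starts_early_enough
```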


quo_obsfield: Quosure of the user-given field name of the environmental data variables.


quo_valuefield: Quosure of the user-given field name of the value of the environmental data variable observations.


forecast_future: Number of future weeks from the end of the epi_data to produce forecasts.


fc_control: Parameters for forecasting, including which environmental variable to include and any geographic clusters.


env_ref_data: Historical averages by week of year for environmental variables. Used to extend environmental data into the future for long forecast periods, to calculate anomalies in the early detection period, and to display on timeseries in reports.
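
One way such week-of-year averages could be built from daily data is sketched below with base R. This is not the package's own reference-creation code; the column names are hypothetical, and the example uses ISO-8601 week numbers (a CDC-based reference would need CDC epi weeks instead).

```r
# Hypothetical daily observations for one variable over two ISO years.
dates <- seq(as.Date("2017-01-02"), as.Date("2018-12-30"), by = "day")
env <- data.frame(
  obs_date = dates,
  obs_value = ifelse(format(dates, "%Y") == "2017", 10, 20)
)

# ISO-8601 week of year via the "%V" format code.
env$week_of_year <- as.integer(format(env$obs_date, "%V"))

# Historical mean by week of year across all years in the data.
env_ref <- aggregate(obs_value ~ week_of_year, data = env, FUN = mean)
head(env_ref)
```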


env_info: Lookup table for environmental data: reference creation method (e.g. sum or mean), report labels, etc.


model_obj: Deprecated; use model_cached.


model_cached: The output of a previous run of run_epidemia() with model_run = TRUE, which produces a model (regression object) and metadata. The metadata will be used for input checking and validation. Using a prebuilt model saves processing time, but the model will need to be updated periodically.


model_choice: Critical argument that selects the type of model to generate. The options are versions that the EPIDEMIA team has used for forecasting. The first supported option is "poisson-gam" ("p"), the original epidemiar model: a Poisson regression using bam (for large-data GAMs) with a smoothed cyclical term for seasonality. For this option, fc_control$anom_env defaults to TRUE, so the model uses anomalies of the environmental variables rather than their raw values. The second option is "negbin" ("n"), a negative binomial regression using glm with no external seasonality terms, letting the natural cyclical behavior of the environmental variables fill that role. For this option, fc_control$anom_env defaults to FALSE, so the model uses the actual observation values. The user can override fc_control$anom_env by providing a value, but this is not recommended except for comparisons.
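
The default behavior described above can be summarized in a few lines of R. The helper below is hypothetical and only mirrors the documented defaults; it is not the package's internal implementation.

```r
# Hypothetical helper (not part of epidemiar) mirroring the described defaults.
resolve_anom_env <- function(model_choice, fc_control = list()) {
  # Partial matching accepts the short forms "p" and "n".
  model_choice <- match.arg(model_choice, c("poisson-gam", "negbin"))
  if (!is.null(fc_control$anom_env)) {
    return(fc_control$anom_env)  # a user-supplied value overrides the default
  }
  # Default: anomalies for "poisson-gam", raw observation values for "negbin".
  model_choice == "poisson-gam"
}

resolve_anom_env("poisson-gam")
resolve_anom_env("n")
resolve_anom_env("negbin", list(anom_env = TRUE))
```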


Returns a flag indicating whether there were any errors, plus the accompanying error messages. Also returns a flag and messages for warnings.

EcoGRAPH/epidemiar documentation built on Aug. 22, 2019, 6:53 a.m.