runDICE: Main Driver for the 'DICE' package


View source: R/dice.R

Description

The main driver for DICE. It retrieves the data for the requested dataset, season and combination of model/fit spatial regions. After the data is retrieved, the simulation is set up and the function calls either the single or the multi option, depending on whether the user requested an uncoupled or a coupled run. In either case the fit begins with an MCMC procedure on the model-level data. For statistical modeling, the code fits the model-level data both directly and as a weighted sum of the fits to the region-level data.
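
The weighted-sum idea mentioned above can be illustrated with a few lines of base R. This is only a sketch: the region-level series, the population-based weights and all variable names are made up for illustration and are not taken from the DICE code.

```r
# Hypothetical region-level fits (rows = weeks, columns = fit regions)
region_fits <- cbind(R1 = c(1.0, 2.0, 3.0),
                     R2 = c(2.0, 4.0, 6.0),
                     R3 = c(0.5, 1.0, 1.5))

# Assumed weights: each region's share of the total population
pop <- c(R1 = 2e6, R2 = 1e6, R3 = 1e6)
w <- pop / sum(pop)

# Model-level profile built as the weighted sum of the region-level fits
model_fit <- as.vector(region_fits %*% w)
```

With weights that sum to one, the model-level profile stays on the same scale as the region-level profiles.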

Usage

runDICE(data_source = NULL, year = 2016, mod_level = 2, fit_level = 3,
  nfit = 52, model = 5, isingle = 0, nMCMC = 1e+05, nreal = 1,
  device = "pdf", prior = 0, Temp = 1, da = 0, mod_name = c(NAME_2 =
  "US"), RegState = NULL, fit_names = "all", subDir = NULL, plot = 1,
  iseed = NULL, Tg = NULL, epi_model = NULL, disease = "flu",
  db_opts = list(DICE_db = "predsci", CDC_server = TRUE),
  arima_model = NULL, method = "mech", covar = FALSE, covar_lag = 1)

Arguments

data_source

Describes the data source for the incidence data. Default is 'cdc' (for disease = 'flu'). It can be selected by source_key (integer) or by source abbreviation (string). Most disease/location combinations have only one data source, in which case it may be easier to set data_source=NULL. However, when multiple data sources exist, setting data_source=NULL will essentially choose from the available sources at random. To choose a data source through a graphical interface, see: predsci.com/id_data/. Looking up the disease and location there returns a list of data sources that can be entered into DICE. Alternatively, all country/disease/data_source combinations are listed in the 'Data Sources Table' tab at the same URL. To access the list of sources directly from an R prompt, see the examples below.

year

Integer - start year of the disease season

mod_level

Integer - spatial level of the model data. For CDC data it can only be 2, 3 or 4. For dengue it depends on the country

fit_level

Integer - spatial level of the data used for a coupled or uncoupled fit of the model data. For flu, fit_level can be 2, 3 or 4

nfit

Integer - Number of data points that will be fitted. Default is to fit all the data. This will be reset if nfit > nperiodsData
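
A sketch of how nfit relates to the length of the season. It assumes that an oversized nfit is reset to the number of available periods; the text above says only that it is reset. The series and the variable names other than nfit and nperiodsData are illustrative.

```r
# Hypothetical season with 30 observed periods
incidence <- seq_len(30)
nperiodsData <- length(incidence)

nfit <- 52                        # requested number of points to fit
nfit <- min(nfit, nperiodsData)   # assumed reset rule when nfit > nperiodsData

fit_window <- incidence[seq_len(nfit)]   # data used in the fit
n_forecast <- nperiodsData - nfit        # periods left over for prediction
```

Choosing nfit smaller than nperiodsData leaves n_forecast periods outside the fit, which corresponds to the forecast mode described in the examples.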

model

Integer - the model number; see the manual for more details (1-4 are supported for flu, 4 for dengue). Relevant only when method = 'mech'

isingle

Integer - 0: couple the fit spatial regions; 1: no coupling. Relevant only when method = 'mech'

nMCMC

Integer - number of steps/trials in the MCMC procedure. Relevant only when method = 'mech'

nreal

Integer - number of MCMC chains. Relevant only when method = 'mech'

device

Either 'pdf' (default) or 'x11'

prior

Integer - if greater than zero use a prior for the MCMC procedure. Relevant only when method = 'mech'

Temp

Integer 1, 5, 10, 100 - Temperature for the MCMC procedure. Relevant only when method = 'mech'

da

Integer 0, 1 or 2. Data augmentation options: 0-none, 1-using historic average and 2-using the most similar season. Relevant only when method = 'mech'

mod_name

Named vector of strings specifying the model-level spatial patch. If is.null(mod_name), the code reverts to using RegState (see next entry). To specify New York state, set mod_name=c(NAME_2="United States", NAME_3="R2", NAME_4="New York"). Here NAME_X is either the full name or the abbreviation of the level-X patch, so replacing 'United States' with 'US' or 'R2' with 'Region 2' gives the same result. The vector entries of mod_name should run from NAME_2 to NAME_n, where mod_level=n.
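
The naming rule for mod_name can be checked before calling runDICE. The sketch below uses California, which belongs to HHS Region 9, as the example. The helper check_mod_name is hypothetical and not part of DICE.

```r
# Model-level patch for California (state level, so mod_level = 4)
mod_name <- c(NAME_2 = "US", NAME_3 = "R9", NAME_4 = "California")

# Hypothetical helper (not part of DICE): TRUE when the names are exactly
# NAME_2, ..., NAME_n for mod_level = n
check_mod_name <- function(mod_name, mod_level) {
  stopifnot(mod_level >= 2)
  expected <- paste0("NAME_", 2:mod_level)
  identical(names(mod_name), expected)
}

check_mod_name(mod_name, mod_level = 4)          # names match the level
check_mod_name(c(NAME_2 = "US"), mod_level = 2)  # country-level patch
```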

RegState

Single element: determines which single region from mod_level is to be modeled. Depending on the model level, RegState should have the following format: mod_level=2 - a 3-letter ISO3 country code; mod_level=3 - an integer giving the HHS region; mod_level=4 - a 2-letter state code. Where possible, RegState should be replaced by mod_name. RegState is limited to country-level data and US regions/states.
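
The three RegState formats can be expressed as a small validity check. This is a sketch only: regstate_ok is a hypothetical helper, not a DICE function, and it assumes that the letter codes are upper case.

```r
# Hypothetical helper (not part of DICE): check the RegState format
# described above for mod_level 2, 3 and 4
regstate_ok <- function(RegState, mod_level) {
  if (length(RegState) != 1) return(FALSE)
  switch(as.character(mod_level),
         "2" = is.character(RegState) && grepl("^[A-Z]{3}$", RegState),
         "3" = is.numeric(RegState) && RegState == round(RegState),
         "4" = is.character(RegState) && grepl("^[A-Z]{2}$", RegState),
         FALSE)
}

regstate_ok("USA", 2)         # 3-letter ISO3 code
regstate_ok(9, 3)             # HHS region number
regstate_ok("CA", 4)          # 2-letter state code
regstate_ok("California", 4)  # full state name does not match the format
```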

subDir

Name of the output sub-directory where all plots and files will be written. Default is NULL, in which case the code builds the name.

plot

TRUE, FALSE or EXTERNAL (or 0, 1, 2). The EXTERNAL option allows users to implement their own plotting routines

Tg

Recovery time in days. If NULL, it is set to three days for flu and eight days for dengue. Relevant only when method = 'mech'

epi_model

String - name of the mechanistic compartmental model: SIR, SEIR, (SIR)H/(SI)V, (SEIR)H/(SEI)V or SIRB, or the corresponding integer 1, 2, 3, 4 or 5 (case insensitive)

disease

String - disease name. Options for modeling are: flu, dengue, yellow_fever, ebola, zika, cholera, chik, plague. To graphically explore the data see: predsci.com/id_data/. A full list of diseases in the DICE database can be found from an R prompt by following one of the examples below.

db_opts

A list of database options. $DICE_db determines which SQL database the data is retrieved from: 'PredSci' is the default SQL database, and 'BSVE' is in development. Additional flags select outside sources of data (currently only the CDC Influenza-Like Illness (ILI) data is supported: $CDC_server=TRUE).

arima_model

A list of ARIMA model parameters: list(p=, d=, q=, P=, D=, Q=). It can be set to NULL to trigger the auto.arima process
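
A sketch of how the arima_model list might be assembled for a statistical run. The order values are placeholders rather than recommendations, and the presence check is a hypothetical convenience, not something DICE is known to do. The runDICE call is left commented out because it requires the DICE package and database access.

```r
# Illustrative SARIMA orders (placeholder values)
arima_model <- list(p = 1, d = 0, q = 1, P = 1, D = 0, Q = 0)

# Hypothetical check (not part of DICE) that all six orders are present
required <- c("p", "d", "q", "P", "D", "Q")
all_present <- all(required %in% names(arima_model))

# Arguments for a statistical run; use arima_model = NULL for auto.arima
stat_args <- list(data_source = "cdc", year = 2015, mod_level = 2,
                  fit_level = 3, method = "stat", arima_model = arima_model)

# output <- do.call(runDICE, stat_args)  # requires DICE and database access
```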

method

String either 'mech' for compartmental mechanistic models or 'stat' for SARIMA models

covar

String, optional - covariate for use in ARIMA fitting. Options are: 'sh', 'precip', 'temp'. Default is FALSE (no covariate)

covar_lag

Numeric - lag time for the optional covariate, in units of the cadence of the data

fit_names

A character vector indicating which fit regions to use. If fit_names='all', DICE uses all child regions of the model region whose level equals fit_level. Alternatively, fit_names can specify a subset of the fit regions, from which DICE constructs an aggregate representation of the model region. For example, if mod_name=c(NAME_2="US"), mod_level=2, fit_level=3, and fit_names=c("R1", "R2", "R3"), DICE will create an Atlantic super-region to model (as opposed to using all 10 HHS regions). Similarly, if mod_name=c(NAME_2="US"), mod_level=2, fit_level=4, and fit_names=c("WA", "OR", "CA"), DICE will create and model a super-state of Pacific states.
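
The two aggregate examples can be written as argument lists. This is a sketch: the data_source and year values are illustrative choices, and the runDICE calls are commented out because they require the DICE package and database access.

```r
# Atlantic super-region built from HHS regions 1-3
atlantic_args <- list(data_source = "cdc", year = 2015,
                      mod_name = c(NAME_2 = "US"), mod_level = 2,
                      fit_level = 3, fit_names = c("R1", "R2", "R3"))

# Pacific super-state: same model region, state-level fit regions
pacific_args <- utils::modifyList(
  atlantic_args,
  list(fit_level = 4, fit_names = c("WA", "OR", "CA"))
)

# output_atlantic <- do.call(runDICE, atlantic_args)
# output_pacific  <- do.call(runDICE, pacific_args)
```

Keeping the arguments in a list makes it easy to switch between fit_names='all' and a subset without retyping the rest of the call.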

Value

solution - a list with the input and the entire output of the run.

Examples

For a run of the 2015-2016 CDC national data using the ten HHS regions with coupling between the regions use:
output <- runDICE(data_source='cdc', year = 2015, mod_level = 2, fit_level = 3, isingle = 0)

For a run of the 2015-2016 CDC national data using the ten HHS regions without coupling between the regions use:
output <- runDICE(data_source='cdc', year = 2015, mod_level = 2, fit_level = 3, isingle = 1)

For a run of the 2014-2015 GFT data for HHS region number 9, using state-level data with coupling between the states in region 9 use:
output <- runDICE(data_source='gft', year = 2014, mod_level = 3, fit_level = 4, RegState = 9, isingle = 0)

To control which model is used for the basic reproduction number, set the parameter model in your call. Default value is 5:
output <- runDICE(data_source='gft', year = 2014, mod_level = 3, fit_level = 4, RegState = 9, isingle = 0, model = 3)

To control the number of MCMC chains that the code will run, set the parameter nreal in your call. The default is 1:
output <- runDICE(data_source='cdc', year = 2015, mod_level = 2, fit_level = 3, isingle = 0, nreal = 3)

To control the number of MCMC steps/trials in each chain, set the parameter nMCMC in your call. The default is 1e5:
output <- runDICE(data_source='cdc', year = 2015, mod_level = 2, fit_level = 3, isingle = 0, nMCMC = 1e6)

To control the name of the sub-directory where all the output files and plots are saved use the keyword subDir, default is output:
output <- runDICE(data_source='cdc', year = 2015, mod_level = 2, fit_level = 3, isingle = 0, nMCMC = 1e6, subDir = 'test')

To control the file format for the plots (pdf, png or x11) set the parameter device:
output <- runDICE(data_source='cdc', year = 2015, mod_level = 2, fit_level = 3, isingle = 0, nMCMC = 1e6, device = 'pdf')
(The package can accept an array of file formats, e.g. device = c('pdf','png'), in which case both 'png' and 'pdf' files will be created.)

To run in a forecast or predictive mode you can set the number of weeks the code uses in the fit to be lower than the number of weeks in the season.
(Note that for the current season it is always running in a predictive mode because the season is not yet completed.)
output <- runDICE(data_source='gft', year = 2013, mod_level = 3, fit_level = 4, isingle = 1, nMCMC = 1e6, nfit = 35)

To select only a few HHS regions and run them coupled (for example the Eastern Regions 1, 2 and 3) use:
output <- runDICE(data_source='cdc', year=2015, mod_level=2, fit_level=3, RegState=c('Region1','Region2','Region3'), isingle = 0)

To select only a few states and run them coupled use, for example:
output <- runDICE(data_source='gft', year=2014, mod_level=3, fit_level=4, RegState=c('WA','OR','CA'), isingle = 0)

-- Data diseases and data_sources -------
Access the database and list all available diseases:
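library(DICE)
library(DBI)  # provides dbReadTable() and dbDisconnect()
# open a connection to the DICE database
myDB <- OpenCon()
data_sources <- dbReadTable(conn = myDB, name = "data_sources")
# list all available diseases
unique(data_sources$disease)
# then list all data sources
str(data_sources)
data_sources$source_abbv
dbDisconnect(myDB)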

predsci/DICE documentation built on Aug. 9, 2019, 9:41 a.m.