pandemic_model    R Documentation

Bayesian growth curve models for epidemiological data via Stan

Description

Bayesian inference for modeling epidemiological data or Covid-19 pandemic data using growth curve models. This function draws the posterior samples of the parameters of the growth curve models available in the PandemicLP package. The sampling algorithm is "NUTS", which is the No-U-Turn sampler variant of Hamiltonian Monte Carlo (Hoffman and Gelman 2011, Betancourt 2017).

See models for the growth curve models available in the PandemicLP package.

See posterior_predict.pandemicEstimated to make predictions, pandemic_stats to compute a few useful statistics from the predictions, and plot.pandemicPredicted to plot the predicted values.

Usage

pandemic_model(
  Y,
  case_type = "confirmed",
  family = "poisson",
  seasonal_effect = NULL,
  n_waves = 1,
  p = 0.08,
  phiTrunc = 0,
  fTrunc = 1,
  chains = 1,
  warmup = 2000,
  thin = 3,
  sample_size = 1000,
  init = "random",
  prior_parameters = NULL,
  ...,
  covidLPconfig = FALSE
)

Arguments

Y

an object of class pandemicData-objects created by the load_covid or format_data function. It is a list providing the epidemiological data for the model. The elements of this Y list are:

data:

a data frame with a date column and at least one of the following columns: cases, new_cases, deaths, new_deaths. Below are descriptions of each of these columns:

date:

a date vector. It should be of class 'Date' and format 'YYYY-MM-DD'.

cases:

a numeric vector with the time series values of the cumulative number of cases.

new_cases:

a numeric vector with the time series values of the number of new confirmed cases.

deaths:

a numeric vector with the time series values of the cumulative number of deaths.

new_deaths:

a numeric vector with the time series values of the number of new deaths.

The data frame should be ordered by date in ascending order.

name:

a string providing the name of Country/State/Location of the epidemiological data.

population:

a positive integer specifying the population size of the Country/State/Location selected.

For formatting epidemiological data (not provided by the load_covid function) in the specified Y list format, see the format_data function or the Examples section in covid19BH.
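
As an illustration of the layout described above, the following base R sketch builds a list with the same three elements. The counts, the location name and the population are made up, and the result is only a plain list: an actual pandemicData object should still be created with load_covid or format_data.

```r
# Hypothetical daily counts, for illustration only (not real data).
new_cases  <- c(2, 5, 9, 14, 20)
new_deaths <- c(0, 0, 1, 1, 2)

dat <- data.frame(
  date       = as.Date("2020-03-01") + 0:4,  # class 'Date', YYYY-MM-DD
  cases      = cumsum(new_cases),            # cumulative number of cases
  new_cases  = new_cases,                    # new confirmed cases
  deaths     = cumsum(new_deaths),           # cumulative number of deaths
  new_deaths = new_deaths                    # new deaths
)

# The data frame should be ordered by date in ascending order.
dat <- dat[order(dat$date), ]

# Plain list mirroring the documented elements of Y (not a pandemicData object).
Y_layout <- list(data = dat, name = "Example location", population = 1000000L)
str(Y_layout)
```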

case_type

a string providing the type of cases of interest in modelling the epidemic. Current options are "confirmed" for confirmed cases or "deaths" for deaths. The default is "confirmed". This argument is not required when the data frame Y$data (in the input argument Y) contains only one of the data series new_cases or new_deaths.

family

"poisson" or "negbin". This argument indicates the data distribution. The default is family="poisson".

seasonal_effect

a string vector indicating the days of the week on which a seasonal effect was observed. The vector can contain full weekday names (sunday to saturday) or their first 3 letters, up to a maximum of three weekdays. For details, see models.
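
The accepted forms can be sketched with a small base R helper. The function normalize_weekdays is hypothetical and not part of PandemicLP; it only restates the rules given above (full names or first three letters, at most three weekdays) and does not show how the package validates the argument.

```r
# Hypothetical helper (not part of PandemicLP): maps lowercase weekday
# names, or their first three letters, to full weekday names.
normalize_weekdays <- function(seasonal_effect) {
  full <- c("sunday", "monday", "tuesday", "wednesday",
            "thursday", "friday", "saturday")
  idx   <- match(seasonal_effect, full)                 # full names
  short <- match(seasonal_effect, substr(full, 1, 3))   # 3-letter forms
  idx[is.na(idx)] <- short[is.na(idx)]
  if (anyNA(idx)) stop("unrecognized weekday in 'seasonal_effect'")
  idx <- unique(idx)
  if (length(idx) > 3) stop("at most three weekdays are allowed")
  full[idx]
}

normalize_weekdays(c("sun", "monday"))
```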

n_waves

a positive integer. This argument indicates the number of waves to be fitted by the mean curve. The default is 1. For details, see models.

p

a numerical value greater than 0 and less than or equal to 1. It is the maximum cumulative total number of cases by the end of the epidemic, expressed as a proportion of the population of the location. The default is p = 0.08. This is a model restriction. See models for more.
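
Read literally, this restriction caps the cumulative total at p times the population. A worked example follows, assuming a made-up population of 1,000,000; the helper name max_total_cases is hypothetical and the exact form of the restriction in each model is described in models.

```r
# Hypothetical helper: upper bound on the cumulative total implied by p.
max_total_cases <- function(population, p = 0.08) {
  stopifnot(p > 0, p <= 1, population > 0)
  p * population
}

max_total_cases(1e6)            # with the default p = 0.08
max_total_cases(1e6, p = 0.02)  # with p = 0.02
```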

phiTrunc

a non-negative real number. This argument indicates a truncation on the prior of the 'phi' parameter of the Negative Binomial models. This input argument is required only when family="negbin". The default is phiTrunc=0. See models for more.

fTrunc

a non-negative real number. This argument indicates a truncation on the prior of the 'f' parameter of the single-wave Negative Binomial model. This input argument is required only when family="negbin". The default is fTrunc=1. See models for more.

chains

a positive integer specifying the number of Markov chains. The default is 1, which is the default value used by the CovidLP app (http://est.ufmg.br/covidlp/home/en/).

warmup

a positive integer specifying the number of warmup (aka burn-in) iterations per chain. These warmup samples are not used for inference. The default is 2000; if family="negbin", the default becomes warmup=5000.

thin

a positive integer specifying the period for saving samples. The default is 3, which is the default value used by the CovidLP app (http://est.ufmg.br/covidlp/home/en/).

sample_size

a positive integer specifying the size of the posterior sample per chain that will be used for inference. The total number of iterations per chain is:

warmup + thin * sample_size

The default is 1000, which is the default value used by CovidLP app (http://est.ufmg.br/covidlp/home/en/).
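
The formula above can be checked with a few lines of base R. The helper total_iterations is hypothetical and only restates the arithmetic; it is not a function of the package.

```r
# Total iterations per chain: warmup + thin * sample_size
total_iterations <- function(warmup = 2000, thin = 3, sample_size = 1000) {
  warmup + thin * sample_size
}

total_iterations()               # documented defaults: 5000
total_iterations(warmup = 5000)  # default warmup for family = "negbin": 8000
```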

init

specification of the initial values of the parameters per chain. The default is "random". See models for more information about the model parameters. Any parameters whose values are not specified will receive initial values generated as described for init = "random". Initial values for pandemic_model can only be specified via a list. See the detailed documentation for the init argument via list in stan. Alternatively, init can be an object returned by pandemic_model(), in which case the last stored iteration of that object is used as the initial values. If the models are different, an analogy is made.
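
A minimal sketch of the list form, assuming two chains and the single-wave parameter names a, b, c and f used in the Examples section. The values are arbitrary and check_inits is a hypothetical sanity check, not part of PandemicLP.

```r
# One named list per chain; values here are arbitrary illustrations.
inits <- list(
  list(a = 95, b = 0.8, c = 0.3, f = 1.1),  # chain 1: all four parameters
  list(f = 1.01)                            # chain 2: only f is specified
)

# Hypothetical check: the outer list has one entry per chain and every
# inner entry is a fully named list.
check_inits <- function(inits, chains) {
  length(inits) == chains &&
    all(vapply(inits, function(ch) {
      is.list(ch) && !is.null(names(ch)) && all(nzchar(names(ch)))
    }, logical(1)))
}

check_inits(inits, chains = 2)
```

Parameters left out of a chain's list, such as a, b and c for the second chain here, fall back to random initial values as described above.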

prior_parameters

Either NULL or a list. If NULL, default prior parameters are used. If a list, it must contain adequate values for the prior parameters. See models for details.

...

other optional arguments passed on to the sampling function of the rstan package, such as control and cores.

covidLPconfig

TRUE or FALSE: flag indicating whether to use default values of the CovidLP app as input arguments. This argument is disabled when family="negbin".

If covidLPconfig = TRUE, the sampling uses the following configuration: chains = 1, warmup = 5000, thin = 3, sample_size = 1000, control = list(max_treedepth = 50, adapt_delta = 0.999), p = 0.08 for case_type = "confirmed" or p = 0.02 for case_type = "deaths", and init a list with default initial values for the parameters of each model available.

Using covidLPconfig = TRUE does not guarantee convergence of the chains. It only reproduces the results of the model fitted to the data considered in the CovidLP app (http://est.ufmg.br/covidlp/home/en/). With covidLPconfig = FALSE, each argument takes its default value unless the user specifies otherwise.
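
The documented configuration can be restated as a plain R list. The helper covidlp_settings is hypothetical: it is a summary of the values listed above, not the way PandemicLP implements covidLPconfig, and it leaves out the default init list, which is not detailed here.

```r
# Hypothetical helper that only restates the documented settings.
covidlp_settings <- function(case_type = c("confirmed", "deaths")) {
  case_type <- match.arg(case_type)
  list(
    chains      = 1,
    warmup      = 5000,
    thin        = 3,
    sample_size = 1000,
    control     = list(max_treedepth = 50, adapt_delta = 0.999),
    p           = if (case_type == "confirmed") 0.08 else 0.02
  )
}

cfg <- covidlp_settings("deaths")
cfg$p
cfg$warmup + cfg$thin * cfg$sample_size  # total iterations per chain
```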

Value

An object of S3 Class pandemicEstimated-objects representing the fitted results. The fit component of the pandemicEstimated class is an object of S4 Class stanfit.

References

CovidLP Team, 2020. CovidLP: Short and Long-term Prediction for COVID-19. Departamento de Estatistica. UFMG, Brazil. URL: http://est.ufmg.br/covidlp/home/en/

See Also

load_covid, posterior_predict.pandemicEstimated, pandemic_stats and plot.pandemicPredicted; summary.pandemicEstimated. See which models are available in the PandemicLP package in models.

Examples

## running pandemic_model may take a few minutes

### generalized logistic poisson model: ###############
## Not run: 
Y0=load_covid(country_name="Brazil",state_name="SP",last_date='2020-04-25')
plot(Y0,cases="new")
output0=pandemic_model(Y0)
print(output0)
#convergence diagnostics
traceplot(output0)
density(output0)
stan_ac(output0$fit,pars=c("a","b","c","f"))

Y1=load_covid(country_name="Brazil",state_name="SP",last_date='2020-06-18')
plot(Y1,cases="new")
output1=pandemic_model(Y1,case_type="deaths",covidLPconfig=TRUE)
print(output1)
#convergence diagnostics
traceplot(output1)
density(output1)
stan_ac(output1$fit,pars=c("a","b","c","f"))


Y2=load_covid(country_name="Argentina",last_date='2020-05-07')
plot(Y2,cases="new")
output2=pandemic_model(Y2,covidLPconfig=TRUE)
print(output2)
#convergence diagnostics
traceplot(output2)
density(output2)
stan_ac(output2$fit,pars=c("a","b","c","f"))


#including initial values for parameters:
inits3=list(
 list(a=95,b=0.8,c=0.3,f=1.1)
)
output3=pandemic_model(Y2,init=inits3,chains=1,warmup=3000)
print(output3)
#convergence diagnostics
traceplot(output3)
density(output3)
stan_ac(output3$fit,pars=c("a","b","c","f"))

#initial values for 2 chains:
inits4=list(
 list(a=95,b=0.8,c=0.3,f=1.1), list(f=1.01)
)
output4=pandemic_model(Y1,init=inits4,chains=2,warmup=3000)
print(output4)
# show all initial values input by user:
output4$config.inputs$use_inputs$init
#convergence diagnostics
traceplot(output4)
density(output4)
stan_ac(output4$fit,pars=c("a","b","c","f"))

### seasonal model: ###############
output5=pandemic_model(Y0,seasonal_effect=c("sunday","monday"))
print(output5)
#convergence diagnostics
traceplot(output5)
density(output5)
stan_ac(output5$fit,pars=c("a","b","c","f","d_1","d_2"))

## or, for 'seasonal_effect': a string vector with the first 3 letters of the weekday(s)
Y3=load_covid(country_name="Brazil",state_name="MG",last_date='2020-09-05')
plot(Y3,cases="new")
#weekdays effect : sunday and monday:
output6=pandemic_model(Y3,seasonal_effect=c("sun","mon"),covidLPconfig=TRUE)
print(output6)
#convergence diagnostics
traceplot(output6)
density(output6)
stan_ac(output6$fit,pars=c("a","b","c","f","d_1","d_2"))

### multi_waves(2) model: ######################
Y4=load_covid(country_name="United States of America",last_date='2020-09-27')
plot(Y4,cases="new")
output7=pandemic_model(Y4,n_waves=2,covidLPconfig=TRUE)
print(output7)
#convergence diagnostics
traceplot(output7)
density(output7)
stan_ac(output7$fit,pars=c("a1","b1","c1","alpha1","delta1","a2","b2","c2","alpha2","delta2"))

## End(Not run)



PandemicLP documentation built on March 18, 2022, 6:22 p.m.