knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Here we provide an example of how to use focustools
to generate forecasts for COVID-19 Forecast Hub targets via time series approaches. First, load the focustools
package and other ancillary packages.
library(focustools) library(dplyr) library(readr) library(purrr) library(fabletools) library(fable)
First, let's retrieve COVID-19 data from the JHU source. Note that the get_cases()
and get_deaths()
functions are executed separately, then joined and converted to a tsibble
. Note that granularity="state"
would pull in data at the state and territory level.
## Get data at the national scale from JHU source usac <- get_cases(source="jhu", granularity = "national") usad <- get_deaths(source="jhu", granularity = "national") ## Use the focustools helper to prep the tsibble format usa <- dplyr::inner_join(usac, usad, by = c("location", "epiyear", "epiweek")) %>% make_tsibble(chop=TRUE)
## Pull from internal data usa <- focustools:::usa_20210329
With the data prepared, we can now fit the models. In this case, we'll fit two models. One will be an ARIMA of incident cases, with parameters automatically optimized. For more details see ?fable::ARIMA
. In this case we will restrict the order of non-seasonal auto-regressive terms, order of integration for non-seasonal differencing, and order of non-seasonal moving average terms. The other model will use the three week lagged incident case counts to predict incident deaths:
## Fit incident case model fits.icases <- usa %>% model(arima = ARIMA(icases~PDQ(0,0,0)+pdq(1:2,0:2,0), stepwise=FALSE, approximation=FALSE)) ## Fit incident death model based on cases three weeks ago fits.ideaths <- usa %>% model(linear_caselag3 = TSLM(ideaths ~ lag(icases, 3)))
With the model fit objects created, we can generate forecasts (including point and quantile estimates) for the outcomes of interest at an arbitrary horizon (here, 4 weeks). Note that given the model we have chosen for the incident deaths outcome, we do need to make sure to generate the incident case forecast first so that we can retrieve point estimates for future cases, which will be passed into the incident deaths forecast:
## Generate incident case forecast forc.icases <- ts_forecast(fits.icases, outcome = "icases", horizon = 4) ## Need to get future cases to pass to ideaths forecast futr.icases <- ts_futurecases(usa, forc.icases, horizon = 4) ## Forecast incident deaths based on best guess for cases forc.ideaths <- ts_forecast(fits.ideaths, outcome = "ideaths", new_data = futr.icases)
We can use the forecasts of incident deaths to infer cumulative deaths. The ts_forecast()
function is flexible enough to switch methods internally, requiring a "inc_forecast" parameter if either "cdeaths" or "ccases" is passed as the "outcome" for the forecast:
## Generate cumulative death forecast forc.cdeaths <- ts_forecast(outcome = "cdeaths", .data = usa, inc_forecast = forc.ideaths)
Prior to submitting these forecasts to the COVID-19 Forecast Hub we need to prepare them in the canonical submission format. Data prep in format_for_submission
includes naming targets appropriately and ensuring that target dates adhere to epidemiological week format:
## Create submission object submission <- list(format_for_submission(forc.icases, target_name = "inc case"), format_for_submission(forc.ideaths, target_name = "inc death"), format_for_submission(forc.cdeaths, target_name = "cum death")) %>% reduce(bind_rows) %>% arrange(target) head(submission) tail(submission)
Lastly, we can validate the submission. The object must be written to a file and then validated with validate_forecast()
, which wraps a python function developed and maintained by the COVID-19 Forecast Hub organizers.
ffile <- file.path(tempdir(), paste0(Sys.Date(), "-focustools-ts.csv")) submission %>% write_csv(ffile) validate_forecast(ffile)
We can use plot_forecast()
to visualize these forecasts:
plot_forecast(.data=usa, submission=submission)
And we can use submission_summary()
to produce a summary of the forecasts for each target (cumulative deaths, incident cases, incident deaths):
subsum <- submission_summary(.data=usa, submission=submission) subsum
We could optionally coerce this structure into a table. For counts:
subsum$counts %>% bind_rows(.id="target") %>% knitr::kable()
Or percent difference:
subsum$perc_diff %>% bind_rows(.id="target") %>% knitr::kable()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.