In nikosbosse/epipredictr: What the Package Does (One Line, Title Case)

N. Bosse (1), S. Abbott (1), J. D. Munday (1), J. Hellewell (1), CMMID COVID team (1), S. Funk (1).

Correspondence to: nikos.bosse@lshtm.ac.uk

1. Center for the Mathematical Modelling of Infectious Diseases, London School of Hygiene & Tropical Medicine, London WC1E 7HT, United Kingdom

Last Updated: r (Sys.Date() - 1)

Note: this is preliminary analysis, has not yet been peer-reviewed and is updated daily as new data becomes available. This work is licensed under a Creative Commons Attribution 4.0 International License. A summary of this report can be downloaded here

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>", 
  echo = FALSE, eval = TRUE,
  #fig.width = 6, fig.height = 3,
  message = FALSE,
  warning = FALSE,
  error = TRUE,
  dpi = 320,
  fig.path = "figure/"
)

# library(epipredictr)
# library(rstan)
# options(mc.cores = 4)
# rstan_options(auto_write = TRUE)
library(ggplot2)
library(scoringutils)
library(dplyr)
library(cowplot)
library(patchwork)
library(bsts)
par(family = "Serif")
source("../R/utilities.R")
source("../R/forecast.R")
source("../R/incidence_prediction.R")
source("../R/scoring.R")
source("../R/plotting.R")

# analysis <- readRDS("../data/analysis/analysis.rds")
# all_scores <- readRDS("../data/analysis/all_scores.rds")
# aggregate_scores <- readRDS("../data/analysis/aggregate_scores.rds")
# plot_scoring <- readRDS("../data/analysis/plot_scoring.rds")
# plot_predictions <- readRDS("../data/analysis/plot_predictions.rds")
# predicted_incidences <- readRDS("../data/analysis/predicted_incidences.rds")

# models <- c("local", "semilocal", "local_student", "ar1", "ar2")
# data <- list(inputdata = inputdata, 
#            last_date = max(inputdata$date),
#              models = models,
#              incidences = incidences,
#              n_pred = 7,
#              start_period = 4)

# analysis <- full_analysis(data)
# full_predictive_samples <- analysis$full_predictive_samples
# 
# all_scores <- scoring(data, analysis$full_predictive_samples)
# 
# aggregated_scores <- aggregate_scores(all_scores)
# 
# 
# 
# plot_scoring <- plot_scoring(data, aggregated_scores)




# predict_incidences <- predict_incidences(data, full_predictive_samples)


# 
# prediction_results <- lapply(
#   seq_along(analysis$countries),
#   FUN = function(i) {
#     region <- analysis$countries[i]
#     results_region_model <-
#       analysis$all_region_results[[region]]$region_results$average
#     forecast_table(results_region_model, region)
#   }
# )
# 
# summary_prediction_result <- do.call(rbind, prediction_results)

Summary

Aim

Forecast the time-varying $R_t$ estimates into the future to get a sense of the trajectory of the COVID-19 epidemic in different countries.

Abstract

some abstract.

Key Results {.tabset}

Forecasts for regions with highest forecast

plot with incidences on top Rt estimeates below for those regions

Forecast summary table

1, 3, 7, 14 day ahead CI

Methods {.tabset}

Summary

We used different Bayesian Structural Time Series Models from the R package bsts to forecast transmission rates ($R_0$).
Transmission rates were obtained according to https://cmmid.github.io/topics/covid19/current-patterns-transmission/global-time-varying-transmission.html.
The following Bayesian Structural Time Series Models were used:
BSTS with a local trend
BSTS with a global trend
BSTS with a local trend and a Student's-t-distribution
BSTS with an AR 1
BSTS with an AR 2
Results were assessed for calibration, bias, and sharpness and scored using the proper scoring rules CRPS and LogS with the scoringRules package.

Limitations

$R_t$-estimates

All limitations for estimating $R_t$ apply, especially:

others

time horizon
linear trends --> would be good to have sth quadratic or sth following the derivative

Detail

Obtaining Estimates for $R_0$

see https://cmmid.github.io/topics/covid19/current-patterns-transmission/global-time-varying-transmission.html We obtain a time series with the median of estimated $R_t$ values. This time series is used for predictions.

Predicitions

The time series is then forecasted into the future using the R package bsts. Forecasts are made every day for up to 7 days ahead. This allows us to validate the predictions against actual observations when they come in. For every time point we have multiple predictive samples, each based on different data x days ago, where x corresponds to the number of days forecasted into the future. The following models were used:

BSTS with a local linear trend

The time series is modeled as $$\mu_t+1 = \mu+t + \delta_t + \epsilon_t, \quad \epsilon_t \sim \mathcal{N}(0, \sigma_{\mu}), $$ $$\delta_t = \delta_t + \eta_t, \quad \eta_t \sim \mathcal{N}(0, \sigma_{\delta}). $$ The local linear trend model therefore assumes that both the mean of the time series as well as the slope follow random walks.

BSTS with a local linear trend and Student's-t-distributed errors

The model specification is very similar to the local linear trend model. The only change is that random errors now follow a t-distribution, which implies thicker tails and therefore more room for extreme changes. $$\mu_t+1 = \mu+t + \delta_t + \epsilon_t, \quad \epsilon_t \sim \mathcal{T}{\nu{\mu}}(0, \sigma_{\mu}), $$ $$\delta_{t+1} = \delta_t + \eta_t, \quad \eta_t \sim \mathcal{T}{\nu{\delta}}(0, \sigma_{\delta}). $$ $\nu_{\mu}$ and $\nu_{\delta}$ are parameters that determine the thickness of the tails.

BSTS with a semi-local linear trend

The semi-local linear trend model assumes that the mean of the time series moves according to a random walk with slope. The slope component is an AR1 process centred on a constant trend D. For any non-zero value $\phi$, the slope will eventually become a constant number for long time horizons, resulting in a linear trend for the overall time series. Values of $\phi$ closer to one imply that the time series (or forecasts made by the model, respectively) will converge quicker to a purely linear trend. The time series is modeled as $$\mu_t+1 = \mu+t + \delta_t + \epsilon_t, \quad \epsilon_t \sim \mathcal{N}(0, \sigma_{\mu}), $$ $$\delta_{t+1} = D + \phi (\delta_t -D), \quad \eta_t \sim \mathcal{N}(0, \sigma_{\delta}). $$

Maybe: plots for different trend values we made earlier.

BSTS with an AR1 and AR2 state component

The mean of the time series is modeled as an AR(1)-process (or AR(2), respectively). The AR(1) model looks like this:

$$\alpha_t = \phi_1 \alpha_{t-1} + \epsilon_{t-1}, \quad \epsilon_t \sim \mathcal{N}(0, \sigma_{\delta}). $$ The AR(2) model looks like this: $$\alpha_t = \phi_1 \alpha_{t-1} + \phi_2 \alpha_{t-2} + \epsilon_{t-1}, \quad \epsilon_t \sim \mathcal{N}(0, \sigma_{\delta}). $$

Model stacking

yet to be implemented

Scoring

PIT and calibration

bias

sharpness

CRPS

DSS

Scoring {.tabset}

Across all countries {.tabset}

Graph

Scoring table

By country

same es before

Detailed Results by country {.tabset}

Japan {.tabset}

Prediction vs. Actual

analysis$country_results$Japan$forecast_plot

Prediction Scores

nikosbosse/epipredictr documentation built on April 7, 2020, 9:51 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

nikosbosse/epipredictr
What the Package Does (One Line, Title Case)

In nikosbosse/epipredictr: What the Package Does (One Line, Title Case)

Summary

Aim

Abstract

Key Results {.tabset}

Forecasts for regions with highest forecast

Forecast summary table

Methods {.tabset}

Summary

Limitations

$R_t$-estimates

others

Detail

Obtaining Estimates for $R_0$

Predicitions

BSTS with a local linear trend

BSTS with a local linear trend and Student's-t-distributed errors

BSTS with a semi-local linear trend

BSTS with an AR1 and AR2 state component

Model stacking

Scoring

PIT and calibration

bias

sharpness

CRPS

DSS

Scoring {.tabset}

Across all countries {.tabset}

Graph

Scoring table

By country

Detailed Results by country {.tabset}

Japan {.tabset}

Prediction vs. Actual

Prediction Scores

R Package Documentation

Browse R Packages

We want your feedback!

nikosbosse/epipredictr What the Package Does (One Line, Title Case)

In nikosbosse/epipredictr: What the Package Does (One Line, Title Case)

Summary

Aim

Abstract

Key Results {.tabset}

Forecasts for regions with highest forecast

Forecast summary table

Methods {.tabset}

Summary

Limitations

$R_t$-estimates

others

Detail

Obtaining Estimates for $R_0$

Predicitions

BSTS with a local linear trend

BSTS with a local linear trend and Student's-t-distributed errors

BSTS with a semi-local linear trend

BSTS with an AR1 and AR2 state component

Model stacking

Scoring

PIT and calibration

bias

sharpness

CRPS

DSS

Scoring {.tabset}

Across all countries {.tabset}

Graph

Scoring table

By country

Detailed Results by country {.tabset}

Japan {.tabset}

Prediction vs. Actual

Prediction Scores

R Package Documentation

Browse R Packages

We want your feedback!

nikosbosse/epipredictr
What the Package Does (One Line, Title Case)