In mrc-ide/esft: WHO ESFT Calculations

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(esft)

Overview

The goal for this vignette is to be stand alone documentation so that you do not have to open up any other vignette or document in order to run the full set of analyses.

I hope to additionally add vignettes with sources, greater details, and explanations of equations, so that if any further questions arise, they can help answer them, or provide greater detail on customizing the outputs.

Package build warning

There is one overlapping function between two of the dependencies that are loaded. This warning can be ignored:

replacing previous import ‘dplyr::last’ by ‘data.table::last’ when loading ‘esft’

Order of calculations

User input

Here the user would specify the same items that are in the User Input tab of the ESFT excel file, which would include the country, the forecast period, the first date of the forecast, and potentially the delivery lead time (although that is currently) not included. Note: the time period for which we can forecast is between 2020-01-01 to 2022-12-31, and is limited to the countries for which have model fits.

Load data

There are three options here:

Using the data loaded with the package (icl_data), subset according to country name and transmission scenario.
Downloading the data from GitHub, using the load_imperial_data function - although this is likely to be less sustainable in the long term, as we do not know how long the fits will be up
Downloading infections and deaths estimates from the WHO website. Either you can go directly to the first link on the website, or use the pre-loaded dataset called who_data, which was downloaded in July 2023.
Parameter setting

This category of calculations refers to the universal parameters that stay the same across groups of countries, and tend to be based either on collated clinical opinion, on available literature, or estimates from available data.

Capacity mapping

This category refers to estimates based on data collated from various sources, that is country specific. Where recent or country-specific estimates are not available, this group of calculations tends to provide estimates based on the World Bank income group to which the specified country belongs.

Weekly summaries

This includes summaries of cases, health care workers, tests, and patients per week. In short, these are 'people' calculations.

Forecasts

These are 'object' calculations. This includes commodity forecasts (which includes hygiene, case management, PPE, and diagnostic commodities), pharmaceutical forecasts, and non-COVID 19 essential equipment forecasts.

Right now there is no option for users to input existing equipment or capacity. Another limitation is that right now the order of date subsetting heavily influences the outcomes. Future work would be to rewrite calculations to address those two issues.

User Input

Here we set the user inputs and country specified. Note there is an option to manually specify the country code in most calculations, but this helps for consistency/clarity.

user <- user_input()

# we set the country to be afghanistan
country <- "AFG"

Load Data

In order to acquire the data used to inform the forecasts of the model, the first step is to acquire it. As mentioned before, there are three options to acquire the data: download it directly from github, load and subset the imperial data that comes with the package, load and subset the who data that comes with the package, or manually download the WHO data here at the first download link.

For the ICL data from GitHub, the scenario will default to medium, which is equivalent to a scenario where there are no changes in transmission and the status quo is maintained. These fits contain the death calibrated data. The preloaded data also contains death calibrated data but needs to be subset by scenario and country (see the transmission_scenarios dataset for more info). The preloaded who data only needs to be subset by country name.

# This will download specific country and scenario fits
imperial_download <- load_imperial_data(country_code = country)
head(imperial_download)

# This includes all data, so needs to be subset by country and scenario
data(icl_data, package="esft")
data(transmission_scenarios, package = "esft")

scenario = "Medium"
scenario_label <-
    transmission_scenarios$imperial_category_labels[transmission_scenarios$imperial_scenario == scenario]

icl_data <- subset(icl_data, icl_data$iso3c == country)
icl_data <- subset(icl_data, icl_data$scenario == scenario_label)

# This includes all country data (no scenario here as this is reported data), so needs to be subset by country name (as the country code is in a different format)
data(who_data, package="esft")
country.name <- countrycode::countrycode(sourcevar = country, origin = "iso3c", destination = "country.name")
who_data <- subset(who_data, who_data$Country == country.name)

Parameter Setting

We then obtain the parameters. We can manually set any parameters by passing a named list of parameters we would like to use to any of the functions, which we name overrides.

# This is the standard list of all parameters, many from inputs tab of excel sheet
params <- get_parameters()

head(params)

# Testing strategy parameters - population level testing parameters (i.e. percent
# tested)
test_strat <- set_testing_strategy()

head(test_strat)

# Diagnostic parameters - individual level testing parameters (i.e. number of tests
# per case per diagnosis, etc.)
test_params <- get_diagnostic_parameters()

head(test_params)

# Diagnostic lab parameters - number of lab staff per lab, safety boxes per unit,
# etc.
lab_params <- get_lab_parameters()
head(lab_params)

Example of how to pass overriding parameters to the functions.

# initialising list in the function call
params <- get_parameters(overrides = list(
  stay_mild = 3,
  o2_flow_sev = 5
))
# initialising list outside the function call (if many parameters to be overridden)
overrides <- list(perc_crit_inv_mv = 0.3)
params <- get_parameters(overrides)

Capacity Mapping

Once we get the parameters, it's time to use the country name and some pre-loaded datasets to calculate how many machines each country already has and thus how many tests they can process.

It is not strictly necessary to calculate capacity after setting the parameters, but thematically it makes sense to start with broad parameter setting and then drill down into country specific details.

# load data for the diagnostic capacity functions
data(throughput, package = "esft")
data(hours_per_shift, package = "esft")

# This functions pulls in data from the UN, the World Bank, the WHO, Imperial
# College London to collect country wide parameters which include population,
# number of HCWs, number of lab staff, etc.
capacity <- get_country_capacity(iso3c = country)

# This subsets the diagnostics dataframe to find the baseline estimates of
# diagnostic testing capacity provided in the WHO ESFT, in terms of number of
# modules (which are sub units of machines) activated for COVID-19 test processing.
country_test_capacity <- get_country_test_capacity(iso3c = country)

# This function then translates modules activated to total tests and total tests
# for COVID-19 that can be processed per day, given assumptions about hours per
# shift, number of shifts a day, how many tests can be processed in different
# shifts, how much the machines capacity are allocated to COVID-19 processing.
diagnostic_capacity <- calc_diagnostic_capacity(
  country_diagnostic_capacity = country_test_capacity,
  throughput,
  hours_per_shift = hours_per_shift,
  shifts_per_day = 1
)

# Calculates the total labs available for COVID-19 given the diagnostic capacity
# (number of modules and machines) available
t_labs <- total_labs(diagnostic_capacity)

# Uses the country specific diagnostic capacity estimates to calculate the max
# number of tests per day. This is then used to help cap tests by total testing
# capacity in total_tests.
max_tests <- max_tests_per_day(diagnostic_capacity)

Weekly Summaries

Up until now the outputs have been pretty easily modifiable and verifiable - these outputs tend to not change with time. However, now we get to the meat of the calculations, which are highly dependent on the input data provided to them. Depending on the country, the time during the pandemic, in addition to the time at which the model outputs were run, there will be different forecasted need. Furthermore, depending on when you set your start forecast, there will be different outputs. Additionally, the original ESFT had a different calculation for HCW caps than this one (it was assumed that the ESFT had an error, as it capped the HCWs twice), and so outputs related to HCWs are different here (but exact outputs can be replicated by passing the HCW caps manually as an overrides list).

That is all to say, while these calculations have been painstakingly manually checked and line up with the calculations in the ESFT, just like the ESFT, the outputs are highly sensitive to input parameters.

# Need to load the data within the same code snippet, or else an error is thrown
mydata <- load_imperial_data(country_code = country)

# This calculates new and cumulative cases weekly broken down by severity, by
# making certain assumptions about the proportions of infections and incidence
# at hospitals and ICUs that are of different case severities
cases <- cases_weekly(params,
  capacity,
  test_strategy_params = test_strat,
  data = mydata,
  user = user,
  data_source = "Imperial"
)

# Cases_weekly does not give the outcome with the first row = week1, but does 
# contain all the information required for the next step. With these
# removed, the outcomes would not map directly onto the ESFT excel sheet.

head(cases)

# Patients will still give a leading 0 for discharged_crit_patients, as this is dependent
# on the previous week and the structure of the calculation.
patients <- patients_weekly(params,
  capacity,
  data = cases,
  data_source = "Imperial",
  user = user
)

# Patients are additionally subset by the forecast period, as the next steps (HCW caps and HCWs weekly) depend on the forecast period as a whole.

head(patients)

# We can use the patients and previously set parameters to calculate the tests
# required per week per different categories
tests <- diagnostics_weekly(
  params = params,
  patients,
  cases,
  diagnostic_parameters = test_params,
  testing_scenario = test_strat
)

head(tests)

# Loading Healthcare Work Force Estimates
data(hwfe, package = "esft")

# Note: in the ESFT spreadsheet, there is an error in the HCW caps calculations:
# the HCWs are subject to two levels of caps. In this package, the HCWs are only
# capped once. In order to replicate the calculations exactly, we manually pass
# on the caps calculated in the spreadsheet (these are for Afghanistan) through
# an overrides list.
caps <- list(
  hcws_inpatients_cap = 5448,
  hcws_screening_cap = 919
)

hcw_caps <- hcw_caps(params,
  capacity,
  throughput,
  hwfe,
  patients,
  overrides = caps
)

head(hcw_caps)

# We can then use the HCW caps to estimate the HCWs required weekly, based on
# HCWs available, parameters, levels of illness in the country, and other elements
hcws <- hcws_weekly(
  params,
  capacity, # from get_country_capacity
  lab_params, # get_lab_parameters
  tests, # from diagnostics_weekly
  patients, # patients_weekly
  t_labs, # total_labs
  hcw_caps
)

head(hcws)

# We use the capacity estimates, tests, and caps to calculate the number of HCWs
# needed weekly to screen COVID-19 patients
screening_hcws <- screening_hcws_weekly(
  tests, hcw_caps,
  capacity
)

# We also calculate the tests for contacts and HCWs every week, in addition to
# the tests of the populations for diagnosis and release
added_tests <- additional_testing(
  hcws, # from hcws_weekly
  screening_hcws, # from screening_hcws_weekly
  test_strat, # from set_testing_strategy
  tests
)

# We use these inputs to calculate the total number of tests per week, capped or
# uncapped by testing capacity
n_tests <- total_tests(tests, added_tests, max_tests)

# This can be calculated using only parameter and capacity level calculations.
# Basically gives the percentage of allocations of the different types of machines
# to COVID-19 processing.
test_ratios <- test_ratio(diagnostic_capacity, test_params)

Forecasts

Now that we have the weekly summaries in addition to some intermediary calculations, we can go on to calculating what we might need to respond to this epidemic: i.e. the pharmaceutical, equipment, and non-covid essential items forecasts.

# First, we start with loading the reference data for items required per case or
# per HCW
data(noncovid, package = "esft")
data(pharmaceuticals, package = "esft")
data(equipment, package = "esft")
data(who, package = "esft")
data(throughput, package = "esft")

# We start then by calculating the number of HCWs in the country, based on
# WHO data sources
ref_hcws <- reference_hcw(iso3c = "AFG", params, who, throughput)

# These outputs are used to calculate the number of items needed for the
# non-COVID-19 focused HCWs.
noncovid_ess <- noncovid_essentials(noncovid, ref_hcws)

head(noncovid_ess)

# Case counts and pharmaceutical recommendations are used to forecast total
# pharmaceutical needs for the period at hand.
pharma <- pharma_forecast(
  pharmaceuticals,
  cases
)

head(pharma)

# Weekly summary outputs and the WHO recommendations are used to forecast the
# total hygiene equipment need each week during the forecast period.
hygiene <- hygiene_forecast(
  equipment,
  hcws,
  patients,
  cases,
  tests,
  screening_hcws,
  params
)

head(hygiene)

# Patient counts and WHO recommendations are used to forecast case
# management equipment need each week during the forecast period.
case_management <- case_management_forecast(
  equipment,
  patients
)

head(case_management)

# Weekly summary outputs and WHO recommendations are used to forecast total PPE
# demand each week during the forecast period.
ppe <- ppe_forecast(
  equipment,
  hcws,
  patients,
  cases,
  tests,
  screening_hcws,
  params
)

head(ppe)

# Testing outputs and WHO recommendations are used to forecast total diagnostic
# supply need each week during the forecast period.
diagnostic_supplies <- diagnostics_forecast(
  lab_params,
  equipment,
  test_ratios,
  n_tests,
  patients
)

head(diagnostic_supplies)