Hospital Capacity Planning Using Discrete Event Simulation: Introduction

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "%>"
)

Introduction

Data and Copyright Notes

The simulator requires two types of data:

  1. simulation data that describes the spread of the pandemic over time and
  2. field data that contains daily resource usage data.

Included in the package are tools to generate synthetic data, e.g., you can generate simulation and field data to run the simulations. This procedure is described in the paper "Optimization of High-dimensional Simulation Models Using Synthetic Data".

To demonstrate the usage of real-world data, we have included sample datasets from Germany.

Simulation Data

We have included a data sample from the German Robert Koch-Institute (RKI). Please take the following copyright notice under advisement, if you plan to use the RKI data included in the package:

Die Daten sind die „Fallzahlen in Deutschland“ des Robert Koch-Institut (RKI) und stehen unter der Open Data Datenlizenz Deutschland Version 2.0 zur Verfügung. Quellenvermerk: Robert Koch-Institut (RKI), dl-de/by-2-0
Haftungsausschluss: „Die Inhalte, die über die Internetseiten des Robert Koch-Instituts zur Verfügung gestellt werden, dienen ausschließlich der allgemeinen Information der Öffentlichkeit, vorrangig der Fachöffentlichkeit“.

Taken from here.

Field Data

We have included a data sample from the German DIVI Register. Please take the following copyright notice under advisement. The DIVI data are not open data. The following statement can be found on the DIVI web page:

Eine weitere wissenschaftliche Nutzung der Daten ist nur mit Zustimmung der DIVI gestattet. Therefore, only an example data set, that reflects the structure of the original data from the DIVI register, is included in the babsim.hospital package as icudata.

Packages

suppressPackageStartupMessages({
library("SPOT")
library("babsim.hospital")
library("simmer")
library("simmer.plot")
})

We need at least version 2.1.8 of SPOT.

packageVersion("SPOT")

Data used by babsim.hospital

We combine data from two different sources:

  1. simData: simulation data, i.e., input data for the simulation. Here, we will use data from the Robert Koch-Institute in Germany.
  2. fieldData: real data, i.e., data from the DIVI-Intensivregister. The field data is used to validate the output of the simulation.

The babsim.hospital simulator models resources usage in hospitals, e.g., number of ICU beds ($y$), as a function of the number of infected individuals ($x$). In addition to the number of infections, information about age and gender will be used as simulation input.

We will take a closer look at the required input data in the following sections.

Simulation Data: RKI Data

Get Data From the RKI Server

babsim.hospital provides a function to update the (daily) RKI data.

updateRkidataFile("https://www.arcgis.com/sharing/rest/content/items/f10774f1c63e40168479a1feb6c7ca74/data")

Users are expected to adapt this function to their local situation.

The downloaded data will be available as rkidata.

Due to data size limits on CRAN, the full dataset is not included in the babsim.hospitalpackage. Instead, we provide a subset of the Robert Koch-Institut dataset with 10,000 observations in the package.

str(babsim.hospital::rkidata)

Copyright notice for the data:

Die Daten sind die „Fallzahlen in Deutschland“ des Robert Koch-Institut (RKI) und stehen unter der Open Data Datenlizenz Deutschland Version 2.0 zur Verfügung. Quellenvermerk: Robert Koch-Institut (RKI), dl-de/by-2-0
Haftungsausschluss: „Die Inhalte, die über die Internetseiten des Robert Koch-Instituts zur Verfügung gestellt werden, dienen ausschließlich der allgemeinen Information der Öffentlichkeit, vorrangig der Fachöffentlichkeit“.

The rkidata can be visualized as follows (here region = 0 is Germany, region = 5 is North Rhine-Westphalia, region = 5374 Oberbergischer Kreis, etc.):

p <- ggVisualizeRki(data=babsim.hospital::rkidata, region = 5374)
print(p)

Preprocessed RKI Data

Not all the information from the original rkidata data set is required by the babsim.hospital simulator. The function getRkiData() extracts the subset of the raw rkidata required by our simulation, optimization, and analysis:

rki <- getRkiData(rki = rkidata)
str(rki)

As illustrated by the output from above, we use the following data: 1. Altersgruppe: age group (intervals, categories), represented as character string 1. Geschlecht: gender 1. Day: day of the infection 1. IdBundesland: federal state 1. IdLandkreis: county 1. time: number of days (0 = start data). It will be used as arrivalTimes for the simmer simulations. 1. Age: integer representation of Altersgruppe

Field Data (Real ICU Beds)

Get Data From the DIVI Server

Similar to the rkidata, which is available online and can be downloaded from the RKI Server, the field data is also available online. It can be downloaded from the DIVI Server as follows, where YYYY-MM-DD should be replaced by the current date, e.g, 2020-10-26.

updateIcudataFile("https://www.divi.de/joomlatools-files/docman-files/divi-intensivregister-tagesreports-csv/DIVI-Intensivregister_YYYY-MM-DD_12-15.csv") 

Note: The data structures on the DIVI server may change, so it might be necessary to modify the following procedure. Please check the hints on the DIVI web page. Contrary to the updateRkidataFile() function, which downloads the complete historical dataset, the updateIcudataFile() function only downloads data for a single date.

The downloaded data will be available in babsim.hospital as icudata.

Important: The DIVI dataset is not open data. The following statement can be found on the DIVI web page:

Eine weitere wissenschaftliche Nutzung der Daten ist nur mit Zustimmung der DIVI gestattet.

Therefore, only an example data set, that reflects the structure of the original data from the DIVI register, is included in the babsim.hospital package as icudata:

str(babsim.hospital::icudata)

The ìcudata can be visualized as follows (region = 0 is Germany, region = 5 is North Rhine-Westphalia, region = 5374 is the Oberbergischer Kreis, etc.)

p <- ggVisualizeIcu(region = 5374)
print(p[[1]])
print(p[[2]])

Additional Data

GV-ISys: Gemeindeverzeichnis-Informationssystem (GV-ISys) of the German Federal Statistics Office

Preprocessing DIVI/ICU Data

Note: ICU beds without ventilation can be calculated as faelle_covid_aktuell - faelle_covid_aktuell_beatmet.

The function getIcuBeds() converts the 9 dimensional DIVI ICU dataset icudata (bundesland,gemeindeschluessel,..., daten_stand) into a data.frame with three columns:

  1. bed
  2. intensiveBedVentilation
  3. Day
fieldData <- getIcuBeds(babsim.hospital::icudata)
str(fieldData)

The field data based on icudata uses two bed categories: 1. intensiveBed: ICU bed without ventilation 2. intensiveBedVentilation: ICU bed with ventilation

Performing Simulations

To run a simulation, the setting must be configured (seed, number of repeats, sequential or parallel evaluation, variable names, dates, etc.)

region = 5374 # Germany, 5315 is Cologne, 5 is NRW
seed = 123
simrepeats = 2
parallel = FALSE
percCores = 0.8
resourceNames =  c("intensiveBed", "intensiveBedVentilation")
resourceEval = c("intensiveBed", "intensiveBedVentilation")

We can specify the field data based on icudata (DIVI) for the simulation as follows:

FieldStartDate = "2020-09-01"
# Felddaten (Realdaten, ICU):
icudata <- getRegionIcu(data = icudata, region = region)
fieldData <- getIcuBeds(icudata)
fieldData <- fieldData[which(fieldData$Day >= as.Date(FieldStartDate)), ]
rownames(fieldData) <- NULL
icu = TRUE
icuWeights = c(1,1)

Next, simulation data (RKI data) can be selected. The simulation data in our example, depend on the field data:

SimStartDate = "2020-08-01"
rkidata <- getRegionRki(data = rkidata, region = region)
simData <- getRkiData(rkidata)
simData <- simData[which(simData$Day >= as.Date(SimStartDate)), ]
## Auch mit fieldData cutten damit es immer das gleiche Datum ist
simData <- simData[as.Date(simData$Day) <= max(as.Date(fieldData$Day)),]
## time must start at 1
simData$time <- simData$time - min(simData$time)
rownames(simData) <- NULL

Finally, we combine all field and simulation data into a single list() called data:

data <- list(simData = simData, fieldData = fieldData)

Configuration information is stored in the conf list, i.e., conf refers to the simulation configuration, e.g., sequential or parallel evaluation, number of cores, resource names, log level, etc.

conf <- babsimToolsConf()
conf <- getConfFromData(conf = conf,
                        simData = data$simData,
                        fieldData = data$fieldData)
conf$parallel = parallel
conf$simRepeats = simrepeats
conf$ICU = icu
conf$ResourceNames = resourceNames
conf$ResourceEval = resourceEval
conf$percCores = percCores
conf$logLevel = 0
conf$w2 = icuWeights
set.seed(conf$seed)

Simulation Model Parameters

The core of the babsim.hospital simulations is based on the simmer package.

It uses simulation parameters, e.g., arrival times, durations, and transition probabilities. There are currently 29 parameters (shown below) that are stored in the list para.

para <- babsimHospitalPara()
str(para)

Run simulation

rkiWithRisk <- getRkiRisk(data$simData, para)
head(rkiWithRisk)
arrivalTimes <- data.frame(time = rkiWithRisk$time,
                             risk = rkiWithRisk$Risk)
envs <- babsimHospital(arrivalTimes = arrivalTimes,
                         conf = conf,
                         para = para)

Visualize Output

Simmer Plots

resources <- get_mon_resources(envs)
resources$capacity <- resources$capacity/1e5
plot(resources, metric = "usage", c("bed", "intensiveBed", "intensiveBedVentilation"), items = "server")
plot(resources, metric = "usage", "bed", items = "server", steps = TRUE)
  1. The following command generates a plot of icu beds without ventilation:
plot(resources, metric = "usage", "intensiveBed", items = "server", steps = TRUE)
  1. The following command generates a plot of icu beds with ventilation:
plot(resources, metric = "usage", "intensiveBedVentilation", items = "server", steps = TRUE)

Evaluate Simulation Results

fieldEvents <- getRealBeds(data = data$fieldData,
                        resource = conf$ResourceNames)
res <- getDailyMaxResults(envs = envs,  fieldEvents = fieldEvents, conf=conf)
resDefault <- getError(res, conf=conf)

The error is r resDefault.

p <- plotDailyMaxResults(res)
plot(p)
plotly::ggplotly(p) 

Optimization

para <- babsimHospitalPara()
library("babsim.hospital")
library("SPOT")
library("simmer")
dir.create("results")
res202010262 <- runoptDirect(
  expName = paste0("test_", format(Sys.time(), "%Y_%b.%d_%H.%M_V"), utils::packageVersion("babsim.hospital"),"R"),
  region = 5374,
  rkiwerte = babsim.hospital::rkidata,
  icuwerte = babsim.hospital::icudata,
  TrainSimStartDate = Sys.Date() - 10*7, 
  TrainFieldStartDate = Sys.Date() - 6*7, 
  TestSimStartDate = Sys.Date() - 8*7, 
  TestFieldStartDate = Sys.Date() - 4*7, 
  Overlap = 0,
  seed = 101170,
  direct = TRUE,
  repeats = 1,
  funEvals = 40,
  funEvalsFactor = 0,
  size = 35,
  simrepeats = 1,
  parallel = TRUE,
  percCores = 0.9,
  icu = TRUE,
  icuWeights = c(1,1),
  verbosity=11,
  testRepeats = 1,
  tryOnTestSet = TRUE
)

Use Optimized Parameters

para <- getBestParameter(getParaSet(5315))
res <- modelResultHospital(para=para, 
                           conf=conf,
                           data = data)
resOpt <- getError(res, conf=conf)
p <- plotDailyMaxResults(res)
print(p)
plotly::ggplotly(p)

Smooth Parameters

para <- smoothParameter(getBestParameter(getParaSet(5374)), 
                getBestParameter(getParaSet(5)) )

Visualize and Analyze Parameter Settings

para <- babsimHospitalPara()
getMatrixP(para = para)
visualizeGraph(para=para, option = "P")
visualizeGraph(para = para, option = "D")
getMatrixD(para = para)

Extend RKI Data

data <- getRkiData(babsim.hospital::rkidata)
n <-  as.integer( max(data$Day)-min(data$Day) )
StartDay <- min(data$Day) + round(n*0.9)  
data <- data[which(data$Day >=  StartDay), ]
EndDate <- max(data$Day) + 2
dataExt <- extendRki(data = data, 
                     EndDate = EndDate,
                     R0 = c(0.1, 0.2))
visualizeRkiEvents(data = data, region=5374)
visualizeRkiEvents(data = dataExt, region = 5374)

Sensitivity Analysis

Quick Analysis

library("rpart")
library("rpart.plot")
library("babsim.hospital")
library("SPOT")
param <- getParaSet(5374)
n <- dim(param)[2] - 1
y <- param[,1]
x <- param[,2:dim(param)[2]]
fitTree <- buildTreeModel(x=x,
                 y=y,
                 control = list(xnames = paste0('x', 1:n)))
rpart.plot(fitTree$fit)
getParameterNameList(c(24, 25, 3, 10))
library("rpart")
library("rpart.plot")
param <- getParaSet(5315)
n <- dim(param)[2] - 1
y <- param[,1]
x <- param[,2:dim(param)[2]]
fitTree <- buildTreeModel(x=x,
                 y=y,
                 control = list(xnames = paste0('x', 1:n)))
rpart.plot(fitTree$fit)

Sensitivity Analysis

res <- res202010262[[2]][[1]]
xBest <- res$xbest
n <- length(xBest)
print(xBest)
n <- n-1
t(getParameterNameList(1:n))
SPOT::plotModel(res$modelFit, which = c(16,2) ,xlab = c("Modellierungsparameter (Varianz), GammaShapeParameter", "x2: Normalstation zu Genesen (AmntDaysNormalToHealthy)"))

A regression-based parameter screening can be performed to discover relevant (and irrelevant) model parameters:

fitLm <- SPOT::buildLM(x=res$x,
                 y=res$y,
                 control = list(useStep=TRUE))
summary(fitLm$fit)
getParameterName(7)
library("rpart")
library("rpart.plot")
fitTree <- buildTreeModel(x=res$x,
                 y=res$y,
                 control = list(xnames = paste0('x', 1:n)))
rpart.plot(fitTree$fit)

Estimate Risks

An exponential model with two parameters was chosen to model risk as a function of age and gender: $r(x) = a\exp(b\,x)$.

age <- c(2,10,25,47,70,90)
risk <- c(0.01,0.07,0.15,0.65,3,12.64)
fit <- nls(risk ~ a * exp( b * age), 
           start = list(a = 1, b = 0),
           control= nls.control(maxiter = 50, tol = 1e-05, minFactor = 1/1024,
                                printEval = FALSE, warnOnly = FALSE))
print(coef(fit))
{plot(age,2*risk)
    # female:
    lines(age, 1* predict(fit, list(x = age)))
    # male:
    lines(age, 2* predict(fit, list(x = age) ), col ="red")}

The RKI Data Classes

Raw data: rkidatafull

The full, unmodified RKI data set, can be downloaded from the RKI web page. Once downloaded, it is accessible as rkidataFull.

Note: rkidataFull is a large data set, which is not included in the CRAN version.

dim(babsim.hospital::rkidataFull)

Class rkidata

The rkidata data set is a subset of the rkidataFull data set. It contains data from 2020-09-01 until today. The rkidata subset is used, because the COVID-19 pandemic behavior changed over time. The period from September is sometimes referred to as the second wave. Note: the full rkidata set is not included in the CRAN version. The CRAN version includes a smaller data set:

str(babsim.hospital::rkidata)

Class simData

To convert data from the rkidata format, the function getRkiData() can be used. The function generates data that can be used as simmer arrival events, one arrival is listed in each row. Each arrival includes the following information:

Note: As mentioned earlier, the CRAN package does not include the full RKI dataset. Only a sample is included as rkidata To perform real simulations, the user has to download the full RKI data set as described above.

rkiSimData <- getRkiData(babsim.hospital::rkidata)
str(rkiSimData)

Adding risk information to simData class data

The function getRkiRisk(simData, para) adds a numerical risk value, which is based on Age and Geschlecht to simData data. The parameters RiskFactorA, RiskFactorB, and RiskMale from para are used for the risk calculation.

para <- babsimHospitalPara()
print(para$RiskFactorA)
print(para$RiskFactorB)
print(para$RiskMale)
rkiRiskSimData <- getRkiRisk(rkiSimData, para)
str(rkiRiskSimData)

The Class arrivaldata

arrivalTimes <- data.frame(time = rkiRiskSimData$time, risk = rkiRiskSimData$Risk)
str(arrivalTimes)

Optimization Details

Bounds

getParameterDataFrame()
bounds <- getBounds()
print(bounds)

References

\bibliography{../bab-bibfiles/bartzAll.bib}

Appendix

Run Scripts

The following R scripts demonstrate, how optimizations runs for the Oberbergische Kreis (OBK), Koeln, and NRW can be started.

Note: runs take several minutes/hours.

Oberbergischer Kreis (OBK)

library("babsim.hospital")
library("SPOT")
library("simmer")

runoptDirect(
  expName = paste0("obk_", format(Sys.time(), "%Y_%b.%d_%H.%M_V"), utils::packageVersion("babsim.hospital"),"R"),
  region = 5374,
  rkiwerte = babsim.hospital::rkidata,
  icuwerte = babsim.hospital::icudata,
  TrainSimStartDate = Sys.Date() - 10*7, # 11*7, #10*7, # "2020-09-03",
  TrainFieldStartDate = Sys.Date() - 6*7, # 8*7, # "2020-10-03",
  #TestSimStartDate = Sys.Date() - 8*7, # 6*7 , #"2020-09-23",
  #TestFieldStartDate = Sys.Date() - 4*7, #"2020-10-23",
  Overlap = 0,
  seed = 101170,
  direct = TRUE,
  repeats = 1000,
  funEvals = 1000,
  funEvalsFactor = 0,
  size = 250,
  simrepeats = 10,
  parallel = TRUE,
  percCores = 0.9,
  icu = TRUE,
  icuWeights = c(1,1),
  verbosity=11,
  testRepeats = 10,
  tryOnTestSet = FALSE
)

Koeln (Cologne)

library("babsim.hospital")
library("SPOT")
library("simmer")

runoptDirect(
  expName = paste0("koeln_", format(Sys.time(), "%Y_%b.%d_%H.%M_V"), utils::packageVersion("babsim.hospital"),"R"),
  region = 5315,
  rkiwerte = babsim.hospital::rkidata,
  icuwerte = babsim.hospital::icudata,
  TrainSimStartDate = Sys.Date() - 10*7, # 10*7, # "2020-09-03",
  TrainFieldStartDate = Sys.Date() - 6*7, # "2020-10-03",
  # TestSimStartDate = Sys.Date() - 8*7, # 6*7 , #"2020-09-23",
  # TestFieldStartDate = Sys.Date() - 4*7, #"2020-10-23",
  Overlap = 0,
  seed = 101170,
  direct = TRUE,
  repeats = 1000,
  funEvals = 1000,
  funEvalsFactor = 0,
  size = 250,
  simrepeats = 10,
  parallel = TRUE,
  percCores = 0.9,
  icu = TRUE,
  icuWeights = c(1,1),
  verbosity=11,
  testRepeats = 10,
  tryOnTestSet = FALSE
)

NRW

library("babsim.hospital")
library("SPOT")
library("simmer")

runoptDirect(
  expName = paste0("nrw_", format(Sys.time(), "%Y_%b.%d_%H.%M_V"), utils::packageVersion("babsim.hospital"),"R"),
  region = 5374,
  rkiwerte = babsim.hospital::rkidata,
  icuwerte = babsim.hospital::icudata,
  TrainSimStartDate = Sys.Date() - 10*7, #10*7, # "2020-09-03",
  TrainFieldStartDate = Sys.Date() - 6*7, # "2020-10-03",
  # TestSimStartDate = Sys.Date() - 8*7, #6*7 , #"2020-09-23",
  # TestFieldStartDate = Sys.Date() - 4*7, #"2020-10-23",
  Overlap = 0,
  seed = 101170,
  direct = TRUE,
  repeats = 1000,
  funEvals = 1000,
  funEvalsFactor = 0,
  size = 250,
  simrepeats = 10,
  parallel = TRUE,
  percCores = 0.9,
  icu = TRUE,
  icuWeights = c(1,1),
  verbosity=11,
  testRepeats = 10,
  tryOnTestSet = FALSE
)


Try the babsim.hospital package in your browser

Any scripts or data that you put into this service are public.

babsim.hospital documentation built on May 30, 2022, 9:05 a.m.