knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(openMalariaUtilities)
This document is still work in progress
OpenMalaria is a powerful tool for simulating and studying the epidemiology of malaria. This R package allows the use of OpenMalaria from R.
Make sure that SQLite
and OpenMalaria
are both installed and available on
the system. Please note: Even though the package should work on Windows, it is
developed and tested under Linux.
To install the package, run
devtools::install_github("SwissTPH/r-openMalariaUtilities")
Within this tutorial, we will create and run example scenarios. Each scenario needs to be defined in a XML file which serves as an input for OpenMalaria (OM). OpenMalariaUtilities (OMU) contains many functions which help to generate these input files. In this tutorial we will re-create the example from OpenMalaria itself.
A large part of OMU is dedicated at the generation of valid OM XML files. Before we start with the generation of these files, we will explain the folder layout of your projects.
In the folder hierarchy, your working or project directory is the top level. Within this folder, you can have one or more folders which correspond to the names of experiments. The simulations of these experiments are controlled via one or multiple R scripts, placed in the project directory. Please note: Most of the directories can be modified via function arguments, but for the sake of simplicity we using the defaults here.
MyProject ├── experiment_1 ├── experiment_2 ├── ... ├── script_1.R ├── script_2.R └── ...
Each of the experiment folders will have a similar layout and content.
MyProject/experiment_1 ├── cache/ ├── logs/ ├── outputs/ ├── scenarios/ ├── autoRegressionParameters.csv ├── densities.csv ├── base.xml └── scenario_44.xsd
You do not need to worry about the creation of the folders and files. OMU will create necessary folders and download required files for you.
The cache
folder contains R objects which are used by OMU to store reusable
information. This will speed up certain operations. Log files are stored in the
logs
folder (which will contain sub-folders for each step in the process) and
OpenMalaria's raw output will be outputs
. The XML files for each simulation
scenario are stored in the scenarios
folder.
In your R script, you will probably start with library(openMalariaUtilities)
.
Thus, the beginning could look like
library(openMalariaUtilities) ## Basic skeleton baseList <- list( ## Mandatory expName = "example", ## Mandatory OMVersion = 44L, ## Optional ## analysisNo = 1L, ## Mandatory demography = list(), monitoring = list(), interventions = list(), healthSystem = list(), entomology = list(), ## These are optional for OM ## parasiteGenetics = list(), ## pharmacology = list(), ## diagnostics = list(), model = list() )
For XML file generation, OMU uses a list as input (here: baseList
). This list
has a very similar structure
as the XML file required by OM but is easier to edit from R compared
to XML. The above snippet corresponds to the minimal skeleton of this input
list. expName
is the name of your experiment and OMVersion
the version of
OpenMalaria. Currently, only version 44 is supported. Also note, that the OM
version needs to be an integer.
Each of the six nested lists demography
, monitoring
, interventions
,
healthSystem
, entomology
and model
need to be filled for OM.
There are two ways to add the data to the demography
entry:
baseList[["demography"]] <- list(name = "Ifakara",
...)
defineDemography()
functionWe will use the second option in this tutorial. You can find various defineXY
functions which are each designed to add a specific entry to the list. However,
you always have the option to write parts or change specific entries manually
(See also extractList()
).
demo_data <- data.frame( poppercent = c( 3.474714994, 12.76004028, 14.52151394, 12.75565434, 10.83632374, 8.393312454, 7.001421452, 5.800587654, 5.102136612, 4.182561874, 3.339409351, 2.986112356, 2.555766582, 2.332763433, 1.77400255, 1.008525491, 0.74167341, 0.271863401, 0.161614642 ), upperbound = c( 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90 ) ) baseList <- defineDemography( baseList, name = "Ifakara", popSize = 4000L, maximumAgeYrs = 90, lowerbound = 0, poppercent = demo_data$poppercent, upperbound = demo_data$upperbound ) str(baseList, max.level = 2)
In the above snippet, we started by creating a data frame which contains the
demographic data. This in turn was used as input for defineDemography()
among
other parameters. The function will take care that the information is added at
the correct position and also check, that the correct data types are used.
The monitoring
section controls the output of OpenMalaria, which is generated
during the conducted
surveys.
baseList[["monitoring"]] <- list( name = "Quarterly Surveys", ## Mandatory, different from OM schema startDate = "2000-01-01", continuous = monitoringContinuousGen( period = 1, options = list( name = c("input EIR", "simulated EIR", "human infectiousness", "N_v0", "immunity h", "immunity Y", "new infections", "num transmitting humans", "ITN coverage", "GVI coverage", "alpha", "P_B", "P_C*P_D"), value = c("true", "true", "true", "true", "true", "true", "true", "true", "true", "true", "true", "false", "false") ) ), SurveyOptions = monitoringSurveyOptionsGen( options = list( name = c("nHost", "nPatent", "nUncomp", "nSevere", "nDirDeaths", "inputEIR", "simulatedEIR"), value = c("true", "true", "true", "true", "false", "true", "true") ) ) )
We started the monitoring definition by giving it a name and defining a start date. Here, this is mandatory because it allows us to use dates as input for OpenMalaria. It defines the starting point of the survey period.
For the options, two generator functions are used which take the name of the option and its value as input.
baseList[["monitoring"]] <- c( baseList[["monitoring"]], list( surveys = monitoringSurveyTimesGen( detectionLimit = 40, startDate = "2000-01-01", endDate = "2020-01-01", interval = "quarterly", simStart = "1980-01-01" ), ageGroup = surveyAgeGroupsGen( lowerbound = 0, upperbounds = c(5, 90) ) ) ) str(baseList[["monitoring"]], max.level = 1)
Similar to the age groups in the demography section, surveyAgeGroupsGen()
adds
the desired age groups for the surveys.
The definition of the survey times itself is done above. Please note, that you
really should use monitoringSurveyTimesGen()
as this function will make sure
your desired survey points are valid (and adjust them if necessary) and will
store them for later use in the cache.
We can inspect the cache to see if our survey times have been generated correctly.
listCache()
We can notice two things:
OMU adds one more year (here: 4 surveys) to the survey period. This is done in order to ensure that all deployments show their effect. The dates are adjusted so they match OM's internal time steps. For more details on this, see here.
Furthermore, the number of days or time steps is counted from the simulation
start simStart
, which is here the 1st January 1980. It is recommended to start
the simulation 20 - 50 years before the first survey.
In this section we will define actual measures which will be used throughout the simulation.
baseList[["interventions"]] <- list( name = "test", human = list( component = list( id = "GVI", name = "DDT test", GVI = list( decay = list( L = "0.5", "function" = "exponential" ), anophelesParams = list( mosquito = "gambiae_ss", propActive = "1", deterrency = list(value = "0.56"), preprandialKillingEffect = list(value = "0"), postprandialKillingEffect = list(value = "0.24") ) ) ), deployment = list( name = "DDT test", component = list(id = "GVI"), timed = list( deploy = list(coverage = "0.9", time = "10y") ) ) ) )
baseList[["healthSystem"]] <- list( ImmediateOutcomes = list( name = "Tanzania ACT", drugRegimen = list( firstLine = "ACT", inpatient = "QN", secondLine = "ACT" ), initialACR = list( ACT = list(value = 0.85), QN = list(value = 0.998), selfTreatment = list(value = 0.63) ), compliance = list( ACT = list(value = 0.9), selfTreatment = list(value = 0.85) ), nonCompliersEffective = list( ACT = list(value = 0), selfTreatment = list(value = 0) ), treatmentActions = list( ACT = list( name = "clear blood-stage infections", clearInfections = list( stage = "blood", timesteps = "1" ) ), QN = list( name = "clear blood-stage infections", clearInfections = list( stage = "blood", timesteps = "1" ) ) ), pSeekOfficialCareUncomplicated1 = list(value = 0.04), pSelfTreatUncomplicated = list(value = 0.01), pSeekOfficialCareUncomplicated2 = list(value = 0.04), pSeekOfficialCareSevere = list(value = 0.48) ), CFR = list( group = list(lowerbound = 0, value = 0.09189), group = list(lowerbound = 0.25, value = 0.0810811), group = list(lowerbound = 0.75, value = 0.0648649), group = list(lowerbound = 1.5, value = 0.0689189), group = list(lowerbound = 2.5, value = 0.0675676), group = list(lowerbound = 3.5, value = 0.0297297), group = list(lowerbound = 4.5, value = 0.0459459), group = list(lowerbound = 7.5, value = 0.0945946), group = list(lowerbound = 12.5, value = 0.1243243), group = list(lowerbound = 15, value = 0.1378378) ), ## Mandatory pSequelaeInpatient = list( interpolation = "none", group = list(lowerbound = "0.0", value = 0.0132), group = list(lowerbound = "5.0", value = 0.005) ) )
baseList[["entomology"]] <- list( mode = "dynamic", name = "Namawala", vector = list( anopheles = list( mosquito = "gambiae_ss", propInfected = 0.078, propInfectious = "0.021", seasonality = list( annualEIR = "16", input = "EIR", fourierSeries = list( EIRRotateAngle = "0", coeffic = list( a = "0.8968", b = "2.678" ), coeffic = list(a = "-0.4551", b = "2.599") ) ), mosq = list( minInfectedThreshold = "0.001", mosqRestDuration = list(value = "3"), extrinsicIncubationPeriod = list(value = "11"), mosqLaidEggsSameDayProportion = list(value = "0.313"), mosqSeekingDuration = list(value = "0.33"), mosqSurvivalFeedingCycleProbability = list(value = "0.623"), availability = list(distr = "const"), mosqProbBiting = list(mean = "0.95", variance = "0"), mosqProbFindRestSite = list(mean = "0.95", variance = "0"), mosqProbResting = list(mean = "0.99", variance = "0"), mosqProbOvipositing = list(value = "0.88"), mosqHumanBloodIndex = list(value = "0.939") ), nonHumanHosts = list( name = "unprotectedAnimals", mosqRelativeEntoAvailability = list(value = "1.0"), mosqProbBiting = list(value = "0.95"), mosqProbFindRestSite = list(value = "0.95"), mosqProbResting = list(value = "0.99") ) ), nonHumanHosts = list(name = "unprotectedAnimals", number = "1.0") ) )
baseList[["model"]] <- list( ModelOptions = list( option = list(name = "LOGNORMAL_MASS_ACTION", value = "true"), option = list(name = "NO_PRE_ERYTHROCYTIC", value = "true"), option = list(name = "INNATE_MAX_DENS", value = "false"), option = list(name = "INDIRECT_MORTALITY_FIX", value = "false"), option = list(name = "NON_MALARIA_FEVERS", value = "true") ), clinical = list( healthSystemMemory = "6", NonMalariaFevers = list( incidence = list( group = list(lowerbound = "0", value = "0.322769924518357"), group = list(lowerbound = "0", value = "0.308520194304172"), group = list(lowerbound = "1", value = "0.279441774808493"), group = list(lowerbound = "2", value = "0.250431781111273"), group = list(lowerbound = "3", value = "0.223285859756841"), group = list(lowerbound = "4", value = "0.199298352451799"), group = list(lowerbound = "5", value = "0.179376872365614"), group = list(lowerbound = "6", value = "0.163623659390782"), group = list(lowerbound = "7", value = "0.152227726923469"), group = list(lowerbound = "8", value = "0.145022785567758"), group = list(lowerbound = "9", value = "0.141493087461765"), group = list(lowerbound = "10", value = "0.140473293219353"), group = list(lowerbound = "11", value = "0.141109775159515"), group = list(lowerbound = "12", value = "0.142644475217328"), group = list(lowerbound = "13", value = "0.144335079395766"), group = list(lowerbound = "14", value = "0.145964032924869"), group = list(lowerbound = "15", value = "0.147708915135714"), group = list(lowerbound = "16", value = "0.149731543445568"), group = list(lowerbound = "17", value = "0.151887428568276"), group = list(lowerbound = "18", value = "0.154060663485195"), group = list(lowerbound = "19", value = "0.156179169710494"), group = list(lowerbound = "20", value = "0.158135015380583"), group = list(lowerbound = "21", value = "0.159704766482219"), group = list(lowerbound = "22", value = "0.160807788387655"), group = list(lowerbound = "23", value = "0.161427976448279"), group = list(lowerbound = "24", value = "0.161620429119137"), group = list(lowerbound = "25", value = "0.16144021875986"), group = list(lowerbound = "26", value = "0.160943264630612"), group = list(lowerbound = "27", value = "0.160217573697398"), group = list(lowerbound = "28", value = "0.159422614374451"), group = list(lowerbound = "29", value = "0.158542519631641"), group = list(lowerbound = "30", value = "0.157501217628248"), group = list(lowerbound = "31", value = "0.156175160594841"), group = list(lowerbound = "32", value = "0.154402302191411"), group = list(lowerbound = "33", value = "0.152102040636481"), group = list(lowerbound = "34", value = "0.14921450014676"), group = list(lowerbound = "35", value = "0.145714433541659"), group = list(lowerbound = "36", value = "0.141800502067518"), group = list(lowerbound = "37", value = "0.137916853907569"), group = list(lowerbound = "38", value = "0.134503529382102"), group = list(lowerbound = "39", value = "0.131746276580642"), group = list(lowerbound = "40", value = "0.12969902537497"), group = list(lowerbound = "41", value = "0.128398077347679"), group = list(lowerbound = "42", value = "0.127864136551891"), group = list(lowerbound = "43", value = "0.12804497197004"), group = list(lowerbound = "44", value = "0.128894055047661"), group = list(lowerbound = "45", value = "0.130350838992718"), group = list(lowerbound = "46", value = "0.132286605622701"), group = list(lowerbound = "47", value = "0.134599921072495"), group = list(lowerbound = "48", value = "0.137212726976988"), group = list(lowerbound = "49", value = "0.140035253913284"), group = list(lowerbound = "50", value = "0.142934573453621"), group = list(lowerbound = "51", value = "0.145830221511879"), group = list(lowerbound = "52", value = "0.148674810561069"), group = list(lowerbound = "53", value = "0.151497963594518"), group = list(lowerbound = "54", value = "0.15438856687865"), group = list(lowerbound = "55", value = "0.157403790093505"), group = list(lowerbound = "56", value = "0.16059513222516"), group = list(lowerbound = "57", value = "0.16402433342886"), group = list(lowerbound = "58", value = "0.16770481415944"), group = list(lowerbound = "59", value = "0.171626873047865"), group = list(lowerbound = "60", value = "0.175748327054247"), group = list(lowerbound = "61", value = "0.180030857856799"), group = list(lowerbound = "62", value = "0.184411365583771"), group = list(lowerbound = "63", value = "0.188816421789366"), group = list(lowerbound = "64", value = "0.19316997803338"), group = list(lowerbound = "65", value = "0.197435603275487"), group = list(lowerbound = "66", value = "0.201578808813379"), group = list(lowerbound = "67", value = "0.205556806881398"), group = list(lowerbound = "68", value = "0.209307183457343"), group = list(lowerbound = "69", value = "0.212783260344084"), group = list(lowerbound = "70", value = "0.215944154621391"), group = list(lowerbound = "71", value = "0.218749275266548"), group = list(lowerbound = "72", value = "0.221187990639016"), group = list(lowerbound = "73", value = "0.223361260399378"), group = list(lowerbound = "74", value = "0.225363436789592"), group = list(lowerbound = "75", value = "0.227254280093211"), group = list(lowerbound = "76", value = "0.229084576349576"), group = list(lowerbound = "77", value = "0.230891971097789"), group = list(lowerbound = "78", value = "0.232690225166173"), group = list(lowerbound = "79", value = "0.234484973338876"), group = list(lowerbound = "80", value = "0.236276361586796"), group = list(lowerbound = "81", value = "0.238064394629696"), group = list(lowerbound = "82", value = "0.239849077182917"), group = list(lowerbound = "83", value = "0.241630413957381"), group = list(lowerbound = "84", value = "0.243408409659591"), group = list(lowerbound = "85", value = "0.245183068991633"), group = list(lowerbound = "86", value = "0.246954396651183"), group = list(lowerbound = "87", value = "0.248722397331501"), group = list(lowerbound = "88", value = "0.250487075721441"), group = list(lowerbound = "89", value = "0.252248436505447"), group = list(lowerbound = "90", value = "0.253127874257909") ) ) ), human = list( availabilityToMosquitoes = list( group = list(lowerbound = "0", value = "0.7076"), group = list(lowerbound = "0", value = "0.8538"), group = list(lowerbound = "5", value = "1.0"), group = list(lowerbound = "5", value = "1.0") ) ), parameters = list( interval = "5", iseed = "1", latentp = "3", parameter = list(include = "false", name = "'-ln(1-Sinf)'", number = "1", value = "0.050736"), parameter = list(include = "false", name = "Estar", number = "2", value = "0.03247"), parameter = list(include = "false", name = "Simm", number = "3", value = "0.138161050830301"), parameter = list(include = "false", name = "Xstar_p", number = "4", value = "1514.385853233699891"), parameter = list(include = "false", name = "gamma_p", number = "5", value = "2.03692533424484"), parameter = list(include = "false", name = "sigma2i", number = "6", value = "10.173598698525799"), parameter = list(include = "false", name = "CumulativeYstar", number = "7", value = "35158523.31132510304451"), parameter = list(include = "false", name = "CumulativeHstar", number = "8", value = "97.334652723897705"), parameter = list(include = "false", name = "'-ln(1-alpha_m)'", number = "9", value = "2.33031045876193"), parameter = list(include = "false", name = "decay_m", number = "10", value = "2.53106547375805"), parameter = list(include = "false", name = "sigma2_0", number = "11", value = "0.655747311168152"), parameter = list(include = "false", name = "Xstar_v", number = "12", value = "0.916181104713054"), parameter = list(include = "false", name = "Ystar2", number = "13", value = "6502.26335600001039"), parameter = list(include = "false", name = "alpha", number = "14", value = "142601.912520000012591"), parameter = list(include = "false", name = "Density bias (non Garki)", number = "15", value = "0.177378570987455"), parameter = list(include = "false", name = " sigma2 ", number = "16", value = "0.05"), parameter = list(include = "false", name = "log oddsr CF community", number = "17", value = "0.736202"), parameter = list(include = "false", name = "Indirect risk cofactor", number = "18", value = "0.018777338"), parameter = list(include = "false", name = "Non-malaria infant mortality", number = "19", value = "49.539046599999999"), parameter = list(include = "false", name = "Density bias (Garki)", number = "20", value = "4.79610772546704"), parameter = list(include = "false", name = "Severe Malaria Threshhold", number = "21", value = "784455.599999999976717"), parameter = list(include = "false", name = "Immunity Penalty", number = "22", value = "1"), parameter = list(include = "false", name = "Immune effector decay", number = "23", value = "0"), parameter = list(include = "false", name = "comorbidity intercept", number = "24", value = "0.0968"), parameter = list(include = "false", name = "Ystar half life", number = "25", value = "0.275437402"), parameter = list(include = "false", name = "Ystar1", number = "26", value = "0.596539864"), parameter = list(include = "false", name = "Asexual immunity decay", number = "27", value = "0"), parameter = list(include = "false", name = "Ystar0", number = "28", value = "296.302437899999973"), parameter = list(include = "false", name = "Idete multiplier", number = "29", value = "2.797523626"), parameter = list(include = "false", name = "critical age for comorbidity", number = "30", value = "0.117383") ) )
## Setup dirs setupDirs(experimentName = "exp_test", replace = TRUE) ## Copy necessary OpenMalaria files. setupOM()
setupDirs()
creates the experiment folder given as an argument in the project
directory. This also includes all the sub-folders which were mentioned in the
introduction .
We can verify this by checking the layout of the experiment directory.
list.dirs("exp_test")
setupOM()
will make sure that all required files for OpenMalaria to run are
in place. This includes the correct .xsd
schema file (according to the
specified OM version) as well as further supplementary files.
list.files("exp_test")
As the name implies, createBaseXml()
will translate our input list to a XML
document and write it to disk.
## Create base XML file createBaseXml(baseList, replace = TRUE) readLines(file("exp_test/exp_test_base.xml", "r"), n = 10)
In order to run the simulation, we can simply call runSimulations()
.
## Run simulations runSimulations() #> [1] "Running scenario [1/1]"
By default, the output in the R console is fairly minimal. More detailed logs
can be found in the logs/simulation
folder.
list.files("exp_test/logs/simulation") #> [1] "exp_test_base.log"
OpenMalaria generates two output
files
which are both placed in the outputs
folder:
*_cts.txt
fileread.table("exp_test/outputs/exp_test_base_cts.txt", header = TRUE, sep = "\t", skip = 1, nrows = 10) #> timestep input.EIR simulated.EIR human.infectiousness N_v0.gambiae_ss. #> 1 0 0.00995606 0.00995049 0.0179202 30147.4 #> 2 1 0.01958060 0.15390600 0.0175667 39181.0 #> 3 2 0.03850920 0.24658900 0.0171114 46368.5 #> 4 3 0.07460740 0.37191900 0.0167741 49838.5 #> 5 4 0.14030700 0.52390000 0.0164523 48641.0 #> 6 5 0.25256000 0.68747000 0.0165751 43197.7 #> 7 6 0.42953400 0.84148200 0.0188351 35064.0 #> 8 7 0.68226100 0.96738200 0.0231314 26185.7 #> 9 8 1.00214000 1.07333000 0.0296558 18146.4 #> 10 9 1.35047000 1.18202000 0.0377866 11790.1 #> immunity.h immunity.Y new.infections num.transmitting.humans ITN.coverage #> 1 256.846 11899700 0 2613 0 #> 2 256.817 11900500 400 2590 0 #> 3 256.856 11895200 692 2572 0 #> 4 256.934 11896300 1054 2549 0 #> 5 257.109 11897600 1469 2526 0 #> 6 257.299 11899200 1795 2507 0 #> 7 257.791 11901800 2250 2636 0 #> 8 258.367 11903800 2643 2798 0 #> 9 259.013 11904600 2987 3019 0 #> 10 259.681 11909900 3256 3280 0 #> GVI.coverage alpha_i.gambiae_ss. #> 1 0 0.000206347 #> 2 0 0.000206342 #> 3 0 0.000206349 #> 4 0 0.000206332 #> 5 0 0.000206344 #> 6 0 0.000206344 #> 7 0 0.000206347 #> 8 0 0.000206338 #> 9 0 0.000206330 #> 10 0 0.000206329
*_out.txt
fileread.table("exp_test/outputs/exp_test_base_out.txt", header = FALSE, sep = "\t", nrows = 10) #> V1 V2 V3 V4 #> 1 1 1 0 6.07000e+02 #> 2 1 2 0 3.39300e+03 #> 3 1 1 3 1.37000e+02 #> 4 1 2 3 1.15500e+03 #> 5 1 1 14 3.35770e+04 #> 6 1 2 14 2.94240e+04 #> 7 1 1 15 7.11000e+02 #> 8 1 2 15 4.80000e+01 #> 9 1 0 35 2.19321e-01 #> 10 1 0 36 2.87284e-01
The *_out.txt
file is usually the file you are interested in, as it contains
the measurement data for each survey you defined in the input list. See
here for more
information.
OpenMalaria's output files use numerical indiccators in the second column which can make the interpretation of the values difficult for new users. Thus OMU provides a helper function which reads an output file and assigns column names and translates certain values (e.g. the survey dates).
outfile <- readOutputFile("exp_test/outputs/exp_test_base_out.txt") head(outfile) #> survey_date third_dimension measure value #> 1: 1999-12-27 1 nHost 607 #> 2: 1999-12-27 2 nHost 3393 #> 3: 1999-12-27 1 nPatent 137 #> 4: 1999-12-27 2 nPatent 1155 #> 5: 1999-12-27 1 nUncomp 33577 #> 6: 1999-12-27 2 nUncomp 29424
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.