knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(dymiumCore)
This manual is for people who would like to put together a microsimulation model using ready-to-use modules from dymium-org/dymiumModules. The most that you are willing to change are the entity data and the model parameters that go into your microsimulation model. If this is you, then read on! :) Otherwise, please see the manual section and select the appropriate manual for you.
It is recommended that you set up a new RStudio project for your microsimulation project. See this page if you are using RStudio but are new to Projects. If you don't have RStudio, please download it from rstudio.com.
Every module in the dymium-org/dymiumModules repository relies on the modules package to work properly. Hence, make sure you have the package installed before using any module.
It is important for a microsimulation model to be modular for greater maintainability and extensibility. You can read more about this discussion in R Cassells, A Harding, S Kelly, 2006 and Eric Miller, 2019.
In dymium, a module is basically a group of related events (e.g. ageing, giving birth and dying belong to the demography module). In fact, when we say dymium is modular, we are actually referring to the events being modular. This will be explained further in the next section.
Before you download any module, I recommend that you first read its README page inside the repository. For example, if you want to know about the demography module or the events it has, please see the dymium-org/dymiumModules repository.
Once you are certain which module you want, use download_module() to download it into your project. For example, if you would like to download the demography module:
download_module("demography")
By default, this downloads the latest version of the demography module from dymium-org/dymiumModules into a newly created folder called modules at the root of your project. See ?download_module for more download options.
The term event is used to refer to a process that is set to occur within a microsimulation model. For example, if your microsimulation model simulates population dynamics, then you would have an ageing event, a birth event, a death event, and (optionally) a migration event. Ideally, events are self-contained, meaning they do not need other events to be present to work. However, some events do require other events. As an example, the divorce event from the demography module needs the separate event to work. This is because only people who are already separated can divorce, and separation is simulated by the separate event.
Inside the demography module you will find a number of R scripts. helpers.R, constants.R, logger.R and the tests folder are the files and folder that get created when a new module is created using use_module(). To learn more about them, see the developer manual. The other R scripts are event scripts, and they are the ones that you should import into your microsimulation model.
For example, if you would like to use the birth event, which can be found at your_project_folder/modules/demography/birth.R, in your microsimulation model, do the following:
event_demography_birth <- modules::use('modules/demography/birth.R')
The above chunk imports the birth event into your active R environment by assigning it to a variable called event_demography_birth. Any event that is imported using modules has the following fields exposed: run() and REQUIRED_MODELS.
event_demography_birth$run()
The run() function of every event takes the same four main arguments: x, model, target and time_step. Some run() functions may take additional arguments. Therefore, you should always check the README page of the module that the event belongs to.
event_demography_birth$REQUIRED_MODELS
REQUIRED_MODELS is a field that contains NULL if no models are required by the event, or a character vector of the required models otherwise. Every time run() is called, it checks whether the supplied model argument or the supplied world has all the required models. If not, an error is raised.
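The check described above can be pictured with a small base R sketch. This is not dymiumCore's implementation: check_required_models() and the model names are hypothetical, and the sketch only assumes that required models are matched by name.

```r
# Hypothetical helper, not part of dymiumCore: returns TRUE (invisibly) when
# every required model name is present, and raises an error otherwise.
check_required_models <- function(required, supplied) {
  # NULL means the event needs no models at all
  if (is.null(required)) {
    return(invisible(TRUE))
  }
  missing_models <- setdiff(required, names(supplied))
  if (length(missing_models) > 0) {
    stop(
      "Missing required models: ",
      paste(missing_models, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Toy stand-in for a named list of fitted models
supplied_models <- list(fertility = "a fitted model would go here")

check_required_models(NULL, supplied_models)         # passes: nothing required
check_required_models("fertility", supplied_models)  # passes: model is present

# A missing model is reported as an error
result <- tryCatch(
  check_required_models(c("fertility", "birth_multiplicity"), supplied_models),
  error = function(e) conditionMessage(e)
)
print(result)
```

In a real model you would not write this check yourself. It only illustrates why comparing REQUIRED_MODELS with the models you supply is a quick way to diagnose such an error.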
At some point, you may find yourself wanting to include some additional variables in a model, but those variables don't exist in the attribute data of your agents. Those additional variables may be generated during the simulation (such as the length of the current marriage, the age of the youngest child, or the number of divorces) or derived from an existing attribute (such as age in 5-year age groups). These can easily be included in the Transition object of your model.
For example, in the birth event of the demography module there is a TransitionBirth
class that is extended from TransitionClassification
.
# See https://github.com/dymium-org/dymiumModules/blob/d53fdb47680efc9a05e56f0c420c85907e73794e/modules/demography/birth.R#L147-L167
TransitionBirth <- R6Class(
  classname = "TransitionBirth",
  inherit = dymiumCore::TransitionClassification,
  public = list(
    filter = function(.data) {
      .data %>%
        helpers$FilterAgent$Ind$can_give_birth(.)# %>%
        # helpers$FilterAgent$Ind$is_in_relationship(.)
    },
    mutate = function(.data) {
      Ind <- private$.AgtObj
      .data %>%
        helpers$DeriveVar$IND$has_resident_children(x = ., Ind) %>%
        helpers$DeriveVar$IND$n_resident_children(x = ., Ind) %>%
        helpers$DeriveVar$IND$age_youngest_resident_child(x = ., Ind) %>%
        helpers$DeriveVar$IND$age5(x = ., Ind) %>%
        helpers$DeriveVar$IND$n_children(x = ., Ind) %>%
        helpers$DeriveVar$IND$mrs(x = ., Ind)
    }
  )
)
TransitionBirth
has two implemented methods that TransitionClassification
doesn't have which are filter(.data)
and mutate(.data)
.
The filter method defines the criteria that agents must meet to undergo the TransitionBirth transition, that is, to give birth. helpers$FilterAgent$Ind$can_give_birth(x), a function defined in helpers.R, keeps only those agents who are female and aged between RULES$GIVE_BIRTH$AGE_LOWER_BOUND and RULES$GIVE_BIRTH$AGE_UPPER_BOUND. The age rules can be found in constants.R and can be changed to suit your assumptions.
helpers$FilterAgent$Ind$can_give_birth = function(x) {
  get_individual_data(x) %>%
    .[sex == IND$SEX$FEMALE &
        age %between% c(RULES$GIVE_BIRTH$AGE_LOWER_BOUND,
                        RULES$GIVE_BIRTH$AGE_UPPER_BOUND)]
}
The mutate(.data) method, in turn, allows you to add additional variables to be used as predictors in your model. As you can see, quite a few additional variables get added to the agent data, such as the number of children and the age of the youngest resident child.
By default, the filtered data from filter() is passed to the mutate(.data) function and then to the simulate function of Transition. However, if you need to filter agents based on a derived variable, this order can be reversed by setting the mutate_first field to TRUE.
TransitionBirth <- R6Class(
  classname = "TransitionBirth",
  inherit = dymiumCore::TransitionClassification,
  public = list(
    filter = function(.data) {
      .data %>%
        helpers$FilterAgent$Ind$can_give_birth(.)# %>%
        # helpers$FilterAgent$Ind$is_in_relationship(.)
    },
    mutate = function(.data) {
      Ind <- private$.AgtObj
      .data %>%
        helpers$DeriveVar$IND$has_resident_children(x = ., Ind) %>%
        helpers$DeriveVar$IND$n_resident_children(x = ., Ind) %>%
        helpers$DeriveVar$IND$age_youngest_resident_child(x = ., Ind) %>%
        helpers$DeriveVar$IND$age5(x = ., Ind) %>%
        helpers$DeriveVar$IND$n_children(x = ., Ind) %>%
        helpers$DeriveVar$IND$mrs(x = ., Ind)
    },
    mutate_first = TRUE # mutate before filter !!!
  )
)
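To see why the order of the two steps matters, here is a minimal base R sketch. It does not use dymiumCore: prepare_agents(), add_age5() and keep_age5_4_to_8() are hypothetical stand-ins for the Transition machinery, a mutate method and a filter method, and the toy data frame replaces real agent data.

```r
# Hypothetical helper, not dymiumCore code: applies the two steps in the
# order selected by `mutate_first`.
prepare_agents <- function(.data, filter, mutate, mutate_first = FALSE) {
  if (mutate_first) {
    filter(mutate(.data))
  } else {
    mutate(filter(.data))
  }
}

# Toy agent data
agents <- data.frame(
  pid = 1:5,
  sex = c("female", "male", "female", "female", "male"),
  age = c(17, 30, 24, 41, 52)
)

# Derive a 5-year age group index, e.g. ages 20 to 24 fall in group 4
add_age5 <- function(.data) {
  .data$age5 <- .data$age %/% 5
  .data
}

# A filter that depends on the derived variable `age5`
keep_age5_4_to_8 <- function(.data) {
  stopifnot("age5" %in% names(.data))
  .data[.data$age5 >= 4 & .data$age5 <= 8, , drop = FALSE]
}

# Mutate first: age5 exists by the time the filter runs
selected <- prepare_agents(agents, keep_age5_4_to_8, add_age5, mutate_first = TRUE)
print(selected$pid)

# Filter first (the default order): the filter cannot find age5 yet
filter_first_result <- tryCatch(
  prepare_agents(agents, keep_age5_4_to_8, add_age5, mutate_first = FALSE),
  error = function(e) "age5 not available"
)
print(filter_first_result)
```

With the toy data, only the agents with pid 2, 3 and 4 fall in age groups 4 to 8, and the filter-first call fails because the derived column has not been created yet.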
Putting together a microsimulation model using dymium is very simple. Once you have constructed your world object with all the necessary entities and models for the event functions you have included, you just need to create a flow-control statement, such as a for-loop, that exits at some point.
In the example below, the simulation stops after the 10th iteration. Note that world has a $start_iter() method, which sets the simulation clock to the time step given by i before returning the world itself down the pipeline. The time from the simulation clock is used by many functions, such as add_history, Generic$log() and is_scheduled. Therefore, we recommend that you use the $start_iter() method to pass the world object down the pipeline in your microsimulation model setup.
for (i in 1:10) {
  world$start_iter(time_step = i, unit = "year") %>%
    event_1$run(.) %>%
    event_2$run(.)
}
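The loop above relies on every step receiving the world and handing it on to the next step. The following base R sketch mimics that contract with toy objects. Everything in it is hypothetical: new_toy_world(), event_growth and event_report are not dymiumCore objects, and nested calls are used in place of the pipe so that the sketch needs no extra packages.

```r
# Toy stand-in for a world object (hypothetical, not dymiumCore's World)
new_toy_world <- function() {
  world <- new.env()
  world$time <- 0L
  world$population <- 100L
  # Set the clock, then return the world so it can be handed to the next step
  world$start_iter <- function(time_step) {
    world$time <- time_step
    invisible(world)
  }
  world
}

# Toy events: each takes the world, modifies it, and returns it
event_growth <- list(run = function(x) {
  x$population <- x$population + 2L
  x
})
event_report <- list(run = function(x) {
  x$last_reported <- x$time  # reads the clock set by start_iter()
  x
})

world <- new_toy_world()
for (i in 1:10) {
  # Same order as: start_iter, then event_growth, then event_report
  event_report$run(event_growth$run(world$start_iter(time_step = i)))
}
print(c(world$time, world$population, world$last_reported))
```

Because the toy world is an environment, the events modify it in place, so the state after the loop reflects all ten iterations.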
Microsimulation models rely on random number generators to produce their simulation results, so the results of different runs are very unlikely to be identical. Hence, to validate the model or to conduct a sensitivity analysis, it is a good idea to run the same simulation setup multiple times and analyse the variance of the results.
To do that, here is a simple example using the future and furrr packages.
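library(future)
library(furrr)

# Start two background R sessions and register them as the future backend
cl <- makeClusterPSOCK(workers = 2)
future::plan(cluster, workers = cl)

# Run the same 10-iteration simulation four times, each from a fresh copy of the world
res <- furrr::future_map(1:4, ~ {
  world <- readRDS("path/to/world.rds")
  for (i in 1:10) {
    world$start_iter(time_step = i, unit = "year") %>%
      event_1$run(.) %>%
      event_2$run(.)
  }
  return(world)
})

# Shut the workers down when finished
parallel::stopCluster(cl)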
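Once the repeated runs have finished, the spread of an outcome of interest can be summarised across runs. The sketch below shows the idea with a toy stochastic simulation in base R. run_toy_simulation() and its birth and death probabilities are made up for illustration and stand in for a full dymium run. In practice you would extract the outcome from each world in res instead.

```r
# Toy stochastic simulation (hypothetical): each year the population changes
# by a random number of births and deaths.
run_toy_simulation <- function(seed, n_years = 10, start_population = 1000) {
  set.seed(seed)  # one seed per run keeps each run reproducible
  population <- start_population
  for (year in seq_len(n_years)) {
    births <- rbinom(1, size = population, prob = 0.02)
    deaths <- rbinom(1, size = population, prob = 0.01)
    population <- population + births - deaths
  }
  population
}

# Repeat the same setup 20 times, varying only the seed
seeds <- 1:20
final_populations <- vapply(seeds, run_toy_simulation, numeric(1))

# Summarise the spread of the final population across runs
summary_stats <- c(
  mean = mean(final_populations),
  sd = sd(final_populations),
  min = min(final_populations),
  max = max(final_populations)
)
print(summary_stats)
```

Fixing one seed per run is a design choice that makes any single run repeatable while still letting the runs differ from one another.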