knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
source(system.file("extdata", "vignette-helpers.R", package = "ospsuite.reportingengine"))
library(ospsuite.reportingengine)
In Population workflows, the Demography task (plotDemography) reports the distributions of the requested demography parameters, accounting for the population workflow type (as defined by the variable PopulationWorkflowTypes).
The distributions are reported graphically using histograms or range plots depending on the workflow settings.
Observed data can be included in the demography plots for comparison.
As illustrated below, DataSource objects are required to include observed data in demography plots.
Code
# Define file paths for the dictionary and the observed data
dictionary <- system.file("extdata", "popDictionary.csv",
  package = "ospsuite.reportingengine"
)
dataFile <- system.file("extdata", "obsPop.csv",
  package = "ospsuite.reportingengine"
)
dataSource <- DataSource$new(
  dataFile = dataFile,
  metaDataFile = dictionary,
  caption = "Demography vignette example"
)
In this process, the dictionary provided as metaDataFile is essential.
Three columns of the dictionary are used by the workflow to map and convert the demography parameters.
The next two tables show the dictionary and the first rows of the observed data used in our example.
Dictionary
knitr::kable(readObservedDataFile(dictionary))
Observed Data
knitr::kable(head(readObservedDataFile(dataFile)))
Tip: to ensure that the unit defined in datasetUnit is appropriate, you can leverage the {ospsuite} package and copy/paste the value from R. The sample below copies the appropriate unit for BMI to the clipboard (format = 13 is needed to keep the text correctly encoded, as for the superscript 2 below). Note that writeClipboard() is only available on Windows.
writeClipboard(ospsuite::ospUnits$BMI$`kg/m²`, format = 13)
The code below creates the simulation sets used by the workflow, defining the simulation, population and data files used throughout this vignette.
In this example, the same observed data are included using the dataSource argument, and different data are selected using the dataSelection argument of the PopulationSimulationSet.
Code
# Define file paths for pkml simulation and populations
simulationFile <- system.file("extdata", "Aciclovir.pkml", package = "ospsuite")
adultFile <- system.file("extdata", "adults.csv", package = "ospsuite.reportingengine")
childrenFile <- system.file("extdata", "children.csv", package = "ospsuite.reportingengine")

adultSet <- PopulationSimulationSet$new(
  referencePopulation = TRUE,
  simulationSetName = "Adults",
  simulationFile = simulationFile,
  populationFile = adultFile,
  dataSource = dataSource,
  dataSelection = 'POP %in% "Adults"'
)
childrenSet <- PopulationSimulationSet$new(
  simulationSetName = "Children",
  simulationFile = simulationFile,
  populationFile = childrenFile,
  dataSource = dataSource,
  dataSelection = 'POP %in% "Children"'
)
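To make the role of dataSelection more concrete, the base R sketch below shows how a selection written as a character string can filter a data set. The data frame obsData, its values and the helper selectRows() are hypothetical and only illustrate the idea; how the package evaluates the expression internally is an assumption here.

```r
# Hypothetical observed data with a POP column, loosely mimicking the example
obsData <- data.frame(
  ID = 1:6,
  POP = c("Adults", "Adults", "Children", "Children", "Adults", "Children"),
  AGE = c(34, 51, 6, 9, 45, 12)
)

# Selection expressions written as character strings, as in dataSelection
adultSelection <- 'POP %in% "Adults"'
childrenSelection <- 'POP %in% "Children"'

# Hypothetical helper: evaluate the expression within the data
# to get a logical filter, then keep the matching rows
selectRows <- function(data, selection) {
  keep <- eval(parse(text = selection), envir = data)
  data[keep, , drop = FALSE]
}

adultData <- selectRows(obsData, adultSelection)
childrenData <- selectRows(obsData, childrenSelection)
```

With this kind of filter, one observed data file can serve several simulation sets, each picking its own rows.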
This section describes how to proceed with Parallel and Ratio Comparison workflows.
When creating a Parallel or a Ratio Comparison workflow, the plotDemography task includes 2 fields named xParameters and yParameters.
xParameters defines the demography parameters displayed in the x-axis of range plots. If no xParameters is defined (xParameters = NULL), histograms of the demography parameters defined in yParameters are plotted.
yParameters defines the demography parameters that will be displayed in the histograms if there is no xParameters, or in the y-axis of range plots if xParameters defines demography parameters.
Note that the parameters can be defined as a vector of character values but also as a named list, in which case the names are used as display names in the figures.
By default, xParameters = NULL, meaning that histograms are displayed.
Users can check the default xParameters of their workflow type using the function getDefaultDemographyXParameters().
Default yParameters are independent of the workflow type and are listed in the table below.
knitr::kable(data.frame(
  "Parameter Path" = as.character(ospsuite.reportingengine:::DemographyDefaultParameters),
  "Display Name" = names(ospsuite.reportingengine:::DemographyDefaultParameters),
  check.names = FALSE
))
To define different demography parameters in xParameters and yParameters, the functions setXParametersForDemographyPlot() and setYParametersForDemographyPlot() can be used, respectively.
The code below initializes the parallel comparison workflow and activates only the demography task.
Code
parallelWorkflow <- PopulationWorkflow$new(
  workflowType = PopulationWorkflowTypes$parallelComparison,
  simulationSets = c(adultSet, childrenSet),
  workflowFolder = "Parallel-Histograms",
  createWordReport = FALSE
)
inactivateWorkflowTasks(parallelWorkflow)
activateWorkflowTasks(parallelWorkflow, tasks = AllAvailableTasks$plotDemography)
To plot histograms of demography parameters, no demography parameters are defined in xParameters, as illustrated in the example below, which defines the parameters to be plotted in the histograms.
To define the parameters to be included in the report, either a character array or a named list can be provided.
In the case of a named list, the name is used as display name while the value is used to query the appropriate demography parameter.
In this example, the parameter Organism|Age is defined with Age as display name.
Users can also leverage the {ospsuite} package and look for the parameters in the enum StandardPath.
Code
# Query parameters defined in xParameters
getXParametersForDemographyPlot(workflow = parallelWorkflow)
# Query parameters defined in yParameters
getYParametersForDemographyPlot(workflow = parallelWorkflow)

displayedParameters <- list(
  Age = "Organism|Age",
  Weight = "Organism|Weight",
  Gender = "Gender"
)
# Define no parameters in xParameters to get histograms
setXParametersForDemographyPlot(workflow = parallelWorkflow, parameters = NULL)
# Define parameters to be displayed in histograms as yParameters
setYParametersForDemographyPlot(workflow = parallelWorkflow, parameters = displayedParameters)
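The difference between a named list and a plain character vector can be sketched in base R as follows. The helper getDisplayNames() is hypothetical and not part of the package; it only illustrates the convention described above, assuming that the path itself is used as a label when no name is given.

```r
# Hypothetical helper: pair each parameter path with its display name,
# falling back on the path when no name was provided
getDisplayNames <- function(parameters) {
  paths <- as.character(unlist(parameters))
  displayNames <- names(parameters)
  if (is.null(displayNames)) {
    displayNames <- rep("", length(paths))
  }
  missingName <- displayNames == ""
  displayNames[missingName] <- paths[missingName]
  data.frame(path = paths, displayName = displayNames, stringsAsFactors = FALSE)
}

# Named list: names are used as display names
namedParameters <- list(
  Age = "Organism|Age",
  Weight = "Organism|Weight",
  Gender = "Gender"
)
# Character vector: paths are used as is
unnamedParameters <- c("Organism|Age", "Organism|Weight", "Gender")

namedResult <- getDisplayNames(namedParameters)
unnamedResult <- getDisplayNames(unnamedParameters)
```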
Run the corresponding workflow:
parallelWorkflow$runWorkflow()
For Parallel and Ratio Comparison workflows, the demography histograms display the distributions for each simulation set along with its observed data if available (#17, #535).
For categorical parameters, it can be noted that R factor levels are leveraged to display the name of the categories as is. Thus, users need to ensure that the same names are used in both their population files and observed data.
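A quick way to check that category names agree is sketched below in base R. The vectors are made-up values, not the vignette data, and the check is independent of the package.

```r
# Hypothetical category values from a population file and from observed data
simulatedGender <- c("Male", "Female", "Female", "Male")
observedGender <- c("MALE", "Female", "Female")

# Categories present in the observed data but unknown to the population file
mismatched <- setdiff(unique(observedGender), unique(simulatedGender))

# With factors, a mismatched spelling ends up as a separate category
allLevels <- union(unique(simulatedGender), unique(observedGender))
observedCounts <- table(factor(observedGender, levels = allLevels))
```

Here "MALE" and "Male" would be shown as two distinct categories, which is why the spelling needs to be harmonized before running the workflow.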
cat(includeReportFromWorkflow(parallelWorkflow))
# Remove the workflow folders
unlink(parallelWorkflow$workflowFolder, recursive = TRUE)
The code below initializes the parallel comparison workflow and activates only the demography task.
Code
parallelWorkflow <- PopulationWorkflow$new(
  workflowType = PopulationWorkflowTypes$parallelComparison,
  simulationSets = c(adultSet, childrenSet),
  workflowFolder = "Parallel-Range-Plots",
  createWordReport = FALSE
)
inactivateWorkflowTasks(parallelWorkflow)
activateWorkflowTasks(parallelWorkflow, tasks = AllAvailableTasks$plotDemography)
To display range plots of demography parameters, the parameters for the x-axis are defined in xParameters, as illustrated in the example.
Note that categorical parameters defined as xParameters (Gender in this case) are displayed using box-whisker plots (#1088).
Code
xParameters <- list(
  Age = "Organism|Age",
  Gender = "Gender"
)
setXParametersForDemographyPlot(workflow = parallelWorkflow, parameters = xParameters)
setYParametersForDemographyPlot(workflow = parallelWorkflow, parameters = displayedParameters)
Run the corresponding workflow:
parallelWorkflow$runWorkflow()
For Parallel and Ratio Comparison workflows, the demography range plots display the distributions for each simulation set along with its observed data if available (#17, #535).
Likewise, categorical parameters (Gender in this case) are displayed along with their observed data, if available, using boxplots instead (#1088).
For the comparison between simulated and observed populations to be as relevant as possible, the same binning is applied to both.
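The idea of a shared binning can be sketched in base R as follows. All values and the bin edges are made up for illustration, and the use of the median as summary statistic is an assumption rather than the package's actual aggregation.

```r
# Hypothetical simulated and observed populations (values made up)
simulated <- data.frame(
  age = c(1, 2, 3, 5, 6, 7, 9, 10, 11),
  weight = c(10, 12, 14, 18, 21, 23, 28, 32, 35)
)
observed <- data.frame(
  age = c(1.5, 2.5, 6.5, 10.5),
  weight = c(11, 13, 22, 33)
)

# Hypothetical bin edges, shared by both populations
binEdges <- c(0, 4, 8, 12)
simulated$bin <- cut(simulated$age, breaks = binEdges, include.lowest = TRUE)
observed$bin <- cut(observed$age, breaks = binEdges, include.lowest = TRUE)

# Summarize each population within the same bins
simulatedMedian <- tapply(simulated$weight, simulated$bin, median)
observedMedian <- tapply(observed$weight, observed$bin, median)

comparison <- data.frame(
  bin = levels(simulated$bin),
  simulatedMedian = as.numeric(simulatedMedian),
  observedMedian = as.numeric(observedMedian)
)
```

Because both populations are cut with the same edges, each row of comparison refers to the same interval of the x parameter.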
cat(includeReportFromWorkflow(parallelWorkflow))
# Remove the workflow folders
unlink(parallelWorkflow$workflowFolder, recursive = TRUE)
This section describes how to proceed with Pediatric workflows.
When creating a Pediatric workflow, the plotDemography task includes 2 fields named xParameters and yParameters.
xParameters defines the demography parameters displayed in the x-axis of range plots. If no xParameters is defined (xParameters = NULL), histograms of the demography parameters defined in yParameters are plotted.
yParameters defines the demography parameters that will be displayed in the histograms if there is no xParameters, or in the y-axis of range plots if xParameters defines demography parameters.
Note that the parameters can be defined as a vector of character values but also as a named list, in which case the names are used as display names in the figures.
By default, xParameters = "Organism|Age", meaning that range plots of demography parameters vs Age are displayed.
Users can check the default xParameters of their workflow type using the function getDefaultDemographyXParameters().
Default yParameters are independent of the workflow type and are listed in the table below.
knitr::kable(data.frame(
  "Parameter Path" = as.character(ospsuite.reportingengine:::DemographyDefaultParameters),
  "Display Name" = names(ospsuite.reportingengine:::DemographyDefaultParameters),
  check.names = FALSE
))
To define different demography parameters in xParameters and yParameters, the functions setXParametersForDemographyPlot() and setYParametersForDemographyPlot() can be used, respectively.
The code below initializes the pediatric workflow and activates only the demography task.
Code
pediatricWorkflow <- PopulationWorkflow$new(
  workflowType = PopulationWorkflowTypes$pediatric,
  simulationSets = c(adultSet, childrenSet),
  workflowFolder = "Pediatric-Histograms",
  createWordReport = FALSE
)
inactivateWorkflowTasks(pediatricWorkflow)
activateWorkflowTasks(pediatricWorkflow, tasks = AllAvailableTasks$plotDemography)
To plot histograms of demography parameters, no demography parameters are defined in xParameters, as illustrated in the example below, which defines the parameters to be plotted in the histograms.
Code
displayedParameters <- list(
  Age = "Organism|Age",
  Weight = "Organism|Weight",
  Gender = "Gender"
)
setXParametersForDemographyPlot(workflow = pediatricWorkflow, parameters = NULL)
setYParametersForDemographyPlot(workflow = pediatricWorkflow, parameters = displayedParameters)
Run the corresponding workflow:
pediatricWorkflow$runWorkflow()
For Pediatric workflows, the demography histograms display the distributions for all simulation sets together (#17, #535). If observed data are available, the distributions of the observed data of all simulation sets are also displayed together.
For categorical parameters, it can be noted that R factor levels are leveraged to display the name of the categories as is. Thus, users need to ensure that the same names are used in both their population files and observed data.
cat(includeReportFromWorkflow(pediatricWorkflow))
# Remove the workflow folders
unlink(pediatricWorkflow$workflowFolder, recursive = TRUE)
The code below initializes the pediatric workflow and activates only the demography task.
Code
pediatricWorkflow <- PopulationWorkflow$new(
  workflowType = PopulationWorkflowTypes$pediatric,
  simulationSets = c(adultSet, childrenSet),
  workflowFolder = "Pediatric-Range-Plots",
  createWordReport = FALSE
)
inactivateWorkflowTasks(pediatricWorkflow)
activateWorkflowTasks(pediatricWorkflow, tasks = AllAvailableTasks$plotDemography)
To display range plots of demography parameters, the parameters for the x-axis are defined in xParameters, as illustrated in the example.
Note that categorical parameters defined as xParameters (Gender in this case) are displayed using box-whisker plots (#1088).
Code
xParameters <- list(
  Age = "Organism|Age",
  Gender = "Gender"
)
setXParametersForDemographyPlot(workflow = pediatricWorkflow, parameters = xParameters)
setYParametersForDemographyPlot(workflow = pediatricWorkflow, parameters = displayedParameters)
Run the corresponding workflow:
pediatricWorkflow$runWorkflow()
For Pediatric workflows, the demography range plots display the distributions for each simulation set along with its observed data if available (#17, #535). Additionally, range plots comparing the distribution of each simulation set to the reference simulation set are also produced.
Categorical parameters (Gender in this case) are also displayed along with the reference simulation set.
For the comparison between simulated and observed populations to be as relevant as possible, the same binning is used for both.
For the additional comparison with the reference simulation set, the global summary statistics of the reference are displayed as horizontal ranges.
However, it is possible to display binned ranges for the reference simulation set by turning off the referenceGlobalRange setting.
cat(includeReportFromWorkflow(pediatricWorkflow))
# Remove the workflow folders
unlink(pediatricWorkflow$workflowFolder, recursive = TRUE)
The demography plot task includes a few settings that allow users to fine-tune their demography plots.
These settings are available in the field workflow$plotDemography$settings.
Since categorical parameters are displayed using boxplots, they are not affected by the settings described below.
Currently, the default number of bins is defined by ospsuite.reportingengine:::AggregationConfiguration$bins.
However, users can update the number of bins, or even provide the bin edges, using the field bins as illustrated below.
workflow$plotDemography$settings$bins <- 5
In the example below, the binning is included within a parallel workflow:
Code
parallelWorkflow <- PopulationWorkflow$new(
  workflowType = PopulationWorkflowTypes$parallelComparison,
  simulationSets = c(adultSet, childrenSet),
  workflowFolder = "Parallel-Histograms",
  createWordReport = FALSE
)
inactivateWorkflowTasks(parallelWorkflow)
activateWorkflowTasks(parallelWorkflow, tasks = AllAvailableTasks$plotDemography)

# Only display histograms for Age, Weight and Gender
displayedParameters <- list(
  Age = "Organism|Age",
  Weight = "Organism|Weight",
  Gender = "Gender"
)
setYParametersForDemographyPlot(workflow = parallelWorkflow, parameters = displayedParameters)

# Set up 5 bins for all histograms
parallelWorkflow$plotDemography$settings$bins <- 5
parallelWorkflow$runWorkflow()
In the report displayed below, the histograms are displayed with 5 bins, except for categorical parameters, whose categories are displayed as is.
cat(includeReportFromWorkflow(parallelWorkflow))
# Remove the workflow folders
unlink(parallelWorkflow$workflowFolder, recursive = TRUE)
The same binning can also be applied to range plots as illustrated below:
Code
pediatricWorkflow <- PopulationWorkflow$new(
  workflowType = PopulationWorkflowTypes$pediatric,
  simulationSets = c(adultSet, childrenSet),
  workflowFolder = "Pediatric-Range",
  createWordReport = FALSE
)
inactivateWorkflowTasks(pediatricWorkflow)
activateWorkflowTasks(pediatricWorkflow, tasks = AllAvailableTasks$plotDemography)

displayedParameters <- list(
  Age = "Organism|Age",
  Weight = "Organism|Weight",
  Gender = "Gender"
)
setYParametersForDemographyPlot(workflow = pediatricWorkflow, parameters = displayedParameters)

# Set up 5 bins for all the range plots
pediatricWorkflow$plotDemography$settings$bins <- 5
pediatricWorkflow$runWorkflow()
cat(includeReportFromWorkflow(pediatricWorkflow))
# Remove the workflow folders
unlink(pediatricWorkflow$workflowFolder, recursive = TRUE)
Note that the algorithm that calculates the bin edges based on the number of bins aims at including the same number of points in each bin.
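As a rough illustration of equal-count binning, the base R sketch below derives bin edges from quantiles. This is only one way to obtain bins with the same number of points and is not necessarily the algorithm implemented in the package; the ages are made-up values.

```r
# Hypothetical, unevenly distributed ages
ages <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40, 60, 80, 100)
numberOfBins <- 5

# Quantile-based edges: each bin is meant to hold the same number of points
binEdges <- quantile(
  ages,
  probs = seq(0, 1, length.out = numberOfBins + 1),
  names = FALSE
)

# Assign each value to its bin and count the points per bin
bins <- cut(ages, breaks = binEdges, include.lowest = TRUE)
pointsPerBin <- table(bins)
```

With such edges, bins are narrow where data are dense and wide where data are sparse, unlike equally spaced bins.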
It is also possible to update default aggregation settings in order to update binning but also computed and displayed aggregation values using the following functions:
By default, demography range plots are displayed as stair step range plots.
It is however possible to connect the aggregated values to get a continuous range plot by setting the stairstep field to FALSE, as shown below.
workflow$plotDemography$settings$stairstep <- FALSE
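The difference between both displays can be sketched in base R by looking at the coordinates that would be drawn. The bin edges and medians are made up, and placing the continuous line at the bin centers is an assumption made for this illustration only.

```r
# Hypothetical bin edges and aggregated values (one median per bin)
binEdges <- c(0, 4, 8, 12)
medians <- c(15, 28, 41)

lowerEdges <- head(binEdges, -1)
upperEdges <- tail(binEdges, -1)

# Stair step: each value is held constant between the edges of its bin
stairstepCoordinates <- data.frame(
  x = c(rbind(lowerEdges, upperEdges)),
  y = rep(medians, each = 2)
)

# Continuous: one point per bin (here at the bin center), joined by a line
continuousCoordinates <- data.frame(
  x = (lowerEdges + upperEdges) / 2,
  y = medians
)
```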
In the example below, continuous range plots are included within a pediatric workflow:
Code
pediatricWorkflow <- PopulationWorkflow$new(
  workflowType = PopulationWorkflowTypes$pediatric,
  simulationSets = c(adultSet, childrenSet),
  workflowFolder = "Pediatric-Range",
  createWordReport = FALSE
)
inactivateWorkflowTasks(pediatricWorkflow)
activateWorkflowTasks(pediatricWorkflow, tasks = AllAvailableTasks$plotDemography)

displayedParameters <- list(
  Age = "Organism|Age",
  Weight = "Organism|Weight",
  Gender = "Gender"
)
setYParametersForDemographyPlot(workflow = pediatricWorkflow, parameters = displayedParameters)

# Set up continuous line between bins with stairstep settings being FALSE
pediatricWorkflow$plotDemography$settings$stairstep <- FALSE
pediatricWorkflow$runWorkflow()
cat(includeReportFromWorkflow(pediatricWorkflow))
# Remove the workflow folders
unlink(pediatricWorkflow$workflowFolder, recursive = TRUE)
By default, demography histogram plots dodge the bars to prevent bars masking one another.
It is however possible to remove this setting, in case the bars do not overlap, by setting the dodge field to FALSE, as shown below.
workflow$plotDemography$settings$dodge <- FALSE
In the example below, dodging is removed within a parallel comparison workflow:
Code
parallelWorkflow <- PopulationWorkflow$new(
  workflowType = PopulationWorkflowTypes$parallelComparison,
  simulationSets = c(adultSet, childrenSet),
  workflowFolder = "Parallel-Histograms",
  createWordReport = FALSE
)
inactivateWorkflowTasks(parallelWorkflow)
activateWorkflowTasks(parallelWorkflow, tasks = AllAvailableTasks$plotDemography)

displayedParameters <- list(
  Age = "Organism|Age",
  Weight = "Organism|Weight",
  Gender = "Gender"
)
setYParametersForDemographyPlot(workflow = parallelWorkflow, parameters = displayedParameters)

# Turn off dodging of histogram bars
parallelWorkflow$plotDemography$settings$dodge <- FALSE
parallelWorkflow$runWorkflow()
cat(includeReportFromWorkflow(parallelWorkflow))
# Remove the workflow folders
unlink(parallelWorkflow$workflowFolder, recursive = TRUE)
Pediatric workflow range plots compare simulation sets to the reference simulation set (#17).
In cases where the distributions of xParameters are comparable between populations, it may be more appropriate to display binned distributions of the yParameters of the reference population.
This can be achieved by setting the referenceGlobalRange field to FALSE, as illustrated below.
workflow$plotDemography$settings$referenceGlobalRange <- FALSE
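The contrast between a global range and binned ranges for the reference population can be sketched in base R as follows. The data, the bin edges and the choice of 5th percentile, median and 95th percentile as summary statistics are all assumptions made for illustration, not the package's actual defaults.

```r
# Hypothetical reference population (values made up)
referenceAge <- c(20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75)
referenceWeight <- c(62, 65, 70, 72, 75, 78, 80, 79, 77, 74, 71, 68)

# Hypothetical summary: 5th percentile, median and 95th percentile
summarizeRange <- function(values) {
  c(
    lower = quantile(values, 0.05, names = FALSE),
    median = median(values),
    upper = quantile(values, 0.95, names = FALSE)
  )
}

# Global range: one summary over the whole reference population,
# which would be drawn as horizontal ranges
globalRange <- summarizeRange(referenceWeight)

# Binned ranges: one summary per bin of the x parameter
binEdges <- c(20, 40, 60, 75)
ageBins <- cut(referenceAge, breaks = binEdges, include.lowest = TRUE)
binnedRange <- t(sapply(split(referenceWeight, ageBins), summarizeRange))
```

The global range ignores how the y parameter varies with the x parameter, while the binned ranges follow that variation bin by bin.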
In the example below, the reference simulation set is displayed with the same bins as the simulation set it is compared to:
Code
pediatricWorkflow <- PopulationWorkflow$new(
  workflowType = PopulationWorkflowTypes$pediatric,
  simulationSets = c(adultSet, childrenSet),
  workflowFolder = "Pediatric-Range",
  createWordReport = FALSE
)
inactivateWorkflowTasks(pediatricWorkflow)
activateWorkflowTasks(pediatricWorkflow, tasks = AllAvailableTasks$plotDemography)

displayedParameters <- list(
  Age = "Organism|Age",
  Weight = "Organism|Weight",
  Gender = "Gender"
)
setYParametersForDemographyPlot(workflow = pediatricWorkflow, parameters = displayedParameters)

# Turn off reference global range
pediatricWorkflow$plotDemography$settings$referenceGlobalRange <- FALSE
pediatricWorkflow$runWorkflow()
cat(includeReportFromWorkflow(pediatricWorkflow))
# Remove the workflow folders
unlink(pediatricWorkflow$workflowFolder, recursive = TRUE)