This vignette will cover Monte Carlo realizations for modeling uncertainty using the rsyncrosim package within the SyncroSim software framework. For an overview of SyncroSim and rsyncrosim, as well as a basic usage tutorial for rsyncrosim, see the Introduction to rsyncrosim vignette.
helloworldUncertainty
To demonstrate how to quantify model uncertainty using the rsyncrosim interface, we will need the helloworldUncertainty SyncroSim package. helloworldUncertainty was designed to be a simple package for introducing iterations to SyncroSim modeling workflows. The use of iterations allows for repeated simulations, known as "Monte Carlo realizations", in which each simulation independently samples from a distribution of values.
The package takes three inputs from the user: mMean, mSD, and b. For each iteration, a value m, representing the slope, is sampled from a normal distribution with a mean of mMean and a standard deviation of mSD. The b value represents the intercept. These input values are run through a linear model, y = mt + b, where t is time, and the y value is returned as output.
For more details on the different features of the helloworldUncertainty SyncroSim package, consult the SyncroSim Enhancing a Package: Representing Uncertainty tutorial.
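The sampling scheme described above (one slope drawn per iteration, then applied across all time steps) can be sketched in base R. This is only an illustration of the idea, not the package's actual implementation: the helper name simulateLinearModel and the output column names Iteration, Timestep, and y are assumptions made for this sketch.

```r
# Illustrative sketch only; simulateLinearModel is a hypothetical helper,
# not part of rsyncrosim or the helloworldUncertainty package.
simulateLinearModel <- function(mMean, mSD, b, iterations, timesteps) {
  results <- lapply(seq_len(iterations), function(i) {
    # Draw one slope per iteration from a normal distribution
    m <- rnorm(1, mean = mMean, sd = mSD)
    # Apply the linear model y = m * t + b to every time step
    data.frame(Iteration = i, Timestep = timesteps, y = m * timesteps + b)
  })
  do.call(rbind, results)
}

set.seed(42)  # fixed seed so the sketch is reproducible
sketch <- simulateLinearModel(mMean = 2, mSD = 4, b = 3,
                              iterations = 5, timesteps = 1:10)
head(sketch)
```

Because the slope is drawn once per iteration, each iteration traces a straight line through time, and the spread between those lines is what represents the uncertainty.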
Before using rsyncrosim you will first need to download and install the SyncroSim software. Versions of SyncroSim exist for both Windows and Linux.
Note: this tutorial was developed using rsyncrosim version 2.0. To use rsyncrosim version 2.0 or greater, SyncroSim version 3.0 or greater is required.
You will need to install the rsyncrosim R package, either using CRAN or from the rsyncrosim GitHub repository. Versions of rsyncrosim are available for both Windows and Linux.
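As a minimal sketch of the two installation routes mentioned above: install.packages() is the standard CRAN route. The GitHub route assumes the remotes package and the repository path syncrosim/rsyncrosim, neither of which is stated in this vignette, so it is left commented out; check the repository page before relying on it.

```r
# Option 1: install the released version from CRAN
install.packages("rsyncrosim")

# Option 2 (assumption: repository path and use of the remotes package):
# install the development version from GitHub
# install.packages("remotes")
# remotes::install_github("syncrosim/rsyncrosim")
```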
In a new R script, load the rsyncrosim package.
# Load R package for working with SyncroSim
library(rsyncrosim)
session()
Finish setting up the R environment for the rsyncrosim workflow by creating a SyncroSim Session object. Use the session() function to connect R to your installed copy of the SyncroSim software.
# Create a Session based on a SyncroSim install folder
mySession <- session("path/to/install_folder")
# Or use the default install folder (Windows only)
mySession <- session()
# Display the Session object
mySession
Use the version() function to check which version of SyncroSim you are using and confirm that it is up to date.
version(mySession)
installPackage()
Install helloworldUncertainty using the rsyncrosim function installPackage(). This function takes a package name as input and then queries the SyncroSim package server for the specified package.
# Uninstall helloworldUncertainty first if it is already installed
installedPackages <- packages()
if (is.element("helloworldUncertainty", installedPackages$name)) {
  uninstallPackage("helloworldUncertainty")
}
# Install helloworldUncertainty
installPackage("helloworldUncertainty")
helloworldUncertainty should now be included in the package list when we call the packages() function:
# Get list of installed packages
packages()
# Show only the helloworldUncertainty entry
installedPackages <- packages()
uncertaintyPkg <- installedPackages[installedPackages$name == "helloworldUncertainty", ]
row.names(uncertaintyPkg) <- NULL
uncertaintyPkg
When creating a new modeling workflow from scratch, we need to create objects of the following scopes: library, project, and scenario.
For more information on these scopes, see the Introduction to rsyncrosim vignette.
# Delete any existing copy of the library
if (file.exists("helloworldLibrary.ssim")) {
  deleteLibrary("helloworldLibrary.ssim", force = TRUE)
}
# Create a new library
myLibrary <- ssimLibrary(name = "helloworldLibrary.ssim",
                         session = mySession,
                         packages = "helloworldUncertainty",
                         overwrite = TRUE)
# Open the default project
myProject <- project(ssimObject = myLibrary, project = "Definitions")
# Create a new scenario (associated with the default project)
myScenario <- scenario(ssimObject = myProject, scenario = "My first scenario")
datasheet()
View the datasheets associated with your new scenario using the datasheet() function from rsyncrosim.
# View all datasheets associated with a library, project, or scenario
datasheet(myScenario)
From the list of datasheets above, we can see that there are three datasheets specific to the helloworldUncertainty package. Let's view the contents of the Inputs datasheet as an R data frame.
# View the contents of the Inputs datasheet for the scenario
datasheet(myScenario, name = "helloworldUncertainty_InputDatasheet")
datasheet() and addRow()
Inputs Datasheet
Currently our input scenario datasheet is empty! We need to add some values to our Inputs datasheet (InputDatasheet) so we can run our model. First, assign the contents of the Inputs datasheet to a new data frame variable using datasheet(), then check the columns that need input values.
# Load the Inputs datasheet to an R data frame
myInputDataframe <- datasheet(myScenario, name = "helloworldUncertainty_InputDatasheet")
# Check the columns of the input data frame
str(myInputDataframe)
The Inputs datasheet requires three values:
mMean: the mean of the slope normal distribution.
mSD: the standard deviation of the slope normal distribution.
b: the intercept of the linear equation.
Add these values to a new data frame, then use the addRow() function from rsyncrosim to update the input data frame.
# Create input data and add it to the input data frame
myInputRow <- data.frame(mMean = 2, mSD = 4, b = 3)
myInputDataframe <- addRow(myInputDataframe, myInputRow)
# Check values
myInputDataframe
Finally, save the updated R data frame to a SyncroSim datasheet using saveDatasheet().
# Save input R data frame to a SyncroSim datasheet
saveDatasheet(ssimObject = myScenario,
              data = myInputDataframe,
              name = "helloworldUncertainty_InputDatasheet")
Pipeline Datasheet
Next, we need to add data to the Pipeline datasheet. The Pipeline datasheet determines which transformers the scenarios will run and in which order. Use the code below to assign the Pipeline datasheet to a new data frame variable and check the values required by the datasheet.
# Assign contents of the Pipeline datasheet to an R data frame
myPipeline <- datasheet(myScenario, name = "core_Pipeline")
# Check the columns of the Pipeline data frame
str(myPipeline)
The Pipeline datasheet requires 2 values:
StageNameId: the pipeline stage (transformer). This column is a factor that has only a single level: "Hello World Uncertainty (R)".
RunOrder: the numerical order in which the stages will be run.
Below, we use the addRow() and saveDatasheet() functions to update the Pipeline datasheet with the transformer(s) we want to run and the order in which we want to run them. In this case, there is only a single transformer available from the helloworldUncertainty package, called "Hello World Uncertainty (R)", so we will add this transformer to the data frame and set the RunOrder to 1.
# Create pipeline data and add it to the pipeline data frame
myPipelineRow <- data.frame(StageNameId = "Hello World Uncertainty (R)", RunOrder = 1)
myPipeline <- addRow(myPipeline, myPipelineRow)
# Check values
myPipeline
# Save Pipeline R data frame to a SyncroSim datasheet
saveDatasheet(ssimObject = myScenario, data = myPipeline, name = "core_Pipeline")
Run Control Datasheet
The Run Control datasheet provides information about how many time steps and iterations to use in the model. Here, we set the number of iterations, as well as the minimum and maximum time steps for our model. The number of iterations we set is equivalent to the number of Monte Carlo realizations, so the more iterations we run, the better the range of output values approximates the underlying distribution. Let's take a look at the columns that need input values.
# Load Run Control datasheet to a new R data frame
runSettings <- datasheet(myScenario, name = "helloworldUncertainty_RunControl")
# Check the columns of the Run Control data frame
str(runSettings)
The Run Control datasheet requires the following 3 columns:
MaximumIteration: total number of iterations to run the model for.
MinimumTimestep: the starting time point of the simulation.
MaximumTimestep: the end time point of the simulation.
Note: A fourth hidden column, MinimumIteration, also exists in the Run Control datasheet (default=1).
We'll add this information to an R data frame and then add it to the Run Control data frame using addRow(). For this example, we will use only five iterations.
# Create Run Control data and add it to the Run Control data frame
runSettingsRow <- data.frame(MaximumIteration = 5,
                             MinimumTimestep = 1,
                             MaximumTimestep = 10)
runSettings <- addRow(runSettings, runSettingsRow)
# Check values
runSettings
Finally, save the R data frame to a SyncroSim datasheet using saveDatasheet().
# Save Run Control R data frame to a SyncroSim datasheet
saveDatasheet(ssimObject = myScenario,
              data = runSettings,
              name = "helloworldUncertainty_RunControl")
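To illustrate why the number of iterations matters, the base R sketch below estimates the 80th percentile of the sampled slope many times over, once with 5 draws and once with 500 draws, and compares how much the estimates vary. It does not use SyncroSim at all. The helper estimateP80 and the choice of 200 repetitions are assumptions made for this illustration; the mean and standard deviation match the mMean and mSD values entered earlier.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical helper: estimate the 80th percentile of the slope from n draws
estimateP80 <- function(n, mMean = 2, mSD = 4) {
  quantile(rnorm(n, mean = mMean, sd = mSD), probs = 0.8, names = FALSE)
}

# Repeat the experiment 200 times for each iteration count and
# measure how much the percentile estimate varies between repeats
spreadFew  <- sd(replicate(200, estimateP80(5)))
spreadMany <- sd(replicate(200, estimateP80(500)))
c(fiveIterations = spreadFew, fiveHundredIterations = spreadMany)

# Exact 80th percentile of the slope distribution, for comparison
qnorm(0.8, mean = 2, sd = 4)
```

The estimate based on 500 draws varies far less between repeats than the one based on 5 draws, which is why five iterations are suitable for a quick demonstration but not for a final analysis.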
run()
We will now run our scenario using the run() function in rsyncrosim.
If we have a large model and we want to parallelize the run using multiprocessing, we can modify the library-scoped "core_Multiprocessing" datasheet. Since we are using five iterations in our model, we will set the number of jobs to five so that each job runs a single iteration.
# Load list of available library-scoped datasheets
datasheet(myLibrary)
# Load the library-scoped multiprocessing datasheet
multiprocess <- datasheet(myLibrary, name = "core_Multiprocessing")
# Check required inputs
str(multiprocess)
# Enable multiprocessing
multiprocess$EnableMultiprocessing <- TRUE
# Set maximum number of jobs to 5
multiprocess$MaximumJobs <- 5
# Save multiprocessing configuration
saveDatasheet(ssimObject = myLibrary,
              data = multiprocess,
              name = "core_Multiprocessing")
Now, when we run our scenario, it will use the desired multiprocessing configuration.
# Run the first scenario we created
myResultScenario <- run(myScenario)
Running the original scenario creates a new scenario object, known as a result scenario, that contains a read-only snapshot of the Inputs datasheets, as well as the Outputs datasheets filled with result data. We can view which scenarios are result scenarios using the scenario() function from rsyncrosim.
# Check that we have two scenarios, and one is a result scenario
scenario(myLibrary)
datasheet()
The next step is to view the Outputs datasheets added to the result scenario when it was run. We can load the result tables using the datasheet() function. In this package, the datasheet containing the results is called "OutputDatasheet".
# Results of first scenario
resultsSummary <- datasheet(myResultScenario, name = "helloworldUncertainty_OutputDatasheet")
# View results table
head(resultsSummary)
Now that we have run multiple iterations, we can visualize the uncertainty in our results. The plot will show the average y values over time, along with the 20th and 80th percentiles.
To create a plot using the result scenario we just generated, open the current library in SyncroSim Studio and sync the updates from rsyncrosim using the "refresh" button in the upper toolbar (circled in red below). All the updates made in rsyncrosim should appear in SyncroSim Studio. We can now add the result scenario to the Results Viewer and create our plot. For more information on generating plots in SyncroSim Studio, see the SyncroSim tutorials on creating and customizing charts.
[Figure: SyncroSim Studio upper toolbar with the "refresh" button circled in red]
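The same kind of summary (mean with 20th and 80th percentiles per time step) could also be computed directly in R. The sketch below is a rough illustration rather than the SyncroSim Studio workflow: it builds a stand-in data frame instead of reading the real results, and it assumes the output has columns named Iteration, Timestep, and y. With a real run, resultsSummary could take the place of standIn if its columns match.

```r
set.seed(7)  # fixed seed so the sketch is reproducible

# Stand-in for the results table (assumed columns: Iteration, Timestep, y)
standIn <- do.call(rbind, lapply(1:5, function(i) {
  m <- rnorm(1, mean = 2, sd = 4)  # one slope per iteration
  data.frame(Iteration = i, Timestep = 1:10, y = m * (1:10) + 3)
}))

# Summarize y across iterations for each time step
summaryByTime <- do.call(rbind, lapply(split(standIn, standIn$Timestep), function(d) {
  data.frame(Timestep = d$Timestep[1],
             mean = mean(d$y),
             p20 = quantile(d$y, probs = 0.2, names = FALSE),
             p80 = quantile(d$y, probs = 0.8, names = FALSE))
}))
summaryByTime

# Draw the mean with dashed percentile lines (only when run interactively)
if (interactive()) {
  plot(summaryByTime$Timestep, summaryByTime$mean, type = "l",
       ylim = range(summaryByTime$p20, summaryByTime$p80),
       xlab = "Timestep", ylab = "y")
  lines(summaryByTime$Timestep, summaryByTime$p20, lty = 2)
  lines(summaryByTime$Timestep, summaryByTime$p80, lty = 2)
}
```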