This vignette will cover Monte Carlo realizations for modeling uncertainty using the rsyncrosim package within the SyncroSim software framework. For an overview of SyncroSim and rsyncrosim, as well as a basic usage tutorial for rsyncrosim, see the Introduction to rsyncrosim vignette.

SyncroSim Package: helloworldUncertainty

To demonstrate how to quantify model uncertainty using the rsyncrosim interface, we will need the helloworldUncertainty SyncroSim package. helloworldUncertainty was designed to be a simple package for introducing iterations to SyncroSim modeling workflows. The use of iterations allows for repeated simulations, known as "Monte Carlo realizations", in which each simulation independently samples from a distribution of values.

The package takes from the user 3 inputs, mMean, mSD, and b. For each iteration, a value m, representing the slope, is sampled from a normal distribution with mean of mMean and standard deviation of mSD. The b value represents the intercept. These input values are run through a linear model, y=mt+b, where t is time, and the y value is returned as output.

Infographic of helloworldUncertainty package{width=600px}

For more details on the different features of the helloworldUncertainty SyncroSim package, consult the SyncroSim Enhancing a Package: Representing Uncertainty tutorial.

Setup

Install SyncroSim

Before using rsyncrosim you will first need to download and install the SyncroSim software. Versions of SyncroSim exist for both Windows and Linux.

Note: this tutorial was developed using rsyncrosim version 2.0. To use rsyncrosim version 2.0 or greater, SyncroSim version 3.0 or greater is required.

Installing and loading R packages

You will need to install the rsyncrosim R package, either using CRAN or from the rsyncrosim GitHub repository. Versions of rsyncrosim are available for both Windows and Linux.

In a new R script, load the rsyncrosim package.

# Load R package for working with SyncroSim
library(rsyncrosim)

Connecting R to SyncroSim using session()

Finish setting up the R environment for the rsyncrosim workflow by creating a SyncroSim Session object. Use the session() function to connect R to your installed copy of the SyncroSim software.

mySession <- session("path/to/install_folder")      # Create a Session based SyncroSim install folder
mySession <- session()                              # Using default install folder (Windows only)
mySession                                           # Displays the Session object
# Results of this code shown for above
mySession <- session()                              # Using default install folder (Windows only)
mySession                                           # Displays the Session object

Use the version() function to ensure you are using the latest version of SyncroSim.

version(mySession)

Installing SyncroSim packages using installPackage()

Install helloworldUncertainty using the rynscrosim function installPackage(). This function takes a package name as input and then queries the SyncroSim package server for the specified package.

installedPackages <- packages()
if (is.element(
  "helloworldUncertainty", installedPackages$name)) uninstallPackage(
    "helloworldUncertainty")
# Install helloworldUncertainty
installPackage("helloworldUncertainty")

helloworldUncertainty should now be included in the package list when we call the packages() function:

# Get list of installed packages
packages()
installedPackages <- packages()
uncertaintyPkg <- installedPackages[installedPackages$name == "helloworldUncertainty", ]
row.names(uncertaintyPkg) <- NULL
uncertaintyPkg

Create a modeling workflow

When creating a new modeling workflow from scratch, we need to create objects of the following scopes:

For more information on these scopes, see the Introduction to rsyncrosim vignette.

Set up library, project, and scenario

if (file.exists("helloworldLibrary.ssim")){
  deleteLibrary("helloworldLibrary.ssim", force = TRUE)
}
# Create a new library
myLibrary <- ssimLibrary(name = "helloworldLibrary.ssim",
                         session = mySession,
                         packages = "helloworldUncertainty",
                         overwrite = TRUE)

# Open the default project
myProject = project(ssimObject = myLibrary, project = "Definitions")

# Create a new scenario (associated with the default project)
myScenario = scenario(ssimObject = myProject, scenario = "My first scenario")

View model inputs using datasheet()

View the datasheets associated with your new scenario using the datasheet() function from rsyncrosim.

# View all datasheets associated with a library, project, or scenario
datasheet(myScenario)

From the list of datasheets above, we can see that there are three datasheets specific to the helloworldUncertainty package. Let's view the contents of the Inputs datasheet as an R data frame.

# View the contents of the Inputs datasheet for the scenario
datasheet(myScenario, name = "helloworldUncertainty_InputDatasheet")

Configure model inputs using datasheet() and addRow()

Inputs Datasheet

Currently our input scenario datasheet is empty! We need to add some values to our Inputs datasheet (InputDatasheet) so we can run our model. First, assign the contents of the Inputs datasheet to a new data frame variable using datasheet(), then check the columns that need input values.

# Load the Inputs datasheet to an R data frame
myInputDataframe <- datasheet(myScenario,
                              name = "helloworldUncertainty_InputDatasheet")

# Check the columns of the input data frame
str(myInputDataframe)

The Inputs datasheet requires three values:

Add these values to a new data frame, then use the addRow() function from rsyncrosim to update the input data frame.

# Create input data and add it to the input data frame
myInputRow <- data.frame(mMean = 2, mSD = 4, b = 3)
myInputDataframe <- addRow(myInputDataframe, myInputRow)

# Check values
myInputDataframe

Finally, save the updated R data frame to a SyncroSim datasheet using saveDatasheet().

# Save input R data frame to a SyncroSim datasheet
saveDatasheet(ssimObject = myScenario, data = myInputDataframe,
              name = "helloworldUncertainty_InputDatasheet")

Pipeline Datasheet

Next, we need to add data to the Pipeline datasheet. The Pipeline datasheet determines which transformers the scenarios will run and in which order. Use the code below to assign the Pipeline datasheet to a new data frame variable and check the values required by the datasheet.

# Assign contents of the Pipeline datasheet to an R data frame
myPipeline <- datasheet(myScenario,
                        name = "core_Pipeline")

# Check the columns of the Pipeline data frame
str(myPipeline)

The Pipeline datasheet requires 2 values:

Below, we use the addRow() and saveDatasheet() functions to update the Pipeline datasheet with the transformer(s) we want to run and the order in which we want to run them. In this case, there is only a single transformer available from the helloworldUncertainty package, called "Hello World Uncertainty (R)", so we will add this transformer to the data frame and set the RunOrder to 1.

# Create pipeline data and add it to the pipeline data frame
myPipelineRow <- data.frame(StageNameId = "Hello World Uncertainty (R)", RunOrder = 1)
myPipeline <- addRow(myPipeline, myPipelineRow)

# Check values
myPipeline

# Save Pipeline R data frame to a SyncroSim Datasheet
saveDatasheet(ssimObject = myScenario, data = myPipeline,
              name = "core_Pipeline")

Run Control Datasheet

The Run Control datasheet provides information about how many time steps and iterations to use in the model. Here, we set the number of iterations, as well as the minimum and maximum time steps for our model. The number of iterations we set is equivalent to the number of Monte Carlo realizations, so the greater the number of iterations, the more accurate the range of output values we will obtain. Let's take a look at the columns that need input values.

# Load Run Control datasheet to a new R data frame
runSettings <- datasheet(myScenario, name = "helloworldUncertainty_RunControl")

# Check the columns of the Run Control data frame
str(runSettings)

The Run Control datasheet requires the following 3 columns:

Note: A fourth hidden column, MinimumIteration, also exists in the Run Control datasheet (default=1).

We'll add this information to an R data frame and then add it to the Run Control data frame using addRow(). For this example, we will use only five iterations.

# Create Run Control data and add it to the Run Control data frame
runSettingsRow <- data.frame(MaximumIteration = 5,
                             MinimumTimestep = 1,
                             MaximumTimestep = 10)
runSettings <- addRow(runSettings, runSettingsRow)

# Check values
runSettings

Finally, save the R data frame to a SyncroSim datasheet using saveDatasheet().

# Save Run Control R data frame to a SyncroSim datasheet
saveDatasheet(ssimObject = myScenario, 
              data = runSettings,
              name = "helloworldUncertainty_RunControl")

Run scenarios

Setting run parameters with run()

We will now run our scenario using the run() function in rsyncrosim.

If we have a large model and we want to parallelize the run using multiprocessing, we can modify the library-scoped "core_Multiprocessing" datasheet. Since we are using five iterations in our model, we will set the number of jobs to five so each multiprocessing core will run a single iteration.

# Load list of available library-scoped datasheets
datasheet(myLibrary)

# Load the library-scoped multiprocessing datasheet
multiprocess <- datasheet(myLibrary, name = "core_Multiprocessing")

# Check required inputs
str(multiprocess)

# Enable multiprocessing
multiprocess$EnableMultiprocessing <- TRUE

# Set maximum number of jobs to 5
multiprocess$MaximumJobs <- 5

# Save multiprocessing configuration
saveDatasheet(ssimObject = myLibrary, 
              data = multiprocess, 
              name = "core_Multiprocessing")

Now, when we run our scenario, it will use the desired multiprocessing configuration.

# Run the first scenario we created
myResultScenario <- run(myScenario)

Running the original scenario creates a new scenario object, known as a result scenario, that contains a read-only snapshot of the Inputs datasheets, as well as the Outputs datasheets filled with result data. We can view which scenarios are result scenarios using the scenario() function from rsyncrosim.

# Check that we have two scenarios, and one is a result scenario
scenario(myLibrary)

View results

Viewing results with datasheet()

The next step is to view the Outputs datasheets added to the result scenario when it was run. We can load the result tables using the datasheet() function. In this package, the datasheet containing the results is called "OutputDatasheet".

# Results of first scenario
resultsSummary <- datasheet(myResultScenario,
                            name = "helloworldUncertainty_OutputDatasheet")

# View results table
head(resultsSummary)

Plotting uncertainty in SyncroSim Studio

Now that we have run multiple iterations, we can visualize the uncertainty in our results. For this plot, we will plot the average y values over time, while showing the 20th and 80th percentiles.

To create a plot using the result scenario we just generated, open the current library in SyncroSim Studio and sync the updates from rsyncrosim using the "refresh" button in the upper toolbar (circled in red below). All the updates made in rsyncrosim should appear in SyncroSim Studio. We can now add the result scenario to the Results Viewer and create our plot. For more information on generating plots in SyncroSim Studio, see the SyncroSim tutorials on creating and customizing charts.

Using rsyncrosim with SyncroSim Studio to plot uncertainty{width=600px}



syncrosim/rsyncrosim documentation built on Oct. 18, 2024, 1:29 a.m.