update_sim_on_cluster: Framework for updating simulations on a cluster computing...

View source: R/update_sim_on_cluster.R

update_sim_on_cluster    R Documentation

Framework for updating simulations on a cluster computing system

Description

This function serves as a scaffold for updating a previously run simulation in parallel on a cluster computing system. Like run_on_cluster, it acts as a wrapper for the code in your simulation script, organizing the code into three sections: "first" (code that is run once at the start of the simulation, e.g. setting simulation levels), "main" (the simulation script, which is run repeatedly), and "last" (code to combine and summarize simulation results). This function interacts with cluster job scheduler software (e.g. Slurm or Oracle Grid Engine) to divide parallel tasks over cluster nodes. See https://avi-kenny.github.io/SimEngine/parallelization/ for an overview of how cluster parallelization works in SimEngine.
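The three-section structure can be pictured with a small base R sketch. This is an illustration only, not SimEngine's implementation: dispatch_section() is a hypothetical helper, and the sketch assumes the section name arrives in an environment variable called sim_run, as in the qsub commands in the Examples section.

```r
# Illustrative only: NOT SimEngine's internal code.
# dispatch_section() is a hypothetical helper that runs exactly one of
# three code sections, chosen by the value of an environment variable.
dispatch_section <- function(first, main, last,
                             section = Sys.getenv("sim_run")) {
  switch(
    section,
    first = first(),
    main  = main(),
    last  = last(),
    stop("sim_run must be 'first', 'main', or 'last'")
  )
}

# On a cluster the scheduler would set this variable (e.g. qsub -v sim_run='main').
Sys.setenv(sim_run = "main")

result <- dispatch_section(
  first = function() "set up levels",
  main  = function() "run one replicate",
  last  = function() "summarize results"
)
```

The same script is submitted three times, and each submission executes only the section named by the variable.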

Usage

update_sim_on_cluster(
  first,
  main,
  last,
  cluster_config,
  keep_errors = TRUE,
  keep_extra = FALSE
)

Arguments

first

Code to run before executing additional simulation replicates. For example, this could include altering the simulation levels or changing num_sim. This block of code, enclosed by curly braces, must first read in an existing simulation object and then make alterations to it. Global variables declared in this block will not be available when the 'main' and 'last' code blocks run.

main

Code that will run for every simulation replicate. This should be a block of code enclosed by curly braces that includes a call to update_sim. This code block will have access to the simulation object you read in the 'first' code block, but any changes made here to the simulation object will not be saved.

last

Code that will run after all additional simulation replicates have been run. This should be a block of code enclosed by curly braces that takes your simulation object (which at this point will contain both your old and new results) and does something with it, such as displaying your results on a graph.

cluster_config

A list of configuration options. You must specify either js (the job scheduler you are using) or tid_var (the name of the environment variable that your task ID is stored in). Run js_support() to see a list of job schedulers that are currently supported. You can optionally also specify dir, which is a character string representing a path to a directory; this directory will serve as your working directory and hold your simulation object, temporary SimEngine objects, and simulation results (this defaults to the working directory of the R script that contains your simulation code).
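As a rough sketch, the two ways of identifying the task ID might look as follows. Only js, tid_var, and dir are documented keys. The variable name SLURM_ARRAY_TASK_ID (Slurm's array task variable) and the directory path are assumptions chosen for illustration, and the last two lines show how a task ID can be read from an environment variable in base R rather than how SimEngine does it internally.

```r
# Option 1: name a supported job scheduler (value taken from the Examples).
config_js <- list(js = "ge")

# Option 2: name the environment variable that holds the task ID and,
# optionally, a working directory. Both values here are assumptions.
config_tid <- list(
  tid_var = "SLURM_ARRAY_TASK_ID",
  dir     = file.path(tempdir(), "sim_output")
)

# Normally the scheduler sets this variable; it is set by hand here so the
# sketch runs outside a cluster.
Sys.setenv(SLURM_ARRAY_TASK_ID = "3")

# Read the task ID from the variable named in tid_var.
task_id <- as.integer(Sys.getenv(config_tid$tid_var))
```

Either list would be passed as the cluster_config argument.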

keep_errors

logical (TRUE by default); if TRUE, do not try to re-run simulation reps that previously resulted in errors; if FALSE, attempt to run those reps again

keep_extra

logical (FALSE by default); if TRUE, keep previously run simulation reps even if they exceed the current num_sim in config or are from a level that has been dropped; if FALSE, drop excess reps (starting from the last rep for that particular simulation level)
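The bookkeeping behind an update can be sketched in base R. count_new_reps() is a hypothetical helper, not a SimEngine function. It assumes a single simulation variable, ignores errored reps (keep_errors), and treats surplus reps as needing no new work, so it only shows how many additional replicates a change in levels or num_sim would require.

```r
# Hypothetical helper, not part of SimEngine: counts the replicates still
# needed per level after the levels or num_sim have changed.
count_new_reps <- function(done, new_levels, num_sim) {
  sapply(as.character(new_levels), function(level) {
    # Reps already completed for this level (0 if the level is new).
    have <- if (level %in% names(done)) done[[level]] else 0
    # Never negative: surplus reps do not create new work.
    max(num_sim - have, 0)
  })
}

# Original run: levels n = 100 and n = 1000, 10 replicates each.
done <- c("100" = 10, "1000" = 10)

# The update adds the level n = 500 and keeps num_sim = 10.
needed <- count_new_reps(done, new_levels = c(100, 500, 1000), num_sim = 10)
total_new <- sum(needed)
```

Here total_new is 10, which is the number of array tasks requested for the 'main' step of the update in the Examples section.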

Examples

## Not run: 
# The following code creates, runs, and subsequently updates a toy simulation
# on a cluster computing environment. We include both the R code as well as
# sample BASH code for running the simulation using Oracle Grid Engine.

# This code is saved in a file called my_simulation.R
library(SimEngine)
run_on_cluster(

  first = {
    sim <- new_sim()
    create_data <- function(n) { rnorm(n) }
    sim %<>% set_script(function() {
      data <- create_data(L$n)
      return(mean(data))
    })
    sim %<>% set_levels(n=c(100,1000))
    sim %<>% set_config(num_sim=10)
  },

  main = {
    sim %<>% run()
  },

  last = {
    sim %<>% summarize()
  },

  cluster_config = list(js="ge")

)

# This code is saved in a file called run_sim.sh
# #!/bin/bash
# Rscript my_simulation.R

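# The following lines of code are run on the cluster head node. The job IDs
# passed to -hold_jid (101 and 102 here) are placeholders; replace them with
# the IDs that qsub reports for the preceding jobs.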
# qsub -v sim_run='first' run_sim.sh
# qsub -v sim_run='main' -t 1-20 -hold_jid 101 run_sim.sh
# qsub -v sim_run='last' -hold_jid 102 run_sim.sh

# This code is saved in a file called update_my_simulation.R. Note that it
# reads in the simulation object created above, which is saved in a file
# called "sim.rds".
library(SimEngine)
update_sim_on_cluster(

  first = {
    sim <- readRDS("sim.rds")
    sim %<>% set_levels(n = c(100,500,1000))
  },

  main = {
    sim %<>% update_sim()
  },

  last = {
    sim %<>% summarize()
  },

  cluster_config = list(js="ge")

)

# This code is saved in a file called update_sim.sh
# #!/bin/bash
# Rscript update_my_simulation.R

# The following lines of code are run on the cluster head node. Note that
# only 10 new replicates are run, since 20 of 30 simulation replicates were
# run in the original call to run_on_cluster.
# qsub -v sim_run='first' update_sim.sh
# qsub -v sim_run='main' -t 1-10 -hold_jid 104 update_sim.sh
# qsub -v sim_run='last' -hold_jid 105 update_sim.sh

## End(Not run)

Avi-Kenny/SimEngine documentation built on June 23, 2022, 11:09 a.m.