View source: R/run_on_cluster.R
run_on_cluster | R Documentation |
This function allows for simulations to be run in parallel on a
cluster computing system (CCS). It acts as a wrapper for the code in your
simulation script, organizing the code into three sections, labeled
"first" (code that is run once at the start of the simulation, e.g.
setting simulation levels), "main" (running the simulation script via
run()), and "last" (usually code to process or summarize
simulation results). This function interacts with cluster job scheduler
software (e.g. Slurm or Oracle Grid Engine) to divide parallel tasks over
cluster nodes. See
https://avi-kenny.github.io/SimEngine/parallelization/ for a
detailed overview of how CCS parallelization works in SimEngine.
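The three-section pattern described above can be illustrated with a small base R sketch. This is a conceptual illustration only, not SimEngine's actual implementation: run_phase() and the file layout are hypothetical, the scheduler is imitated with Sys.setenv(), and the task ID is read from SGE_TASK_ID (the variable Grid Engine sets for array jobs). The sim_run variable name is taken from the qsub commands in the example section.

```r
# Conceptual sketch: one script, invoked three ways, dispatching on sim_run.
# run_phase() and the on-disk layout are hypothetical, for illustration only.
work_dir <- tempfile("sim_")
dir.create(work_dir)

run_phase <- function(work_dir) {
  phase <- Sys.getenv("sim_run")
  state_path <- file.path(work_dir, "state.rds")
  if (phase == "first") {
    # Runs once: save everything that later jobs will need.
    saveRDS(list(n_levels = c(100, 1000)), state_path)
  } else if (phase == "main") {
    # Runs once per task: the task ID selects which piece of work to do.
    task_id <- as.integer(Sys.getenv("SGE_TASK_ID"))
    state <- readRDS(state_path)
    n <- state$n_levels[(task_id - 1) %% length(state$n_levels) + 1]
    result <- list(task_id = task_id, n = n, x = mean(rnorm(n)))
    saveRDS(result, file.path(work_dir, sprintf("result_%03d.rds", task_id)))
  } else if (phase == "last") {
    # Runs once at the end: gather the per-task results into one data frame.
    files <- list.files(work_dir, pattern = "^result_", full.names = TRUE)
    rows <- lapply(files, function(f) as.data.frame(readRDS(f)))
    saveRDS(do.call(rbind, rows), file.path(work_dir, "summary.rds"))
  } else {
    stop("sim_run must be 'first', 'main', or 'last'")
  }
  invisible(phase)
}

# Imitate the three scheduler submissions in sequence.
Sys.setenv(sim_run = "first")
run_phase(work_dir)
for (id in 1:20) {
  Sys.setenv(sim_run = "main", SGE_TASK_ID = as.character(id))
  run_phase(work_dir)
}
Sys.setenv(sim_run = "last")
run_phase(work_dir)

results <- readRDS(file.path(work_dir, "summary.rds"))
```

On a real cluster the 20 "main" tasks would run as separate jobs, which is why this sketch passes state between phases through files rather than through variables in memory.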
run_on_cluster(first, main, last, cluster_config)
first |
Code to run at the start of a simulation. This should be a block of code enclosed by curly braces that creates a simulation object. Put everything you need in the simulation object, since global variables declared in this block will not be available when the 'main' and 'last' code blocks run. |
main |
Code that will run for every simulation replicate. This should be
a block of code enclosed by curly braces, and will almost always
contain only a single call to the run function. |
last |
Code that will run after all simulation replicates have been run. This should be a block of code enclosed by curly braces that takes your simulation object (which at this point will contain your results) and does something with it, such as displaying your results on a graph. |
cluster_config |
A list of configuration options. You must specify
either |
## Not run:
# The following is a toy simulation that could be run on a cluster computing
# environment. It runs 10 replicates of 2 simulation levels as 20 separate
# cluster jobs, and then summarizes the results. This function is designed to
# be used in conjunction with cluster job scheduler software (e.g. Slurm or
# Oracle Grid Engine). We include both the R code as well as sample BASH code
# for running the simulation using Oracle Grid Engine.
# This code is saved in a file called my_simulation.R
library(SimEngine)
run_on_cluster(
  first = {
    sim <- new_sim()
    create_data <- function(n) { rnorm(n) }
    sim %<>% set_script(function() {
      data <- create_data(L$n)
      return(list("x"=mean(data)))
    })
    sim %<>% set_levels(n=c(100,1000))
    sim %<>% set_config(num_sim=10)
  },
  main = {
    sim %<>% run()
  },
  last = {
    sim %>% summarize()
  },
  cluster_config = list(js="ge")
)
# This code is saved in a file called run_sim.sh
# #!/bin/bash
# Rscript my_simulation.R
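# The following lines of code are run on the cluster head node. Here 101 and
# 102 stand for the job IDs that the scheduler assigned to the 'first' and
# 'main' jobs, so that each step waits for the previous one to finish.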
# qsub -v sim_run='first' run_sim.sh
# qsub -v sim_run='main' -t 1-20 -hold_jid 101 run_sim.sh
# qsub -v sim_run='last' -hold_jid 102 run_sim.sh
## End(Not run)
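The example above uses Oracle Grid Engine commands. As a hedged sketch of a Slurm equivalent, the configuration and submission commands might look as follows. The scheduler code "slurm" is an assumption that should be checked against the package documentation. The sbatch options used (--export, --array, --dependency) are standard Slurm options, and the job IDs 101 and 102 are placeholders.

```r
# Assumption: "slurm" is the scheduler code SimEngine expects for Slurm;
# verify this against the package documentation before use.
cluster_config <- list(js = "slurm")

# Pass `cluster_config = cluster_config` to run_on_cluster() in my_simulation.R,
# then submit from the cluster head node (shell commands, shown as comments):
# sbatch --export=ALL,sim_run=first run_sim.sh
# sbatch --export=ALL,sim_run=main --array=1-20 --dependency=afterok:101 run_sim.sh
# sbatch --export=ALL,sim_run=last --dependency=afterok:102 run_sim.sh
# Replace 101 and 102 with the job IDs printed by the preceding sbatch calls.
```

Including ALL in --export keeps the rest of the submitting shell's environment (such as PATH) available to the job alongside sim_run.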