add_simulation | R Documentation |

`add_reftable` creates or augments a reference table of simulations, and formats the results appropriately for further use. The user does not have to think about this return format. Instead, they only have to think about the very simple return format of the function given as its `Simulate` argument. Alternatively, if the simulation function cannot be called directly by the R code, simulated distributions can be added easily using the `newsimuls` argument, again using a simple format (see `onedistrib` in the Examples). Finally, a generic data frame of simulations can be reformatted as a reference table by using only the `reftable` argument.

`add_reftable` is a wrapper for `add_simulation`, enforcing `nRealizations=1`. The distinct features of `add_simulation` were conceived for the first workflow implemented in `Infusion` but are somewhat obsolete now.

These functions can run simulations in a parallel environment. Depending on the arguments, parallel or serial computation is performed. When parallelization is implied, a “socket” cluster, available on all operating systems, is used by default. Special care is then needed to ensure that all required packages are loaded in the called processes, and that all required variables and functions are passed to them: check the `packages` and `env` arguments. For socket clusters, `foreach` or `pbapply` is called depending on whether the `doSNOW` package is attached (`doSNOW` allows more efficient load balancing than `pbapply`).
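As a rough illustration of the `packages` and `env` arguments, the sketch below prepares both values in base R for a simulation function that relies on a helper. The names `my_helper`, `my_simulate`, `worker_env` and `worker_pkgs` are hypothetical, `packages` is assumed here to accept a character vector of package names, and the `add_simulation()` call is left commented out because it needs Infusion and a parameter grid.

```r
# Hypothetical helper used inside the simulation function.
my_helper <- function(x) c(mean = mean(x), var = var(x))

# Hypothetical simulation function; on a worker process it would fail
# unless 'my_helper' has been exported there.
my_simulate <- function(mu, s2, sample.size) {
  my_helper(rnorm(n = sample.size, mean = mu, sd = sqrt(s2)))
}

# Environment holding the extra objects that the workers need.
worker_env <- list2env(list(my_helper = my_helper))

# Names of packages to load on the workers (assumed format: character vector).
worker_pkgs <- c("stats")

# Sketch of the corresponding call (not run; assumes Infusion is attached
# and that 'parsp' is a parameter grid):
# simuls <- add_simulation(NULL, Simulate = "my_simulate", par.grid = parsp,
#                          nRealizations = 500, nb_cores = 2,
#                          packages = worker_pkgs, env = worker_env)
```

Keeping the helper in a dedicated environment, rather than relying on the global workspace, makes it explicit which objects the simulation function depends on.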

    add_simulation(simulations=NULL, Simulate, par.grid=NULL,
                   nRealizations = NULL, newsimuls = NULL,
                   verbose = interactive(), nb_cores = NULL, packages = NULL,
                   env = NULL, control.Simulate=NULL, cluster_args=list(),
                   cl_seed=NULL, ...)

    add_reftable(reftable=NULL, ...)
    # '...' handling the same arguments as add_simulation()
    # except 'simulations' and 'nRealizations'

`reftable` |
Data frame: a reference table. Each row contains the parameter values of a simulated realization of the data-generating process, and the simulated summary statistics.
As parameters should be told apart from statistics by Infusion functions, information about parameter names should be attached to the |

`simulations` |
Same features as |

`nRealizations` |
The number of simulated samples of summary statistics, for each empirical distribution (each row of |

`Simulate` |
An *R* function, or the name (as a character string) of an *R* function used to generate empirical distributions of summary statistics. When an external simulation program is called, |

`par.grid` |
A data frame of which each line matches the single vector argument of |

`newsimuls` |
If the function used to generate empirical distributions cannot be called by R, then |

`nb_cores` |
Number of cores for parallel simulation; |

`cluster_args` |
A list of arguments, passed to |

`verbose` |
Whether to print some information or not. |

`...` |
For |

`control.Simulate` |
A list, used as an exclusive alternative to “...” to pass additional arguments to |

`packages` |
For parallel evaluation: Names of additional libraries to be loaded on the cores, necessary for |

`env` |
For parallel evaluation: an environment containing additional objects to be exported on the cores, necessary for |

`cl_seed` |
(all parallel contexts:) Integer, or NULL. If an integer, it is used to initialize |

The `newsimuls` argument should have the same structure as the return value of the function itself, except that `newsimuls` may include only a subset of the attributes returned by the function. **In the reference-table case**, it is thus a data frame; its required attributes are `LOWER` and `UPPER`, which are named vectors giving bounds for the parameters which are variable in the whole analysis (note that the names identify these parameters in case this information is not available otherwise from the arguments). The values in these vectors may be incorrect in the sense of failing to bound the parameters in `newsimuls`, as the actual bounds are then corrected using parameter values in `newsimuls` and attributes from `simulations`. **Otherwise**, `newsimuls` should be a list of matrices, each with a `par` attribute (see Examples). Rows of each matrix stand for simulation replicates and columns stand for the different summary statistics.
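The two `newsimuls` formats described here can be sketched in base R as follows. This is only an illustration of the data layout under stated assumptions: the toy simulator mirrors `myrnorm` from the Examples, the object names `newsimuls_list` and `newsimuls_df` are hypothetical, and the parameter ranges are arbitrary.

```r
# Toy simulator: returns a named vector of two summary statistics.
myrnorm <- function(mu, s2, sample.size) {
  s <- rnorm(n = sample.size, mean = mu, sd = sqrt(s2))
  c(mean = mean(s), var = var(s))
}
set.seed(123)

# (1) List-of-matrices format: rows are replicates, columns are statistics,
#     and each matrix carries a "par" attribute.
onedistrib <- t(replicate(100, myrnorm(mu = 1, s2 = 1, sample.size = 10)))
attr(onedistrib, "par") <- c(mu = 1, s2 = 1, sample.size = 10)
newsimuls_list <- list(example = onedistrib)

# (2) Reference-table format: one row per simulated realization, holding
#     parameter values and statistics, plus LOWER and UPPER attributes.
pars <- data.frame(mu = runif(20, 0, 2), s2 = runif(20, 0.5, 2))
stats <- t(mapply(myrnorm, mu = pars$mu, s2 = pars$s2,
                  MoreArgs = list(sample.size = 10)))
newsimuls_df <- cbind(pars, stats)
attr(newsimuls_df, "LOWER") <- c(mu = 0, s2 = 0.5)  # names identify parameters
attr(newsimuls_df, "UPPER") <- c(mu = 2, s2 = 2)
```

Either object would then be passed as the `newsimuls` argument; only the names in `LOWER` and `UPPER` distinguish parameter columns from statistic columns in the second format.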

When `nRealizations`>1L, if `nb_cores` is unnamed or has name `"replic"` and if the simulation function does not return a single table for all replicates (thus, if `nRealizations` is **not** a named integer of the form “`c(as_one=.)`”), parallelisation is over the different samples for each parameter value (and the seed of the random number generator is not controlled in a parallel context). For any other explicit name (e.g., `nb_cores=c(foo=7)`), or if `nRealizations` is a named integer of the form “`c(as_one=.)`”, parallelisation is over the parameter values (the rows of `par.grid`). In all cases, the progress bar is over parameter values. See Details in `Infusion.options` for the subtle way these different cases are distinguished in the progress bar.
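The rule above can be restated as a small decision function. This is a sketch of the documented behaviour only, not Infusion's internal code: `parallel_mode` is a hypothetical helper, and it assumes `nRealizations` is greater than 1.

```r
# Hypothetical helper restating the documented rule for nRealizations > 1.
parallel_mode <- function(nb_cores, nRealizations) {
  core_name <- names(nb_cores)                      # NULL when unnamed
  as_one <- identical(names(nRealizations), "as_one")
  over_replicates <-
    (is.null(core_name) || identical(core_name, "replic")) && !as_one
  if (over_replicates) "replicates" else "par.grid rows"
}

parallel_mode(7, 500)                  # unnamed nb_cores
parallel_mode(c(replic = 7), 500)      # name "replic"
parallel_mode(c(foo = 7), 500)         # any other explicit name
parallel_mode(7, c(as_one = 500))      # 'as_one' form of nRealizations
```

The first two calls correspond to parallelisation over replicates, where the random number generator seed is not controlled; the last two correspond to parallelisation over the rows of `par.grid`.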

Using a FORK cluster with `nRealizations`>1 triggers a warning that this combination is unreliable: in particular, anyone trying it should check whether other desired controls, such as the random generator seed or the progress bar, are effective.

If only one realization is computed for each (vector-valued) parameter, a data.frame (with additional attributes) is returned. Otherwise, the return value is an object of class `EDFlist`, which is a list-with-attributes of matrices-with-attribute. Each matrix contains a simulated distribution of summary statistics for given parameters, and the `"par"` attribute is a 1-row data.frame of parameters. If `Simulate` is used, this must give all the parameters to be estimated; otherwise it must at least include all variable parameters in this **or later** simulations to be appended to the simulation list.

The value has the following attributes: `LOWER` and `UPPER`, which are each a vector of per-parameter minima and maxima deduced from any `newsimuls` argument, and optionally any of the arguments `Simulate, control.Simulate, packages, env, par.grid` and `simulations` (all corresponding to input arguments when provided, except that the actual `Simulate` function is returned even if it was input as a name).
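To make the described structure more concrete, the following base-R sketch builds a mock list of matrices, each with a 1-row data.frame `"par"` attribute, and derives per-parameter minima and maxima from it. The object is a stand-in written for illustration (it is not produced by Infusion and does not carry the `EDFlist` class), and `make_mock`, `mock_simuls`, `lower` and `upper` are hypothetical names.

```r
# Mock matrix of simulated statistics with a 1-row data.frame "par" attribute.
make_mock <- function(mu, s2) {
  m <- cbind(mean = rnorm(50, mean = mu),
             var = rchisq(50, df = 9) * s2 / 9)
  attr(m, "par") <- data.frame(mu = mu, s2 = s2)
  m
}
set.seed(1)
mock_simuls <- list(make_mock(1, 0.5), make_mock(2, 1), make_mock(3, 2))

# Collect the parameter points, one row per simulated distribution.
pars <- do.call(rbind, lapply(mock_simuls, function(m) attr(m, "par")))

# Per-parameter minima and maxima, analogous to LOWER and UPPER.
lower <- sapply(pars, min)
upper <- sapply(pars, max)
```

On a real return value, the corresponding bounds would be read with `attr(simuls, "LOWER")` and `attr(simuls, "UPPER")` rather than recomputed.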

    # example of building a list of simulations from scratch:
    myrnorm <- function(mu,s2,sample.size) {
      s <- rnorm(n=sample.size,mean=mu,sd=sqrt(s2))
      return(c(mean=mean(s),var=var(s)))
    }
    set.seed(123)
    onedistrib <- t(replicate(100,myrnorm(1,1,10))) # toy example of simulated distribution
    attr(onedistrib,"par") <- c(mu=1,sigma=1,sample.size=10) ## important!
    simuls <- add_simulation(NULL, Simulate="myrnorm", nRealizations=500,
                             newsimuls=list("example"=onedistrib))

    # standard use: simulation over a grid of parameter values
    parsp <- init_grid(lower=c(mu=2.8,s2=0.2,sample.size=40),
                       upper=c(mu=5.2,s2=3,sample.size=40))
    simuls <- add_simulation(NULL, Simulate="myrnorm", nRealizations=500,
                             par.grid = parsp[1:7,])

    ## Not run:
    # example continued: parallel versions of the same
    # Slow computations, notably because cluster setup is slow.

    # ... parallel over replicates, serial over par.grid rows
    #     => cl_seed has no effect and can be ignored
    simuls <- add_simulation(NULL, Simulate="myrnorm", nRealizations=500,
                             par.grid = parsp[1:7,], nb_cores=7)
    #
    # ... parallel over 'par.grid' rows => cl_seed is effective
    simuls <- add_simulation(NULL, Simulate="myrnorm", nRealizations=500,
                             cl_seed=123, # for repeatable results
                             par.grid = parsp[1:7,], nb_cores=c(foo=7))

    ## End(Not run)

    ####### Example where a single 'Simulate' returns all replicates:

    myrnorm_tab <- function(mu,s2,sample.size, nsim) {
      ## By default, Infusion.getOption('nRealizations') would fail on nodes!
      replicate(nsim, myrnorm(mu=mu,s2=s2,sample.size=sample.size))
    }

    parsp <- init_grid(lower=c(mu=2.8,s2=0.2,sample.size=40),
                       upper=c(mu=5.2,s2=3,sample.size=40))

    # 'as_one' syntax for 'Simulate' function returning a simulation table:
    simuls <- add_simulation(NULL, Simulate="myrnorm_tab",
                             nRealizations=c(as_one=500),
                             nsim=500, # myrnorm_tab() argument, part of the 'dots'
                             par.grid=parsp)

    ## Not run:
    # example continued: parallel versions of the same.
    # Slow cluster setup again
    simuls <- add_simulation(NULL,Simulate="myrnorm_tab",par.grid=parsp,
                             nb_cores=7L,
                             nRealizations=c(as_one=500),
                             nsim=500, # myrnorm_tab() argument again
                             cl_seed=123, # for repeatable results
                             # need to export other variables used by *myrnorm_tab* to the nodes:
                             env=list2env(list(myrnorm=myrnorm)))

    ## End(Not run)

    ## see main documentation page for the package for other typical usage
