add.simulation: Create or augment a list of simulated distributions of...

Description Usage Arguments Details Value Examples

Description

add_simulation creates or augments a list of simulated distributions of summary statistics, and formats the simulation results appropriately for further use. The user does not have to think about this return format. Instead, s-he only has to think about the very simple return format of the function given as its Simulate argument. Alternatively, if the simulation function cannot be called directly by the R code, simulated distributions can be added easily using the newsimuls argument, again using a simple format (see onedistrib in the Examples).

add_reftable is a wrapper for add_simulation, enforcing nRealizations=1: see example example_reftable.

These functions can run simulations in a parallel environment. Special care is then needed to ensure that all required packages are loaded in the called processes, and required all variables and function are passed therein: check the packages and env arguments.

Usage

1
2
3
4
5
add_simulation(simulations=NULL, Simulate, par.grid=NULL, 
               nRealizations = NULL, newsimuls = NULL, 
               verbose = interactive(), nb_cores = NULL, packages = NULL, env = NULL,
               control.Simulate=NULL, cluster_args=list(), ...)
add_reftable(...)

Arguments

simulations

A list of simulations

nRealizations

The number of simulated samples of summary statistics, for each empirical distribution (each row of par.grid). If the argument is NULL, the value is obtained by Infusion.getOption. If the argument is not NULL, Infusion.options(nRealizations) is set, but restored, on exit from add_simulation, to its initial value.

Simulate

An *R* function, or the name (as a character string) of an *R* function used to generate empirical distributions of summary statistics. When an external simulation program is called, Simulate must therefore be an R function wrapping the call to the external program. The function must have a single vector as argument, matching each row of par.grid. It must return a vector of summary statistics with named vector members; or a single matrix of nRealizations simulations, in which case its rows and row names, must represent the summary statistics, it should have nRealizations columns, and nRealizations should be named integer of the form “c(as_one=.)” (see Examples).

par.grid

A data frame of which each line matches the single vector argument of Simulate.

newsimuls

If the function used to generate empirical distributions cannot be called by R, then newsimuls can be used to provide these distributions. See Details for the structure of this argument.

nb_cores

Number of cores for parallel simulation; NULL or integer value, actin as a shortcut for cluster_args$spec. If nb_cores is unnamed or has name "replic" and if the simulation function does not return a single table for all replicates (thus, if nRealizations is not a named integer of the form “c(as_one=.)”, parallelisation is over the different samples for each parameter value. For any other explicit name (e.g., nb_cores=c(foo=7)), or if nRealizations is a named integer of the form “c(as_one=.)”, parallelisation is over the parameter values (the rows of par.grid). In all cases, the progress bar is over parameter values. See Details in Infusion.options for the subtle way these different cases are distinguished in the progress bar.

cluster_args

A list of arguments, passed to makeCluster. May contain a non-null spec element, in which case the distinct nb_cores argument is ignored.

verbose

Whether to print some information or not.

...

For add_reftable: arguments passed to add_simulation. Any of the add_simulation arguments is valid, except nRealizations. For add_simulation: additional arguments passed to Simulate, beyond the parameter vector; see nsim argument of myrnorm_tab() in the Examples. These arguments should be constant through all the simulation workflow.

control.Simulate

A list, used as an exclusive alternative to “...” to pass additional arguments to Simulate, beyond the parameter vector. The list must contain the same elements as would go in the “...”.

packages

For parallel evaluation: Names of additional libraries to be loaded on the cores, necessary for Simulate evaluation.

env

For parallel evaluation: an environment containing additional objects to be exported on the cores, necessary for Simulate evaluation.

Details

The newsimuls argument should have the same structure as the return value of add_simulation itself, except that newsimuls may include only a subset of the attributes returned by add_simulation. In the reference-table case, it is thus a data frame; its required attributes are LOWER and UPPER which are named vectors giving bounds for the parameters which are variable in the whole analysis (note that the names identify these parameters in the case this information is not available otherwise from the arguments). The values in these vectors may be incorrect in the sense of failing to bound the parameters in the newsimuls, as the actual bounds are then corrected using parameter values in newsimuls and attributes from simulations. Otherwise, newsimuls should be list of matrices, each with a par attribute (see Examples). Rows of each matrix stand for simulation replicates and columns stand for the different summary statistics.

Value

If only one realization is computed for each (vector-valued) parameter, a data.frame (with additional attributes) is returned. Otherwise, the return value is an objet of class EDFlist, which is a list-with-attributes of matrices-with-attribute. Each matrix contains a simulated distribution of summary statistics for given parameters, and the "par" attribute is a 1-row data.frame of parameters. If Simulate is used, this must give all the parameters to be estimated; otherwise it must at least include all variable parameters in this or later simulations to be appended to the simulation list.

The value has the following attributes: LOWER and UPPER which are each a vector of per-parameter minima and maxima deduced from any newsimuls argument, and optionally any of the arguments Simulate, control.Simulate, packages, env, par.grid and simulations (all corresponding to input arguments when provided, except that the actual Simulate function is returned even if it was input as a name).

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
# example of building a list of simulations from scratch:
myrnorm <- function(mu,s2,sample.size) {
  s <- rnorm(n=sample.size,mean=mu,sd=sqrt(s2))
  return(c(mean=mean(s),var=var(s)))
}
set.seed(123)
onedistrib <- t(replicate(100,myrnorm(1,1,10))) # toy example of simulated distribution
attr(onedistrib,"par") <- c(mu=1,sigma=1,sample.size=10) ## important!
simuls <- add_simulation(NULL, Simulate="myrnorm", nRealizations=500,
                         newsimuls=list("example"=onedistrib))

# standard use: smulation over a grid of parameter values
parsp <- init_grid(lower=c(mu=2.8,s2=0.2,sample.size=40),
                   upper=c(mu=5.2,s2=3,sample.size=40))
simuls <- add_simulation(NULL, Simulate="myrnorm", nRealizations=500,
                         par.grid = parsp[1:7,])
                         
## Not run:  # example continued: parallel versions of the same
# Slow computations, notably because cluster setup is slow.

#    ... parallel over replicates, serial over par.grid rows
simuls <- add_simulation(NULL, Simulate="myrnorm", nRealizations=500,
                         par.grid = parsp[1:7,], nb_cores=7)
#    ... parallel over 'par.grid' rows
simuls <- add_simulation(NULL, Simulate="myrnorm", nRealizations=500,
                         par.grid = parsp[1:7,], nb_cores=c(foo=7))

## End(Not run)
                     
####### Example where a single 'Simulate' returns all replicates:

myrnorm_tab <- function(mu,s2,sample.size, nsim) {
  ## By default, Infusion.getOption('nRealizations') would fail on nodes!
  replicate(nsim, 
            myrnorm(mu=mu,s2=s2,sample.size=sample.size)) 
}

parsp <- init_grid(lower=c(mu=2.8,s2=0.2,sample.size=40),
                   upper=c(mu=5.2,s2=3,sample.size=40))

# 'as_one' syntax for 'Simulate' function returning a simulation table: 
simuls <- add_simulation(NULL, Simulate="myrnorm_tab",
              nRealizations=c(as_one=500),
              nsim=500, # myrnorm_tab() argument, part of the 'dots'
              par.grid=parsp)

## Not run:  # example continued: parallel versions of the same
# Slow cluster setup again
simuls <- add_simulation(NULL,Simulate="myrnorm_tab",par.grid=parsp,
              nb_cores=7L,
              nRealizations=c(as_one=500),
              nsim=500, # myrnorm_tab() argument again
              # need to export other variables used by *myrnorm_tab* to the nodes:
              env=list2env(list(myrnorm=myrnorm)))

## End(Not run)

## see main documentation page for the package for other typical usage

Infusion documentation built on Feb. 22, 2021, 9:08 a.m.