Carries out a parallel Monte Carlo simulation on R slaves spawned with the slavedaemon.R script. All results are returned to the master.


| Argument | Description |
| --- | --- |
| `n` | sample size. |
| `rand.gen` | the random data generating function. See the details section. |
| `rand.arg` | additional argument list to `rand.gen`. |
| `statistic` | the statistic function to be simulated. See the details section. |
| `nsim` | the number of simulations carried out on a slave, which is counted as one slave job. |
| `run` | the number of loops. See the details section. |
| `slaveinfo` | if TRUE, the number of jobs finished by each slave will be displayed. |
| `sim.seq` | if reproducing the same simulation is desirable, set it to the integer vector `.mpi.parSim` generated in a previous simulation. |
| `simplify` | logical; should the result be simplified to a vector or matrix if possible? |
| `comm` | a communicator number. |
| `...` | optional arguments to `statistic`. |

It is assumed that one simulation is carried out as `statistic(rand.gen(n))`, where `rand.gen(n)` can return any values as long as `statistic` can take them. Additional arguments can be passed to `rand.gen` by `rand.arg` as a list. Optional arguments can also be passed to `statistic` by the argument `...`.

Each slave job consists of `replicate(nsim, statistic(rand.gen(n)))`, i.e., each job runs `nsim` simulations. The returned values are transported from the slaves to the master.
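The structure of a single slave job can be tried out serially before any cluster is involved. The following is a minimal sketch in base R, not Rmpi's internal code: the helper `one_sim`, the use of `do.call` to combine `n` with `rand.arg`, and all argument values are assumptions made for illustration.

```r
# Serial sketch of a single slave job; no MPI is involved.
set.seed(1)                        # arbitrary seed, for illustration only

n         <- 50                    # sample size
rand.gen  <- rnorm                 # random data generating function
rand.arg  <- list(mean = 2)        # extra arguments for rand.gen, given as a list
statistic <- function(x, trim = 0) mean(x, trim = trim)
nsim      <- 200                   # simulations per slave job

# One simulation: statistic(rand.gen(n)), with rand.arg and ... filled in.
# one_sim is a hypothetical helper, not part of Rmpi.
one_sim <- function(...) {
  x <- do.call(rand.gen, c(list(n), rand.arg))
  statistic(x, ...)
}

# One slave job: nsim repetitions of a single simulation.
job <- replicate(nsim, one_sim(trim = 0.1))
length(job)
```

Checking `statistic` and `rand.gen` this way first makes it easier to separate errors in the simulation itself from errors in the MPI setup.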

The total number of simulations (TNS) is calculated as follows. Let slave.num be the total number of slaves in a `comm`, which is `mpi.comm.size(comm)-1`. Then TNS=slave.num*nsim*run and the total number of slave jobs is slave.num*run, where `run` is the number of loops from the master's perspective. If run=1, each slave will run one slave job. If run=2, each slave will run two slave jobs on average, and so on.

The purpose of `run` is twofold. It allows the slave job size and the total number of slave jobs to be tuned for two different cluster environments. On a cluster of slaves with equal CPU power, `run=1` is often enough. But if `nsim` is too big, one can set `run=2` and the slave job size to `nsim/2` so that TNS=slave.num*(nsim/2)*(2*run). This may improve R computation efficiency slightly. On a cluster of slaves with different CPU power, one can choose a big value of `run` and a small value of `nsim` so that the master can dispatch more jobs to slaves that run faster than others. This keeps all slaves busy so that load balancing is achieved.
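The bookkeeping described above can be checked with plain arithmetic. In this sketch the value of `slave.num` is hypothetical; on a running cluster it would be `mpi.comm.size(comm)-1`.

```r
slave.num <- 4       # hypothetical number of slaves
nsim      <- 1000    # simulations per slave job
run       <- 1       # number of loops from the master's perspective

total_jobs <- slave.num * run          # total number of slave jobs
TNS        <- slave.num * nsim * run   # total number of simulations

# Halving the job size and doubling the number of loops leaves TNS unchanged
# while doubling the number of jobs the master can hand out.
nsim_small <- nsim / 2
run_more   <- 2 * run
total_jobs_split <- slave.num * run_more
TNS_split        <- slave.num * nsim_small * run_more

c(jobs = total_jobs, TNS = TNS, jobs_split = total_jobs_split, TNS_split = TNS_split)
```

Smaller jobs in larger numbers give the master more chances to hand work to whichever slave finishes first, which is the load-balancing effect described for clusters with unequal CPU power.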

The sequence of slaves that deliver results to the master is saved into `.mpi.parSim`. It keeps track of which part of the results was produced by which slave. `.mpi.parSim` can be used to reproduce the same simulation result if the same seed is used and the argument `sim.seq` is equal to `.mpi.parSim`.

See the warning section before you use `mpi.parSim`.

The returned values depend on the values returned by `replicate` of `statistic(rand.gen(n))` and on the total number of simulations (TNS). If `statistic` returns a single value, then the result is a vector of length TNS. If `statistic` returns a vector (list) of length `nrow`, then the result is a matrix of dimension `c(nrow, TNS)`.
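The two result shapes can be previewed serially with `replicate` alone. This is only an illustration: a single `replicate` call stands in for the combined output of all slave jobs, and the value of `TNS` and the statistics used are hypothetical.

```r
set.seed(1)   # arbitrary seed, for illustration only
TNS <- 6      # hypothetical total number of simulations

# A statistic returning a single value gives a vector of length TNS.
scalar_res <- replicate(TNS, mean(rnorm(10)))

# A statistic returning a vector of length 2 gives a matrix of
# dimension c(2, TNS), with one column per simulation.
vector_res <- replicate(TNS, range(rnorm(10)))

length(scalar_res)
dim(vector_res)
```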

It is assumed that a parallel RNG is used on all slaves. Run `mpi.setup.rngstream` on the master to set up a parallel RNG. Though `mpi.parSim` works without a parallel RNG, the quality of the simulation is not guaranteed.

`mpi.parSim` will automatically transfer `rand.gen` and `statistic` to the slaves. However, any functions that `rand.gen` and `statistic` rely on but that are not on the slaves must be transferred to the slaves before using `mpi.parSim`. You can use `mpi.bcast.Robj2slave` for that purpose. The same applies to required packages or C/Fortran code. You can either use `mpi.bcast.cmd` or put `require(package)` and/or `dyn.load(so.lib)` into `rand.gen` and `statistic`.
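Put together, a session that follows the advice above might look like the sketch below. It is not taken from the package documentation. The wrapper `run_parsim_sketch`, the helper `trim_outliers`, the statistic `my_stat`, and all argument values are hypothetical. `mpi.spawn.Rslaves` and `mpi.close.Rslaves` are Rmpi functions not mentioned in the text above, and running the wrapper requires a working MPI installation with Rmpi.

```r
# Hypothetical helper that the statistic relies on; it must exist on the slaves.
trim_outliers <- function(x) x[abs(x - median(x)) <= 3 * mad(x)]

# Hypothetical statistic: mean after dropping points far from the median.
my_stat <- function(x) mean(trim_outliers(x))

# Hypothetical wrapper so that nothing touches MPI until it is called.
run_parsim_sketch <- function(nslaves = 4) {
  library(Rmpi)
  mpi.spawn.Rslaves(nslaves = nslaves)        # start the R slaves
  on.exit(mpi.close.Rslaves(), add = TRUE)    # shut them down when done

  mpi.setup.rngstream()                       # parallel RNG on all slaves

  # my_stat is transferred by mpi.parSim, but its helper is not.
  mpi.bcast.Robj2slave(trim_outliers)

  mpi.parSim(n = 100, rand.gen = rnorm, rand.arg = list(mean = 1),
             statistic = my_stat, nsim = 500, run = 2)
}

# res <- run_parsim_sketch()   # uncomment on a machine with MPI and Rmpi
```

Defining the helper and the statistic at top level keeps their environments simple, so only the functions themselves are sent to the slaves.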

If `simplify` is TRUE, sapply-style simplification is applied. Otherwise a list of length slave.num*run is returned.

Hao Yu

`mpi.setup.rngstream`

`mpi.bcast.cmd`

`mpi.bcast.Robj2slave`
