```{css echo=FALSE} img { border: 0px !important; margin: 2em 2em 2em 2em !important; } code { border: 0px !important; }
```r knitr::opts_chunk$set( cache = FALSE, echo = TRUE, collapse = TRUE, comment = "#>" ) options(clustermq.scheduler = "local") library(clustermq)
The main worker functions are wrapped in an R6 class with the name of QSys
.
This provides a standardized API to the lower-level
messages
that are sent via ZeroMQ.
The base class itself is derived in scheduler classes that add the required functions for submitting and cleaning up jobs:
+ QSys |- Multicore |- LSF + SGE |- PBS |- Torque |- etc.
The user-visible object is a worker Pool
that wraps this, and will eventually
allow to manage different workers.
A pool of workers can be created using the workers()
function, which
instantiates a Pool
object of the corresponding QSys
-derived scheduler
class. See ?workers
for details.
# start up a pool of three workers using the default scheduler w = workers(n_jobs=3) # if we make an unclean exit for whatever reason, clean up the jobs on.exit(w$cleanup())
For workers that are started up via a scheduler, we do not know which machine they will run on. This is why we start up every worker with a TCP/IP address of the master socket that will distribute work.
This is achieved by the call to R common to all schedulers:
```{sh eval=FALSE} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'
#### Worker communication On the master's side, we wait until a worker connects: ```r msg = w$recv() # this will block until a worker is ready
We can then send any expression to be evaluated on the worker using the send
method:
w$send(expression, ...)
After the expression (in ...
), any variables that should be passed along with
the call can be added. For batch processing that clustermq
usually does, this
command is work_chunk
, where the chunk
data is added:
w$send(clustermq:::work_chunk(chunk, fun, const, rettype, common_seed), chunk = chunk(iter, submit_index))
We can add any number of objects to a worker environment using the env
method:
w$env(object=value, ...)
This will also invisibly return a data.frame
with all objects currently in
the environment. If a user wants to inspect the environment without changing it
they can call w$env()
without arguments. The environment will be propagated
to all workers automatically in a greedy fashion.
Putting the above together in an event loop, we get what is essentially
implemented in master
. Note that your expression
to evaluate needs to track
its origin (this may change in the future).
w = workers(3) on.exit(w$cleanup()) w$env(...) while (we have new work to send || jobs pending) { res = w$recv() # handle result if (more work) w$send(expression, ...) else w$send_shutdown() }
A loop of a similar structure can be used to extend clustermq
. As an example,
this was done by the targets
package.
Communication between the master
(main event loop) and workers (QSys
base
class) is organised in messages. These are chunks of serialized data sent via
ZeroMQ's protocol (ZMTP). The parts of each message are called frames.
The master requests an evaluation in a message with X frames (direct) or Y if proxied. This is all handled by clustermq internally.
wlife_t
)If using a proxy, this will be followed by a SEXP
that contains variable
names the proxy should add before forwarding to the worker.
A worker evaluates the call using the R C API:
R_tryEvalSilent(cmd, env, &err);
If an error occurs in this evaluation will be returned as a structure with
class worker_error
. If a developer wants to catch errors and warnings in a
more fine-grained manner, it is recommended to add their own callingHandlers
to cmd
(as clustermq does work its work_chunk
).
The result of this evaluation is then returned in a message with four (direct) or five (proxied) frames:
ZMQ_REQ
socket)ZMQ_REQ
socket)wlife_t
) that is handled internally by clustermqSEXP
), visible to the userIf using a worker via SSH, these frames will be preceded by a routing identify frame that is handled internally by ZeroMQ and added or peeled off by the proxy.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.