In mrc-ide/rrq: Simple Redis Queue

knitr::opts_chunk$set(
  error = FALSE,
  collapse = TRUE,
  comment = "#>"
)

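# Print lines inside a fenced block tagged with the given language
lang_output <- function(x, lang) {
  cat(c(sprintf("```%s", lang), x, "```"), sep = "\n")
}
# Create a function that, on each call, prints only the lines added to
# `filename` since the previous call
make_reader <- function(filename, lang = "plain") {
  n <- 0  # number of lines already shown
  force(filename)
  force(lang)
  function() {
    txt <- readLines(filename)
    if (n > 0) {
      txt <- txt[-seq_len(n)]  # drop lines printed on earlier calls
    }
    n <<- n + length(txt)
    lang_output(txt, lang)
  }
}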

In addition to passing tasks (and results) between a controller and workers, the controller can also send "messages" to workers. This vignette shows what the possible messages do.

In order to do this, we're going to need a queue and a worker:

library(rrq)
id <- paste0("rrq:", ids::random_id(bytes = 4))
obj <- rrq_controller2(id)
rrq_default_controller_set(obj)
logdir <- tempfile()
w <- rrq_worker_spawn2(logdir = logdir)
worker_id <- w$id

On startup the worker log contains:

reader <- make_reader(file.path(logdir, worker_id))
reader()

Because one of the main effects of messages is to print to the worker logfile, we'll print this fairly often.

Messages and responses

The queue sends a message for one or more workers to process. The message has an identifier that is derived from the current time. Messages are written to a first-in-first-out queue, per worker, and are processed independently by workers who do not look to see if other workers have messages or are processing them.
As soon as a worker has finished processing any current job it will process the message (it must wait to finish a current job but will not start any further jobs).
Once the message has been processed (see below) a response will be written to a response list with the same identifier as the message.
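
To make the flow above concrete, here is a minimal in-memory sketch of per-worker first-in-first-out message queues with a response list keyed by message identifier. It is a toy model only and not how rrq is implemented (rrq stores this state in Redis); the names `new_toy_queue`, `send`, `process_next` and `get_response` are hypothetical, and a simple counter stands in for the time-derived identifier.

```r
# Toy model of the messaging scheme; NOT rrq's implementation.
# All function names here are hypothetical.
new_toy_queue <- function(worker_ids) {
  # One FIFO inbox per worker
  inbox <- setNames(lapply(worker_ids, function(id) list()), worker_ids)
  responses <- list()
  counter <- 0L

  send <- function(command, to = worker_ids) {
    counter <<- counter + 1L
    # Stand-in for the time-derived identifier
    message_id <- sprintf("msg-%03d", counter)
    for (w in to) {
      msg <- list(id = message_id, command = command)
      inbox[[w]] <<- c(inbox[[w]], list(msg))  # append to the end
    }
    message_id
  }

  process_next <- function(worker_id) {
    pending <- inbox[[worker_id]]
    if (length(pending) == 0) {
      return(invisible(NULL))
    }
    msg <- pending[[1]]                # oldest message first
    inbox[[worker_id]] <<- pending[-1]
    result <- switch(msg$command, PING = "PONG", "OK")
    # Store the response under the message id, named by worker
    r <- responses[[msg$id]]
    if (is.null(r)) {
      r <- list()
    }
    r[[worker_id]] <- result
    responses[[msg$id]] <<- r
    invisible(msg$id)
  }

  get_response <- function(message_id) responses[[message_id]]

  list(send = send, process_next = process_next, get_response = get_response)
}

queue <- new_toy_queue(c("w1", "w2"))
ping_id <- queue$send("PING")             # goes to both workers
echo_id <- queue$send("ECHO", to = "w1")  # queued behind PING, for w1 only
queue$process_next("w1")                  # w1 handles PING; w2 has not yet
queue$get_response(ping_id)               # only w1 has responded so far
```

Each worker drains only its own inbox, which is why one worker can have responded to a message while another has not yet seen it.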

Some messages interact with the worker timeout:

PING, ECHO, EVAL, INFO, PAUSE, RESUME and REFRESH will reset the timer, as if a task had been run
TIMEOUT_SET explicitly interacts with the timer
TIMEOUT_GET does not reset the timer, reporting the remaining time
STOP causes the worker to exit, so has no interaction with the timer

`PING`

The PING message simply asks the worker to return PONG. It's useful for diagnosing communication issues because it does so little.

message_id <- rrq_message_send("PING")

The message id is going to be useful for getting responses:

message_id

The identifier is derived from the current time according to Redis, which is the central reference point of time for the whole system.

reader()

The logfile prints:

the request for the PING (MESSAGE PING)
the value PONG to the R message stream
logging a response (RESPONSE PONG), which means that something is written to the response stream.

We can access the same bits of information in the worker log:

rrq_worker_log_tail(n = Inf)

This includes the ALIVE message as the worker comes up.

Inspecting the logs is fine for interactive use, but it is often more useful to poll for a response.

We already know that our worker has a response, but we can ask anyway:

rrq_message_has_response(message_id)

Or, inversely, we can ask which messages a given worker has responses for:

rrq_message_response_ids(worker_id)

To fetch the responses from all workers it was sent to (always returning a named list):

rrq_message_get_response(message_id)

or to fetch the response from a given worker:

rrq_message_get_response(message_id, worker_id)

The response can be deleted by passing delete = TRUE to this method:

rrq_message_get_response(message_id, worker_id, delete = TRUE)

after which requesting the response again will throw an error:

rrq_message_get_response(message_id, worker_id)

There is also a timeout argument that lets you wait until a response is ready (as in rrq_task_wait()).

rrq_task_create_expr(Sys.sleep(2))
message_id <- rrq_message_send("PING")
rrq_message_get_response(
  message_id, worker_id, delete = TRUE, timeout = 10)

Looking at the log will show what went on here:

rrq_worker_log_tail(n = 4)

A task is received
2s later the task is completed
Then the message is received
Then, basically instantaneously, the message is responded to

However, because the message is only processed after the task is completed, the response takes a while to come back. Equivalently, from the worker log:

reader()
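
The waiting behaviour described above can also be written by hand as a polling loop. The sketch below is a generic helper, not part of rrq: `poll_until` is a hypothetical name, and the demonstration uses a fake readiness check so that it runs without a Redis server.

```r
# Generic polling helper; `poll_until` is a hypothetical name, not an
# rrq function. Returns TRUE as soon as is_ready() is TRUE, or FALSE
# once `timeout` seconds have passed.
poll_until <- function(is_ready, timeout = 10, interval = 0.1) {
  deadline <- Sys.time() + timeout
  repeat {
    if (isTRUE(is_ready())) {
      return(TRUE)
    }
    if (Sys.time() >= deadline) {
      return(FALSE)
    }
    Sys.sleep(interval)
  }
}

# Demonstration with a stand-in check that succeeds on the third call
calls <- 0
ready_after_three <- function() {
  calls <<- calls + 1
  calls >= 3
}
poll_until(ready_after_three, timeout = 5, interval = 0.01)
```

With a live controller, the check might be `function() all(rrq_message_has_response(message_id))`; wrapping the result in `all()` is an assumption about the shape of the return value when several workers are involved. In most cases the built-in `timeout` argument is the simpler choice.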

`ECHO`

This is basically like PING and not very interesting; it prints an arbitrary string to the log. It always returns "OK" as a response.

message_id <- rrq_message_send("ECHO", "hello world!")
rrq_message_get_response(message_id, worker_id, timeout = 10)

reader()

`INFO`

The INFO command refreshes and returns the worker information.

We already have a copy of the worker info; it was created when the worker started up:

rrq_worker_info()[[worker_id]]

We can force the worker to refresh:

message_id <- rrq_message_send("INFO")

Here's the new worker information, complete with an updated envir field:

rrq_message_get_response(message_id, worker_id, timeout = 10)

`EVAL`

Evaluate an arbitrary R expression, passed as a string (not as any sort of unevaluated or quoted expression). This expression is evaluated in the global environment, which is not the environment in which queued code is evaluated.

message_id <- rrq_message_send("EVAL", "1 + 1")
rrq_message_get_response(message_id, worker_id, timeout = 10)

This could be used to evaluate code that has side effects, such as installing packages. However, due to limitations with how R loads packages, the only way to update and reload a package is to restart the worker.
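
As a rough local illustration of what "a string evaluated in the global environment" means, the sketch below parses and evaluates a string with base R. The helper name `eval_string` is hypothetical, and this is only an assumption about the general approach, not a description of the worker's actual code.

```r
# Hypothetical helper: parse a string and evaluate it in the global
# environment, returning the value of the last expression.
eval_string <- function(code) {
  eval(parse(text = code), envir = globalenv())
}

eval_string("1 + 1")  # 2

# Side effects land in the global environment, not the caller's
eval_string("toy_eval_flag <- TRUE")
exists("toy_eval_flag", envir = globalenv())  # TRUE
```

Because the code arrives as a plain string, nothing from the sender's session (local variables, loaded packages) is carried along with it.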

`PAUSE` / `RESUME`

The PAUSE / RESUME messages can be used to prevent workers from picking up new work (and then allowing them to start again).

rrq_worker_status()
message_id <- rrq_message_send("PAUSE")
rrq_message_get_response(message_id, worker_id, timeout = 10)
rrq_worker_status()

Once paused, workers ignore tasks, which stay on the queue:

t <- rrq_task_create_expr(runif(5))
rrq_task_status(t)

Sending a RESUME message unpauses the worker:

message_id <- rrq_message_send("RESUME")
rrq_message_get_response(message_id, worker_id, timeout = 10)
rrq_task_wait(t, 5)

`TIMEOUT_SET` / `TIMEOUT_GET`

Workers will quit after being left idle for more than a certain time; this is their timeout. Only processing tasks counts as work (not messages). You can query the timeout with TIMEOUT_GET and set it with TIMEOUT_SET. For our worker above the timeout is infinite; it will never quit:

rrq_message_send_and_wait("TIMEOUT_GET", worker_ids = worker_id)

We can set this to a finite value, in seconds:

rrq_message_send_and_wait("TIMEOUT_SET", 600, worker_ids = worker_id)

Here the timeout is set to 10 minutes (600s).

Once set, TIMEOUT_GET returns the length of time remaining before the worker exits:

rrq_message_send_and_wait("TIMEOUT_GET", worker_ids = worker_id)
Sys.sleep(5)
rrq_message_send_and_wait("TIMEOUT_GET", worker_ids = worker_id)
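
The relationship between the timeout and the remaining time can be sketched with a toy idle timer. This is an illustrative model under the assumptions stated in the comments, not rrq's implementation; `new_toy_timer` and its methods are hypothetical names, and times are plain numbers of seconds rather than real clock readings.

```r
# Toy idle timer; NOT rrq's implementation. Times are seconds on an
# arbitrary clock supplied by the caller.
new_toy_timer <- function(timeout, now = 0) {
  last_reset <- now
  list(
    # Called when a task, or a message that resets the timer, arrives
    reset = function(now) {
      last_reset <<- now
      invisible(NULL)
    },
    # Like TIMEOUT_GET: report time left without resetting anything
    remaining = function(now) max(0, timeout - (now - last_reset)),
    expired = function(now) (now - last_reset) >= timeout
  )
}

timer <- new_toy_timer(600, now = 0)  # like TIMEOUT_SET 600 at t = 0
timer$remaining(0)  # 600
timer$remaining(5)  # 595: querying does not reset the timer
timer$reset(5)      # e.g. a PING or a task arrives at t = 5
timer$remaining(5)  # 600
```

An infinite timeout fits the same model: `new_toy_timer(Inf)` never expires, matching a worker that never quits.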

One useful pattern is to send work to workers, then set the timeout to zero. This means that when work is complete they will exit (almost) immediately:

ids <- rrq_task_create_bulk_call(function(x) {
  Sys.sleep(0.5)
  runif(x)
}, 1:5)
rrq_message_send("TIMEOUT_SET", 0, worker_id)
rrq_task_wait(ids)
rrq_task_results(ids)

The worker will remain idle for 60s (by default), which is the length of time that one poll for work lasts, and then it will exit.

rrq_worker_status(worker_id)
rrq_message_send_and_wait("TIMEOUT_GET", worker_ids = worker_id)

rrq_destroy(timeout_worker_stop = 10)

Messages that are supported but used via wrappers

There are other messages that are typically sent through wrapper functions rather than directly:

REFRESH: requests that the worker refresh its evaluation environment. Typically used via rrq_worker_envir_set()
STOP: sent with an informational message as an argument, requests that the worker stop. Typically used via rrq_worker_stop()

mrc-ide/rrq documentation built on April 21, 2024, 4:18 a.m.
