observer: Creates an observer for an rrqueue

Description Usage Arguments Details Methods

Description

Creates an observer for an rrqueue. This is the "base class" for a couple of different objects in rrqueue; notably the queue object. So any method listed here also works within queue objects.

Usage

1
2
observer(queue_name = NULL, redis_host = "127.0.0.1", redis_port = 6379,
  config = NULL)

Arguments

queue_name

Name of the queue, if not given then it will check with the given Redis server to see if there is just a single queue known. In that case we connect to that queue. Otherwise we error and list possible queues.

redis_host

Redis hostname

redis_port

Redis port number

config

Configuration file of key/value pairs in yaml format. See the package README for an example. If given, additional arguments to this function override values in the file which in turn override defaults of this function.

Details

Most of the methods of the observer object are extremely simple and involve fetching information from the database about the state of tasks, environments and workers.

The method and argument names try to give hints about the sort of things they expect; a method asking for task_id expects a single task identifier, while those asking for task_ids expect a vector of task identifiers (and if they have a default NULL then will default to returning information for all task identifiers). Similarly, a method starting task_ applies to one task while a method starting tasks_ applies to multiple.

Methods

tasks_list

Return a vector of known task ids.

Usage: tasks_list()

Value: A character vector

tasks_status

Returns a named character vector indicating the task status.

Usage: tasks_status(task_ids = NULL, follow_redirect = FALSE)

Arguments:

task_ids

Optional vector of task identifiers. If omitted all tasks known to rrqueue will be used.

follow_redirect

should we follow redirects to get the status of any requeued tasks?

Value: A named character vector; the names will be the task ids, and the values are the status of each task. Possible status values are

PENDING

queued, but not run by a worker

RUNNING

being run on a worker, but not complete

COMPLETE

task completed successfully

ERROR

task completed with an error

ORPHAN

task orphaned due to loss of worker

REDIRECT

orphaned task has been redirected

MISSING

task not known (deleted, or never existed)

tasks_overview

High-level overview of the tasks in the queue; the number of tasks in each status.

Usage: tasks_overview()

tasks_times

returns a summary of times for a set of tasks

Usage: tasks_times(task_ids = NULL, unit_elapsed = "secs")

Arguments:

task_ids

Optional vector of task identifiers. If omitted all tasks known to rrqueue will be used.

unit_elapsed

Unit to use in computing elapsed times. The default is to use "secs". This is passed through to difftime so the units there are available and are "auto", "secs", "mins", "hours", "days", "weeks".

Value: A data.frame, one row per task, with columns

task_id

The task id

submitted

Time the task was submitted

started

Time the task was started, or NA if waiting

finished

Time the task was completed, or NA if waiting or running

waiting

Elapsed time spent waiting

running

Elapsed time spent running, or NA if waiting

idle

Elapsed time since finished, or NA if waiting or running

tasks_envir

returns the mapping of tasks to environmen

Usage: tasks_envir(task_ids = NULL)

Arguments:

task_ids

Optional vector of task identifiers. If omitted all tasks known to rrqueue will be used.

Value: A named character vector; names are the task ids and the value is the environment id associated with that task.

task_get

returns a task object associated with a given task identifier. This can be used to interrogate an individual task. See the help for task objects for more about these objects.

Usage: task_get(task_id)

Arguments:

task_id

A single task identifier

task_result

Get the result for a single task

Usage: task_result(task_id, follow_redirect = FALSE)

Arguments:

task_id

A single task identifier

follow_redirect

should we follow redirects to get the status of any requeued task?

tasks_groups_list

Returns list of groups known to rrqueue. Groups are assigned during task creation, or through the tasks_set_group method of link{queue}.

Usage: tasks_groups_list()

tasks_in_groups

Returns a list of tasks belonging to any of the groups listed.

Usage: tasks_in_groups(groups)

Arguments:

groups

A character vector of one or more groups (use tasks_groups_list to get a list of valid groups).

tasks_lookup_group

Look up the group for a set of tasks

Usage: tasks_lookup_group(task_ids = NULL)

Arguments:

task_ids

Optional vector of task identifiers. If omitted all tasks known to rrqueue will be used.

Value: A named character vector; names refer to task ids and the value is the group (or NA if no group is set for that task id).

task_bundle_get

Return a "bundle" of tasks that can be operated on together; see task_bundle

Usage: task_bundle_get(groups = NULL, task_ids = NULL)

Arguments:

groups

A vector of groups to include in the bundle

task_ids

A vector of task ids in the bundle. Unlike all other uses of task_ids here, only one of groups or task_ids can be provided, so if task_ids=NULL then task_ids is ignored and groups is used.

envirs_list

Return a vector of all known environment ids in this queue.

Usage: envirs_list()

envirs_contents

Return a vector of the environment contents

Usage: envirs_contents(envir_ids = NULL)

Arguments:

envir_ids

Vector of environment ids. If omitted then all environments in this queue are used.

Value: A list, each element of which is a list of elements

packages

a vector of packages loaded

sources

a vector of files explicitly sourced

source_files

a vector of files sourced including their hashes. This includes and files detected to be sourced by another file

envir_workers

Determine which workers are known to be able to process tasks in a particular environment.

Usage: envir_workers(envir_id, worker_ids = NULL)

Arguments:

envir_id

A single environment id

worker_ids

Optional vector of worker identifiers. If omitted all workers known to rrqueue will be used (currently running workers only).

Value: A named logical vector; TRUE if a worker can use an environment, named by the worker identifers.

workers_len

Number of workers that have made themselves known to rrqueue. There are situations where this is an overestimate and that may get fixed at some point.

Usage: workers_len()

workers_list

Returns a vector of all known worker identifiers (may include workers that have crashed).

Usage: workers_list()

workers_list_exited

Returns a vector of workers that are known to have exited. Workers leave behind most of the interesting bits of logs, times, etc, so these identifiers are useful for asking what they worked on.

Usage: workers_list_exited()

workers_status

Returns a named character vector indicating the task status.

Usage: workers_status(worker_ids = NULL)

Arguments:

worker_ids

Optional vector of worker identifiers. If omitted all workers known to rrqueue will be used (currently running workers only).

Value: A named character vector; the names will be the task ids, and the values are the status of each task. Possible status values are

IDLE

worker is idle

BUSY

worker is running a task

LOST

worker has been identified as lost by the workers_identify_lost of queue.

EXITED

worker has exited

PAUSED

worker is paused

workers_task_id

Returns the tasks that workers are currently processing (or NA for workers that are not known to be working on a task)

Usage: workers_task_id(worker_ids = NULL)

Arguments:

worker_ids

Optional vector of worker identifiers. If omitted all workers known to rrqueue will be used (currently running workers only).

Value: A named character vector. Names are the worker ids and value is the task id, or NA if no task is being worked on.

workers_times

returns a summary of times for a set of workers. This only returns useful information if the workers are running a heartbeat process, which requires the RedisHeartbeat package.

Usage: workers_times(worker_ids = NULL, unit_elapsed = "secs")

Arguments:

worker_ids

Optional vector of worker identifiers. If omitted all workers known to rrqueue will be used (currently running workers only).

unit_elapsed

Unit to use in computing elapsed times. The default is to use "secs". This is passed through to difftime so the units there are available and are "auto", "secs", "mins", "hours", "days", "weeks".

Value: A data.frame, one row per worker, with columns

worker_id

Worker identifier

expire_max

Maximum length of time before worker can be declared missing, in seconds

expire

Time until the worker will expire, in seconds

last_seen

Time since the worker was last seen

last_action

Time since the last worker action

workers_log_tail

Return the last few entries in the worker logs.

Usage: workers_log_tail(worker_ids = NULL, n = 1)

Arguments:

worker_ids

Optional vector of worker identifiers. If omitted all workers known to rrqueue will be used (currently running workers only).

n

Number of log entries to return. Use 0 or Inf to return all entries.

Value:

A data.frame with columns

worker_id

the worker identifier

time

time of the event

command

the command (e.g., MESSAGE, ALIVE)

message

The message associated with the command

workers_info

Returns a set of key/value information about workers. Includes things like hostnames, process ids, environments that can be run, etc. Note that this information is from the last time that the worker process registered an INFO command. This is registered at startup and after recieving a INFO message from a queue object. So the information may be out of date.

Usage: workers_info(worker_ids = NULL)

Arguments:

worker_ids

Optional vector of worker identifiers. If omitted all workers known to rrqueue will be used (currently running workers only).

Value: A list, each element of which is a worker_info

worker_envir

Returns an up-to-date list of environments a worker is capable of using (in contrast to the entry in workers_info that might be out of date.

Usage: worker_envir(worker_id)

Arguments:

worker_id

Single worker identifier


traitecoevo/rrqueue documentation built on May 31, 2019, 7:44 p.m.