knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Overview

The goal of redisProgress is to create a progress bar system to follow tasks distributed across several R instances (using foreach package for example). It uses Redis instance, a key-value store server https://redis.io/, to follow tasks progress.

As it mainly designed to follow tasks in a distributed settings, this package focuses on following progress of several tasks running in parallel.

Tasks are grouped in a common named queue and can be monitored together on the same screen using a console view. The name of the queue is your choice (in Redis names will be prefixed to reduce the risk of name collision with other systems.)

Usage

The package provides several functions :

Some utility functions:

Creating tasks queue

once created, the progress bar object is used to declare the start of each task and the progress state of each task (basically an integer value representing the number of achieved steps of the running task)

library(redisProgress)

redis = redis_client("rredis") # creating redis client, using rredis with default parameters

# Creating the queue 
progress = create_redis_progress("my-queue-name", redis=redis)

progress$start("job1", steps=100) # Starting job1 in the queue
progress$step(10)
progress$step(100) # Another step value

# Starting another task
progress$start("job2", steps=5) # Starting job1 in the queue
progress$incr() # Simply incrementing task step
progress$message("job2 is running")
progress$incr() # Simply incrementing task step

Using parallel tasks

With parallel tasks, for example with foreach

library(redisProgress)
library(parallel)
library(foreach)
library(doParallel)

registerDoParallel(3)

redis = redis_client("rredis", host="localhost")
progress = create_redis_progress("my-jobs", redis, unique.name=TRUE, publish = "jobs:clement")

data = foreach(job=1:20, .verbose = TRUE, .combine = c, .export = "progress") %dopar% {
    progress$start(paste0("task",job), steps=100)
    for(i in 1:100) {
        # Doing a step of my long running task
        # Sys.sleep(runif(1, min=5, max=30))
        progress$incr() # 
    }
}

stopImplicitCluster()

progress object is propagated across all R workers ensuring the tasks are registered in the same job queue and on the same Redis server.

In this example the parameter unique will create a unique queue name ensuring the queue is unique each time the program is run. The given queue name is used as a prefix the the generated unique queue name.

publish parameter allows to keep the list of started tasks in a redis list. Once a task is started, it is published in the list. The last element of this list is the last run of this script. This is useful when you don't know the queue name (like in this example, the queue name will be generated during it's creation)

Monitoring task progress of a queue

In another R instance, running tasks can be followed using redis_progress_monitor() function.

library(redisProgress)

redis = redis_client("rredis") 

# Known and fixed queue name
redis_progress_monitor(from="my-queue-name", redis=redis)

# Queue name has been generated and published in a list, use the "key" parameter in a list to get the real queue name from the list
redis_progress_monitor(list(key="jobs:clement"), redis=redis)

Redis Backends

redisProgress can handle several packages implementing redis client API

redis_client() creates a client object embedding the connection context, using a unified interface, regardless the redis package used. The connection parameters (host, port,...) will be propagated to the R workers



cturbelin/redisProgress documentation built on Dec. 10, 2019, 7:35 a.m.