Through the lifecycle of a task (assuming a queue name rrq, so all keys begin with rrq:):

1. Given an expr (some unevaluated R expression) and an envir (an environment used to locate local variables), the controller does some processing to create something that can be serialised into Redis (see below). It increments a counter (rrq:tasks:counter) to get the task id, writes the serialised expression into rrq:tasks:expr and the environment information into rrq:tasks:envir, sets the task status in rrq:tasks:status, records the submission time in rrq:tasks:time:sub and the key to notify on completion in rrq:tasks:complete, then pushes the task id onto the queue rrq:tasks:id.

2. A worker, blocking on rrq:tasks:id, pops the task id, updates its own status (rrq:workers:status), records the task it is now running (rrq:workers:task) and itself against the task (rrq:tasks:worker), records the start time (rrq:tasks:times:beg), and updates the task status (rrq:tasks:status). (Note that the task can be lost between steps 1 and 3, though it is easy enough to work out that this has happened because the task is still known to the system but is not associated with any worker.)

3. On completion, the worker writes the result into rrq:tasks:result, updates the task status (rrq:tasks:status), records the end time (rrq:tasks:times:end), updates its own status (rrq:workers:status), notifies on rrq:tasks:complete, and goes back to listening on rrq:tasks:id for new jobs.

Objects needed by a task are stored in the object cache under mangled names of the form <task_id>:<object_name>.
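The three steps above can be sketched as a sequence of key operations. The following is a language-agnostic simulation using an in-memory stand-in for Redis; the FakeRedis class, the worker name, and the status values (PENDING, RUNNING, COMPLETE, BUSY, IDLE) are illustrative assumptions, not rrq's actual API.

```python
# In-memory stand-in for the handful of Redis commands the lifecycle uses.
class FakeRedis:
    def __init__(self):
        self.kv = {}       # plain keys (counters)
        self.hashes = {}   # hash keys (per-task / per-worker fields)
        self.lists = {}    # list keys (queues)

    def incr(self, key):
        self.kv[key] = self.kv.get(key, 0) + 1
        return self.kv[key]

    def hset(self, key, field, value):
        self.hashes.setdefault(key, {})[field] = value

    def hget(self, key, field):
        return self.hashes.get(key, {}).get(field)

    def rpush(self, key, value):
        self.lists.setdefault(key, []).append(value)

    def lpop(self, key):
        q = self.lists.get(key, [])
        return q.pop(0) if q else None

r = FakeRedis()

# 1. Controller submits a task:
task_id = str(r.incr("rrq:tasks:counter"))
r.hset("rrq:tasks:expr", task_id, "serialised expression")
r.hset("rrq:tasks:envir", task_id, "serialised environment info")
r.hset("rrq:tasks:status", task_id, "PENDING")
r.rpush("rrq:tasks:id", task_id)

# 2. Worker picks it up (a real worker would block with BLPOP):
picked = r.lpop("rrq:tasks:id")
r.hset("rrq:workers:status", "worker-1", "BUSY")
r.hset("rrq:workers:task", "worker-1", picked)
r.hset("rrq:tasks:worker", picked, "worker-1")
r.hset("rrq:tasks:status", picked, "RUNNING")

# 3. Worker finishes and becomes available again:
r.hset("rrq:tasks:result", picked, "serialised result")
r.hset("rrq:tasks:status", picked, "COMPLETE")
r.hset("rrq:workers:status", "worker-1", "IDLE")
```

Note how the loss scenario described above is detectable here: if the worker died between popping the id and writing rrq:tasks:worker, the task would still exist in rrq:tasks:expr but have no worker recorded against it.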
The serialisation to Redis via the object cache uses a content-addressable system: objects are stored against keys that are the hash of the object contents, and a pointer is stored from the mangled name to the object. There is also a counter that records how many names point at a given piece of data; once that counter drops to zero the object is deleted. This means that regularly referenced large pieces of data need only be stored once.
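A minimal sketch of such a content-addressable cache with reference counting, using Python dicts in place of Redis; the internal names (data, refs, pointers) are illustrative, not rrq's actual key scheme.

```python
import hashlib
import pickle

class ObjectCache:
    def __init__(self):
        self.data = {}      # content hash -> serialised object
        self.refs = {}      # content hash -> number of names pointing at it
        self.pointers = {}  # mangled name -> content hash

    def set(self, name, obj):
        blob = pickle.dumps(obj)
        h = hashlib.sha1(blob).hexdigest()
        if h not in self.data:          # repeated large objects stored once
            self.data[h] = blob
        self.refs[h] = self.refs.get(h, 0) + 1
        self.pointers[name] = h

    def get(self, name):
        return pickle.loads(self.data[self.pointers[name]])

    def delete(self, name):
        h = self.pointers.pop(name)
        self.refs[h] -= 1
        if self.refs[h] == 0:           # last pointer gone: drop the data
            del self.data[h]
            del self.refs[h]

cache = ObjectCache()
big = list(range(1000))
cache.set("task1:x", big)
cache.set("task2:x", big)   # same contents: still only one stored copy
```

Here two mangled names point at one stored blob; deleting both names drops the refcount to zero and removes the data itself.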
On the receiving end, workers pull an object from the database only if they have not previously pulled an object with that hash. This avoids the bottleneck of repeatedly pulling the same data over the network and deserialising it.
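The worker-side deduplication can be sketched as a local cache keyed by content hash, where the remote fetch is skipped for hashes already seen; fetch_blob is a hypothetical stand-in for a Redis GET, not an rrq function.

```python
import hashlib
import pickle

local = {}  # content hash -> deserialised object, held by this worker

def fetch(h, fetch_blob):
    # Only hit the network (and deserialise) for hashes we have not seen.
    if h not in local:
        local[h] = pickle.loads(fetch_blob(h))
    return local[h]

# Simulated remote store, instrumented to count fetches:
blob = pickle.dumps({"a": 1})
h = hashlib.sha1(blob).hexdigest()
store = {h: blob}
calls = []

def fetch_blob(key):
    calls.append(key)
    return store[key]

fetch(h, fetch_blob)
fetch(h, fetch_blob)   # second request is served from the local cache
```

After both calls, the remote store has been hit exactly once.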