
# Overview of Motus Data Processing

The hardware server (currently sgdata.motus.org) that processes raw files from receivers and generates runs of tag detections hosts several software servers; these are R applications built on the motusServer package.

## The Jobs Database

This is an SQLite database that tracks (re)processing of uploaded, synced, or archived data. The main table is `jobs`, with this schema:

```sql
CREATE TABLE jobs (
    id INTEGER UNIQUE PRIMARY KEY NOT NULL, -- unique job id
    pid INTEGER REFERENCES jobs (id),       -- id of parent job; null if this is a "top-level" job
    stump INTEGER REFERENCES jobs (id),     -- id of top-level job, i.e. ultimate ancestor of this job;
                                            -- equal to `id` if `pid` is null
    ctime FLOAT(53),                        -- timestamp of job creation
    mtime FLOAT(53),                        -- timestamp of latest change to job information
    type TEXT,                              -- short string giving type of job; if type is 'abcdEfg', the job
                                            -- will be handled by a function called 'handleAbcdEfg'
    done INTEGER,                           -- status code: 0 = not completed (maybe not started);
                                            -- 1 = completed successfully; < 0 = error
    queue TEXT,                             -- usually a small integer; queue in which job resides, indicating
                                            -- which running processServer instance is processing or has processed this job
    path TEXT,                              -- filesystem path to job folder, which holds archives or data files
                                            -- used for this job; null if none
    oldpath TEXT,                           -- filesystem path to previous location of job folder; permits recovery in
                                            -- case of a crash between the attempt to move the job and recording that move in the DB
    data JSON,                              -- parameters, logs, and product pointers for this job, as a JSON-encoded
                                            -- object; names of fields generated by the job end in `_`
    motusUserID,                            -- integer; motus ID of user who launched this job; only non-null in top-level jobs
    motusProjectID                          -- integer; motus ID of project (selected by user at upload time) which will
                                            -- own the outputs from this job; only non-null in top-level jobs
)
```
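As noted in the `type` comment above, handler dispatch works by naming convention: a job of type 'abcdEfg' is handled by a function called 'handleAbcdEfg'. A minimal R sketch of that convention (the helper `handlerFor` is illustrative, not part of motusServer):

```r
## Map a job type to its handler name by capitalizing the first letter
## and prefixing "handle"; e.g. "abcdEfg" -> "handleAbcdEfg".
handlerFor <- function(type) {
  paste0("handle", toupper(substring(type, 1, 1)), substring(type, 2))
}

handlerFor("newFiles")   # "handleNewFiles"
```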

R accesses this database via an S3 class called Copse, a simple database-backed object interface. Jobs are represented as Twigs in the Copse, with a tree structure (subjobs within jobs) and arbitrary data fields for parameters and output products.

Writing to the R objects makes immediate changes to the fields in the jobs table, and reading from the R objects returns the most recently written values.
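As a rough illustration of that write-through behaviour (a hypothetical sketch, not the exact Copse API; the job object `j` is invented for the example):

```r
## Hypothetical sketch: field access on a Twig goes straight to the DB.
j$done <- 1        # immediately UPDATEs this job's row in the `jobs` table
d <- j$done        # re-reads the current value from the DB, not a cached copy
```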

## Top-level Jobs

A top-level job is created by one of these events:

- a user uploads an archive of raw receiver files to be processed
- the data server polls an attached receiver for new data (typically hourly)
- an admin requests a re-run of some portion of archived raw files

Each top-level job creates subjobs that perform chunks of the processing. These chunks were chosen somewhat arbitrarily, but with these goals:

- if a chunk fails, it should leave the DB and filesystem in a state where the chunk can be retried, in case a bug is fixed
- if processing is interrupted during a chunk (e.g. power outage, system crash, fatal bug), retrying it should work
- chunks should be conceptually independent, to the extent possible
- chunks that require locking objects (such as receiver databases) should be as small as possible
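To make the chunking concrete, here is a purely illustrative sketch of a top-level handler splitting its work into subjobs; the handler name, subjob type names, and the `newSubJob` helper are all hypothetical:

```r
## Hypothetical sketch: a top-level handler splits processing into
## independently retryable subjobs (all names are illustrative only).
handleUploadFile <- function(j) {
  newSubJob(j, "unpackArchive")    # each chunk is its own job record,
  newSubJob(j, "findTags")         # so a failed chunk can be retried
  newSubJob(j, "exportProducts")   # without redoing earlier chunks
}
```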

## Processing Queues

Top-level jobs are created in either the regular queue `/sgm/queue/0` or the priority queue `/sgm/priority`. By default, uploads go into the former and sync jobs into the latter. Jobs (re)submitted by admin users can be forced into either queue.

From these two top-level queues, one of the processServers claims the job. We've typically been running four normal processServers that claim jobs from queue 0, and two 'high-priority' processServers that claim jobs from the priority queue. There is nothing different about the high-priority servers except the queue from which they are fed; they are intended to allow low-latency processing of data from attached receivers, which arrive frequently and in small quantities. Upload jobs, which might involve very large amounts of data and so take a long time to process, run on the normal processServers so as not to disrupt that low-latency processing.

Once a top-level job has entered a queue, any subjobs it generates are automatically added to the same queue.

## Filesystem Storage of Jobs

Top-level jobs are represented in the filesystem by a folder whose name is the jobID left-padded with zeroes, e.g. `00000001`. Currently, the numbers are padded to 8 digits, allowing for 100 M jobs; that could be changed if needed.
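For example, the folder name for a given job ID can be produced with a fixed-width zero-padded format (a trivial R illustration, not necessarily the exact call motusServer uses):

```r
sprintf("%08d", 12345)   # "00012345": the folder name for job 12345
```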

Jobs begin life in one of the two input queues described above, `/sgm/queue/0` or `/sgm/priority`.

So, e.g., a new upload might begin with the folder `/sgm/queue/0/00012345` containing the uploaded file.

When a processServer is available, it looks at its input queue and claims the first job it finds there, waiting if there are none. (Really, it does blocking reads from a pipe connected to an instance of inotifywait, which watches a folder for file creation or move events.)
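In outline, the claim loop might look like this (a minimal sketch of the mechanism just described, not the actual motusServer code; the inotifywait options shown are assumptions):

```r
## Blocking read from a pipe attached to inotifywait, which reports
## create/move events in the input queue folder.
evt <- pipe("inotifywait -q -m -e create -e moved_to --format %f /sgm/queue/0", "r")
repeat {
  entry <- readLines(evt, n = 1)   # blocks until a job folder appears
  ## ... claim the job by moving its folder to this server's processing queue ...
}
```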

The job is then moved to the claiming processServer's own processing queue.

When a processServer is started, it first checks its processing queue for any unfinished jobs (`done == 0`) and runs those before looking at its input queue. This allows resumption of jobs interrupted by a server outage.
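Given the schema above, that restart check amounts to a query like the following (a hedged sketch using DBI/RSQLite; the database path and queue number are assumptions, not motusServer's actual code):

```r
library(DBI)

## Open the jobs database (the path here is an assumption for this example).
con <- dbConnect(RSQLite::SQLite(), "/sgm/jobs.sqlite")

## Find this server's unfinished jobs: done == 0 in its own queue.
unfinished <- dbGetQuery(con,
  "SELECT id, type, path FROM jobs WHERE queue = :q AND done = 0 ORDER BY id",
  params = list(q = "4"))   # e.g. for processServer number 4

dbDisconnect(con)
```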

When a job (and all of its subjobs) has completed, its folder is moved to `/sgm/done`. This is currently a flat folder, but it needs to be re-organized hierarchically to properly support huge numbers of jobs.

Any job (including a subjob) that ends in an error has its stack dump recorded in `/sgm/errors` as an `.rds` file, e.g. `/sgm/errors/00001270.rds`. This file can be examined within R by doing:

```
> library(motusServer)
> hackError(1270, topLevel=FALSE)
With:
 Error in pushToMotus(src): invalid motus device ID for receiver with DB at /mnt/usb/new_sgm_recv/SG-0613BB000593.motus

Traceback (also is in variable bt):
bt[[3]]: h(j)

bt[[2]]: pushToMotus(src)

bt[[1]]: stop("invalid motus device ID for receiver with DB at ", attr(src$con, "dbn

## the bt list holds environments with variables at each level of the stack dump
> ls(bt[[2]])
[1] "batches"    "con"        "deviceID"   "motusTX"    "newBatches"
[6] "sql"        "src"
> bt[[2]]$newBatches
# A tibble: 1 x 10
  batchID motusDeviceID monoBN         tsStart      tsEnd numHits
    <int>         <int>  <int>           <dbl>      <dbl>   <int>
1       1            NA      8 1370809071.1776 1372964178       0
# ... with 4 more variables: ts <dbl>, motusUserID <int>, motusProjectID <int>,
#   motusJobID <int>
```

Not all variables in the stack dump environments will be valid; e.g. database and file connections will not be.


