Job: Constructs a job object from a Task and keyed data.table of...
In mskilab/Flow: Workflow and task management for genomics pipelines.

View source: R/Flow.R

Job	R Documentation

Description

Constructs a job object from a Task and keyed data.table of one or more entities.

Job instantiation combines the Task configuration with job-specific information to create $cmd, $bcmd, and $qcmd, which are used to run the job locally or submit it to LSF / SGE, with output going to task- and entity-specific output directories. The Job object can be polled to examine job status, and used to run or re-run specific jobs.

A Job object is instantiated from the combination of a text .task file containing the task configuration and a keyed data.table of entities (e.g. samples, pairs, individuals). Instantiation involves populating the task with all the relevant columns of the entities table.

Job instantiation requires a Task object or the path to a .task file as input, together with a keyed "entities" data.table containing all the input columns required by the task configuration. If any of these columns do not exist, the Job object will fail to be instantiated. The optional input rootdir specifies the root directory of all job-specific output directories. All outputs for jobs are written to directories named by the task name / row key. These directories are created at the time of object instantiation. An .rds of the Job object is stored in a standard location in the rootdir (with name TASKNAME.rds).
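
The column requirement and directory naming described above can be illustrated with a minimal sketch that uses only data.table and does not call Flow itself. The entity names, file paths, the `required_cols` vector, the task name `MyTask`, and the exact `rootdir/task name/row key` layout are assumptions made for illustration; the real checks and paths are determined by the Task configuration and by Job internally.

```r
library(data.table)

# Hypothetical entities table: one row per tumor/normal pair.
entities <- data.table(
  pair       = c("sample1", "sample2"),
  tumor_bam  = c("/data/s1.tumor.bam", "/data/s2.tumor.bam"),
  normal_bam = c("/data/s1.normal.bam", "/data/s2.normal.bam")
)

# Job expects a keyed table; the key column identifies each entity.
setkey(entities, pair)

# Hypothetical set of input columns that a task configuration might require.
required_cols <- c("tumor_bam", "normal_bam")

# Report required columns that are absent; instantiation would fail on these.
missing_cols <- setdiff(required_cols, names(entities))
if (length(missing_cols) > 0) {
  stop("entities is missing required columns: ",
       paste(missing_cols, collapse = ", "))
}

# Assumed layout: one output directory per entity, named by task name / row key.
task_name <- "MyTask"
rootdir   <- "./Flow"
outdirs   <- file.path(rootdir, task_name, entities[[key(entities)]])
print(outdirs)
```

Running a check like this before instantiation gives a clearer error message about which annotation column is missing from the entities table.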

Usage

Job(
  task,
  entities,
  rootdir = "./Flow/",
  queue = NA_character_,
  qos = NA_character_,
  gres = NA_character_,
  mem = NA_character_,
  nice = NULL,
  nice_val = 10,
  cores = 1,
  mock = FALSE,
  update_cores = 1,
  parse_recursive = FALSE,
  check.stamps = TRUE,
  io_c = 2,
  io_n = 4,
  qprior = 0,
  time = "3-00",
  ...
)
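
As a hedged illustration of the signature above, the sketch below builds a small keyed entities table and passes it to Job together with a task file. The path `tasks/MyTask.task`, the entity names, and the `tumor_bam` column are hypothetical placeholders; the call is guarded so that it only runs when the Flow package is installed and the task file exists.

```r
library(data.table)

# Hypothetical entities table, keyed by the column that identifies each entity.
entities <- data.table(
  pair      = c("sample1", "sample2"),
  tumor_bam = c("/data/s1.tumor.bam", "/data/s2.tumor.bam")
)
setkey(entities, pair)

# Hypothetical path to a task configuration file.
task_path <- "tasks/MyTask.task"

# Only attempt instantiation when Flow is installed and the task file exists.
if (requireNamespace("Flow", quietly = TRUE) && file.exists(task_path)) {
  jobs <- Flow::Job(
    task     = task_path,  # a Task object could be passed here instead
    entities = entities,
    rootdir  = "./Flow/",  # same as the documented default
    cores    = 2           # cores to run each job with
  )
}
```

Arguments left out of the call keep the defaults listed in the usage block.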

Arguments

`task`	task config (.task file) or Task object
`entities`	keyed data.table of entities containing the annotations that the task draws from
`rootdir`	the root directory under which Task specific output will be placed (default ./Flow)
`queue`	which queue to direct jobs to (should be compatible with local LSF or SGE HPC)
`mem`	memory limit to put on jobs (in GB)
`nice`	whether to nice jobs when running locally (default = TRUE)
`cores`	how many cores to run jobs with
`mock`	boolean, if TRUE will not create subdirectories (default FALSE)