Job: Constructs a job object from a Task and keyed data.table of...

View source: R/Flow.R

Job    R Documentation


Description

Constructs a job object from a Task and keyed data.table of one or more entities.

Job instantiation combines the Task configuration with job-specific information to create the $cmd, $bcmd, and $qcmd commands, which are used to run the job locally or submit it to LSF / SGE, with output written to task / entity-specific output directories. The job object can be polled to examine job status, and used to run or re-run specific jobs.

A Job object is instantiated from the combination of a text .task file containing the task configuration and a keyed data.table of entities (e.g. samples, pairs, individuals). Instantiation involves populating the task with all the relevant columns of the entities table.

Job instantiation requires a Task object or a path to a .task file, plus a keyed "entities" data.table containing all the input columns required by the task configuration. If any of these columns is missing, the Job object will fail to instantiate. The optional input rootdir specifies the root directory of all job-specific output directories. All job outputs are written to directories named by the task name / row key. These directories are created at the time of object instantiation. An .rds of the Job object is stored in a standard location in rootdir (with name TASKNAME.rds).
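The requirement for a keyed entities table can be illustrated with a minimal sketch. The column names (pair, tumor_bam), the file paths, the required-column list, and the example.task path are all hypothetical placeholders rather than anything defined by Flow; only data.table() and setkey() from data.table and the Job() signature from the Usage section are relied on. The Job() call is guarded so that the snippet runs even when Flow or the .task file is absent.

```r
library(data.table)

# Hypothetical entities table: one row per entity, with a unique id column.
entities <- data.table(
  pair      = c("sample1", "sample2"),
  tumor_bam = c("/path/to/sample1.bam", "/path/to/sample2.bam")
)

# Job() expects a keyed data.table; the key identifies each entity (row).
setkey(entities, pair)

# Hypothetical list of input columns that a task configuration might require.
required_cols <- c("tumor_bam")
missing_cols  <- setdiff(required_cols, names(entities))

# Attempt instantiation only if no columns are missing, Flow is installed,
# and the (hypothetical) .task file actually exists.
task_path <- "example.task"
if (length(missing_cols) == 0 &&
    requireNamespace("Flow", quietly = TRUE) &&
    file.exists(task_path)) {
  job <- Flow::Job(task_path, entities, rootdir = "./Flow/")
}
```

Checking for missing columns before calling Job() is optional; it simply surfaces the problem earlier than the instantiation failure described above.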

Usage

Job(
  task,
  entities,
  rootdir = "./Flow/",
  queue = as.character(NA),
  qos = as.character(NA),
  mem = NULL,
  nice = NULL,
  cores = 1,
  mock = FALSE,
  update_cores = 1,
  parse_recursive = FALSE,
  check.stamps = TRUE,
  time = "3-00",
  ...
)

Arguments

task

task config (.task file) or Task object

entities

keyed data.table of entities containing the annotations that the task draws from

rootdir

the root directory under which Task-specific output will be placed (default "./Flow/")

queue

which queue to direct jobs to (should be compatible with local LSF or SGE HPC)

mem

memory limit to put on jobs (in GB)

nice

whether to nice jobs when running locally (described default is TRUE, although the signature above passes nice = NULL)

cores

how many cores to run jobs with

mock

boolean; if TRUE, performs a mock instantiation that does not create subdirectories (default FALSE)
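As a rough illustration of how the resource arguments might be combined, the sketch below collects them in a list and passes them to Job() with do.call(). The argument names come from the Usage signature; the values, the queue name "short", the entity columns, and the example.task path are hypothetical. The call is guarded so that nothing is instantiated unless Flow, data.table, and the .task file are available.

```r
# Arguments named as in the Usage signature; the values are illustrative only.
job_args <- list(
  rootdir = "./Flow/",
  queue   = "short",  # hypothetical queue name; must exist on your LSF / SGE cluster
  mem     = 8,        # memory limit in GB
  cores   = 4,        # cores per job
  mock    = FALSE
)

task_path <- "example.task"  # hypothetical path to a task configuration
if (requireNamespace("Flow", quietly = TRUE) &&
    requireNamespace("data.table", quietly = TRUE) &&
    file.exists(task_path)) {
  # Hypothetical keyed entities table with one input column.
  entities <- data.table::data.table(
    pair      = c("sample1", "sample2"),
    tumor_bam = c("/path/to/sample1.bam", "/path/to/sample2.bam")
  )
  data.table::setkey(entities, pair)

  # task and entities are passed positionally, the rest by name.
  job <- do.call(Flow::Job, c(list(task_path, entities), job_args))
}
```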

Author(s)

Marcin Imielinski


mskilab/Flow documentation built on Jan. 12, 2023, 8:31 a.m.