run_PBSpro: Submit array jobs to PBSpro clusters.

View source: R/pbspro.R

run_PBSpro R Documentation

Submit array jobs to PBSpro clusters.

Description

This function submits array jobs to PBSpro clusters.

The input must be a function that can carry out a full computation on its own, plus a data.frame in which each row holds the inputs that the function expects. The data.frame is written to a file, and the function is wrapped in an automatically generated R script that reads its inputs from the command line. A PBSpro submission script is generated for a bash shell. The function can also run the PBSpro command 'qsub' to submit the job, or just generate the required files and tell the user how to submit the job from the shell. PBSpro parameters are provided as a list; modules and custom filenames for the generated scripts can be provided as well.
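
The file-plus-wrapper mechanism described above can be sketched in plain R. This is a minimal illustration under assumptions, not the script that the package actually generates: the names task_fun and task_id, the comma separator and the temporary file are all hypothetical, and the task index is fixed here where a real job would receive it from the scheduler or the command line.

```r
# Hypothetical stand-in for FUN: callable with the values of one row of PARAMS.
task_fun <- function(x, y) x + y

# One row per array task.
PARAMS <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# Dump the inputs without header or row names, as described for input_file.
# The comma separator is an assumption made for this sketch.
input_file <- tempfile(fileext = ".csv")
write.table(PARAMS, file = input_file, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Inside one task: read the table back and pick the row for this task.
# A real job would take the index from commandArgs(trailingOnly = TRUE)
# or from the scheduler; it is hard-coded here for illustration.
task_id <- 2
inputs <- read.table(input_file, sep = ",", header = FALSE)
args <- unname(as.list(inputs[task_id, ]))

# Call the function positionally with the values of that row.
result <- do.call(task_fun, args)
print(result)
```

Because the file carries no header, the sketch passes the values positionally, so the column order of PARAMS has to match the argument order of the function.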

Usage

run_PBSpro(
  FUN,
  PARAMS,
  QSUB_config = default_QSUB_config(),
  modules = c("R/4.1.0"),
  extra_commands = NULL,
  input_file = "EASYPAR_PBSpro_input_jobarray.csv",
  R_script = "EASYPAR_PBSpro_Run.R",
  Submission_script = "EASYPAR_PBSpro_submission.sh",
  output_folder = ".",
  per_task = 1,
  N_simultaneous_jobs = NULL,
  run = FALSE
)

Arguments

FUN

A function that takes any arguments as input and performs a computation. This function should be runnable as a standalone R script.

PARAMS

A data.frame where each row represents inputs for FUN. An array job with as many tasks as PARAMS has rows is generated.
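
Since each row of PARAMS corresponds to one task, a parameter grid is a convenient way to build it. The following is a small sketch using base R's expand.grid; the parameter names n and rate are made up for illustration.

```r
# Every combination of the two hypothetical parameters becomes one row.
PARAMS <- expand.grid(n = c(100, 1000), rate = c(0.1, 0.5, 0.9))

# The number of rows is the number of tasks the array job would contain.
n_tasks <- nrow(PARAMS)
print(n_tasks)
```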

QSUB_config

A list of QSUB commands for the PBSpro cluster. The default is obtained from a call to default_QSUB_config(). The queue and the project ID should always be provided, as they are cluster-specific; with the default values, submitting the job will produce errors.

modules

A list of modules that will be added as dependencies of the PBSpro submission script. For instance, modules = 'R/3.5.0' will generate the dependency for a specific R version as "module load R/3.5.0".
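
The mapping from module names to "module load" lines can be sketched as follows. This only illustrates the idea; the exact layout of the generated submission script may differ, and the second module name is a hypothetical example.

```r
# "gcc/9.3.0" is a made-up module name used only for illustration.
modules <- c("R/4.1.0", "gcc/9.3.0")

# One "module load" line per requested module.
module_lines <- paste("module load", modules)
writeLines(module_lines)
```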

extra_commands

An extra set of commands that will be executed in the submission script right after the module declarations.

input_file

The name of the data.frame input file that is generated from PARAMS. This file contains no header, and no row names.

R_script

The name of the R script file that contains the definition of FUN, and some other autogenerated R code to call the function with input parameters from the command line. Function FUN is given a fake name in this script.

Submission_script

The name of the PBSpro script file that contains the submission routines.

output_folder

The output of this function will be sent to this folder.

run

If 'TRUE', the function will attempt to invoke 'QSUB' and submit the array jobs. Otherwise it will print to screen the instructions to run the job manually through the console.

Value

Nothing. This function just generates the files required to submit an array job to a PBSpro cluster. If requested, it also attempts to submit the jobs.

Note

The queue and the project ID in 'QSUB_config' should always be provided, as they are cluster-specific; the default values will produce errors when the job is submitted. In addition, we have found that automatic job submission can sometimes fail with 'command not found' errors. Manual submission is generally the safest way to submit PBSpro jobs.

See Also

See default_QSUB_config, which generates the default parameters for PBSpro jobs.

Examples

# very simple example function
FUN = function(x, y){ print(c(x, y)) }

# input for 25 array jobs
PARAMS = data.frame(x = runif(25), y = runif(25))

## Not run: 
# call - not run since it's cluster-specific
run_PBSpro(FUN, PARAMS)

## End(Not run)

caravagn/easypar documentation built on June 4, 2022, 4:25 a.m.