chowParallel: Calls the superNOVA pipeline, splitting input matrix into...

View source: R/chowParallel.R

chowParallel    R Documentation

Calls the superNOVA pipeline, splitting input matrix into multiple batch jobs on an HPC.

Description

Evaluates differential coexpression between two or more subgroups of samples in the data versus the global model, using multiple nodes in parallel in a batch environment.

Usage

chowParallel(inputMat, design, outputFile, compare = NULL,
  sigOutput = FALSE, sigThresh = 0.05, verbose = FALSE,
  corrType = "pearson", perBatch = 10, coresPerJob = 2,
  timePerJob = 60, memPerJob = 2000,
  batchConfig = system.file("config/batchConfig_Local.R", package =
  "superNOVA"), batchDir = "batchRegistry", batchWarningLevel = 0,
  batchSeed = 12345, maxRetries = 3, testJob = FALSE,
  chunkSize = 1)

Arguments

inputMat

The matrix (or data.frame) of values (e.g., gene expression values from an RNA-seq or microarray study) that you are interested in analyzing. The rownames of this matrix should correspond to the identifiers whose correlations and differential correlations you are interested in analyzing, while the columns should correspond to the rows of the design matrix and should be separable into the groups named in compare.

design

A standard model.matrix created design matrix. Rows correspond to samples and colnames refer to the names of the conditions that you are interested in analyzing. Only 0's or 1's are allowed in the design matrix. Please see vignettes for more information.
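As a minimal sketch of inputs that fit the two descriptions above, the following base R code builds a 0/1 design matrix with model.matrix and a toy input matrix whose columns line up with the design rows. The group names ("control", "treated"), the gene and sample identifiers, and the random values are all hypothetical and chosen only for illustration.

```r
# Toy data: 6 samples in two hypothetical groups, "control" and "treated".
group <- factor(c("control", "control", "control",
                  "treated", "treated", "treated"))

# "~ 0 + group" drops the intercept so each group gets its own 0/1 column.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Features (e.g., genes) in rows, samples in columns; values are random.
set.seed(1)
inputMat <- matrix(rnorm(4 * 6), nrow = 4,
                   dimnames = list(paste0("gene", 1:4),
                                   paste0("sample", 1:6)))
rownames(design) <- colnames(inputMat)

# Sanity checks matching the argument descriptions above.
stopifnot(all(design %in% c(0, 1)))
stopifnot(ncol(inputMat) == nrow(design))
```

With this layout, the column names of design ("control" and "treated") are the group names that would be passed to compare.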

outputFile

Location to save the output. Required.

compare

Vector of two character strings, each corresponding to one group name in the design matrix, that should be compared. Default = NULL.

sigOutput

Should we save the significant results in a separate file? Default = FALSE.

sigThresh

This numeric value specifies the p-value threshold at which a differential correlation p-value is deemed significant for differential correlation class calculation. Default = 0.05; investigators who prefer a different cutoff can raise or lower it as desired.

verbose

Option indicating whether the program should give more frequent updates about its operations. Default = FALSE.

corrType

The correlation type of the analysis, limited to "pearson", "spearman", or "bicor". Default = "pearson".

perBatch

Number of times to split the features of the input data into separate batches. A higher number creates a larger number of jobs, but may be less uniform. Default = 10.
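To illustrate the idea of splitting features into perBatch groups, here is a small base R sketch. It is not the package's actual splitting logic, which this page does not describe. The feature names and the assignment rule are assumptions made for illustration. It shows that when the number of features is not a multiple of perBatch, the batches cannot all be the same size.

```r
# Hypothetical feature identifiers; in practice these might be rownames(inputMat).
features <- paste0("gene", 1:25)
perBatch <- 10
n <- length(features)

# Assign each feature, in order, to one of perBatch contiguous groups.
batchId <- ceiling(seq_along(features) * perBatch / n)
batches <- split(features, batchId)

length(batches)   # number of batches
lengths(batches)  # batch sizes; here a mix of 2 and 3 features
```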

coresPerJob

Number of cores to use on each batch job run. Default = 2.

timePerJob

Walltime to request for each batch job (e.g., in an HPC cluster), in minutes. Default = 60.

memPerJob

Memory to request for each batch job (e.g., in an HPC cluster), in MB. Default = 2000.

batchConfig

Location of the batchtools configuration file (e.g., to configure this tool to work with your HPC cluster). Defaults to the configuration shipped with the package at inst/config/batchConfig_Local.R.

batchDir

Location to store temporary files, logs, and results of the batch run. This is the registry for the batchtools R package. Default = "batchRegistry".

batchWarningLevel

Warning level on remote nodes during chowCor calculation (equivalent to setting options(warn=batchWarningLevel)). Default = 0.

batchSeed

Random seed to use on all batch jobs. Default = 12345.

maxRetries

Number of times to re-submit jobs that failed. This is helpful for jobs that failed due to transient errors on an HPC. Default = 3.

testJob

Should one job be tested before the full run? Default = FALSE.

chunkSize

Number of splits to execute sequentially on each node. Default = 1 (no chunking).

Value

Returns whether all jobs executed successfully. The results themselves are written to outputFile.
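The following sketch shows how a call might look, using only arguments listed under Usage. The toy data, the group names, the output file name and extension, and the choice of perBatch = 2 are assumptions for illustration. The call is guarded so that it runs only if the superNOVA package is installed, and the remaining arguments are left at their defaults.

```r
# Toy inputs (hypothetical): 20 features x 6 samples in two groups.
set.seed(1)
group <- factor(rep(c("control", "treated"), each = 3))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
inputMat <- matrix(rnorm(20 * 6), nrow = 20,
                   dimnames = list(paste0("gene", 1:20),
                                   paste0("sample", 1:6)))

# Only attempt the call when the package is available.
if (requireNamespace("superNOVA", quietly = TRUE)) {
  ok <- superNOVA::chowParallel(
    inputMat   = inputMat,
    design     = design,
    # Hypothetical output location; the expected file format is not stated here.
    outputFile = file.path(tempdir(), "chow_results.txt"),
    compare    = c("control", "treated"),
    perBatch   = 2
  )
  # Per the Value section, this indicates whether all jobs succeeded.
  print(ok)
}
```

On a cluster, batchConfig, timePerJob, memPerJob, and coresPerJob would typically be set to match the scheduler's requirements rather than left at their defaults.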


ryananeff/superNOVA documentation built on March 29, 2024, 5:31 p.m.