The pipeComp framework



r Rpackage("pipeComp") is a simple framework to facilitate the comparison of pipelines involving various steps and parameters. It was initially developed to benchmark single-cell RNA sequencing (scRNAseq) pipelines:

pipeComp, a general framework for the evaluation of computational pipelines, reveals performant single-cell RNA-seq preprocessing tools
Pierre-Luc Germain, Anthony Sonrel & Mark D Robinson, Genome Biology 2020, doi: 10.1186/s13059-020-02136-7.

However the framework can be applied to any other context. This vignette introduces the package and framework; for information specifically about the scRNAseq pipeline and evaluation metrics, see the pipeComp_scRNA vignette. For a completely different example, with walkthrough the creating of a new PipelineDefinition, see the pipeComp_dea vignette.


Install using:


Because r Rpackage("pipeComp") was meant as a general pipeline benchmarking framework, we have tried to restrict the package's dependencies to a minimum. To use the scRNA-seq pipeline and wrappers, however, requires further packages to be installed. To check whether these dependencies are met for a given PipelineDefinition and set of alternatives, see ?checkPipelinePackages.

pipeComp overview

knitr::include_graphics(system.file('docs', 'pipeComp_scheme.png', package = 'pipeComp'),)

r Rpackage("pipeComp") is built around the S4 class PipelineDefinition which defines a basic abstract workflow with different steps (which can have any number of parameters, including subroutines), as depicted in Figure 1A. When a pipeline is executed, each step is consecutively applied to a starting object. Importantly, the step functions are not related to specific methods, but should be generic, whereas the methods to be compared are passed as parameters (e.g. as the name of wrapper functions defined separately from the pipeline). Often, the step function will simply apply the wrapper to the object and do no more. In addition, each step can optionally have evaluation functions, which take the output of the step as an input, and outputs evaluation metrics. Finally, functions can be specified to aggregate these metrics across multiple datasets (in case the evalation output isn't a data.frame or list of data.frames).

To run a benchmark one in addition needs a set of alternative values to the parameters. For example, alternative values for the parameters depicted in Figure 1 could be defined as follows:

alternatives <- list(
  par1=c("function_A", "function_B"),
  # ...

Each parameter (slot of the list) can take any number of alternative scalar values (e.g. character, numeric, logical). The name of functions, like function_A and function_B (which should be loaded in the environment), can be passed if the PipelineDefinition allows it. This means that wrappers should be created for distinct methods, and the name of the wrapper functions passed as alternative parameter values.

Given a PipelineDefinition, a list of alternative parameters and a list benchmark datasets, the runPipeline (Figure 1B) function proceeds through all combinations arguments, avoiding recomputing the same step (with the same parameters) twice and compiling evaluations on the fly to avoid storing potentially large intermediate data:

res <- runPipeline(datasets, alternatives, pipelineDef=PipelineDefinition)

Aggregated evaluation metrics for each combination of parameters, along with computing times, are returned as \Rclass{SimpleList}s with the following structure:

In addition to the (aggregated) output returned by the function, runPipeline will produce at least one RDS file (saved according to the output.prefix argument) per dataset: the *.evaluation.rds contain the (non-aggregated) evaluation results at each step, while the .endOutputs.rds files (assuming saveEndResults=TRUE) contain the final output of each combination (i.e. the output of the final step).

The r Rpackage("pipeComp") package includes a PipelineDefinition for single-cell RNA sequencing (scRNAseq) data. For more information about this application and examples of real outputs, see the pipeComp_scRNA vignette.

Running only a subset of the combinations

Rather than running all possible combinations of parameters, one can run only a subset of them through the comb parameter of runPipeline. The parameter accepts either a matrix (of argument indices) or data.frame (of factors) which can be built manually, but the simplest way is to first create all combinations, and then get rid of the undesired ones:

comb <- buildCombMatrix(alternatives)

And then we could remove some combinations before passing the argument to runPipeline:

comb <- comb[ (comb$par1 != "function_A" | comb$par2 == 2) ,]
res <- runPipeline( datasets, alternatives, pipelineDef=PipelineDefinition, 
                    nthreads=3, comb=comb )

Dealing with the PipelineDefinition object

Creating a PipelineDefinition

The PipelineDefinition object is, minimally, a set of functions consecutively executed on the output of the previous one, and optionally accompanied by evaluation and aggregation functions. A simple pipeline can be constructed as follows:

my_pip <- PipelineDefinition( list( 
  step1=function(x, param1){
    # do something with x and param1
  step2=function(x, method1, param2){
    get(method1)(x, param2) # apply method1 to x, with param2
  step3=function(x, param3){
    x <- some_fancy_function(x, param3)
    # the functions can also output evaluation through the `intermediate_return` slot:
    e <- my_evaluation_function(x)
    list( x=x, intermediate_return=e)

The PipelineDefinition can also include descriptions of each step or evaluation and aggregation functions. For example:

my_pip <- PipelineDefinition( 
  list( step1=function(x, meth1){ get(meth1)(x) },
        step2=function(x, meth2){ get(meth2)(x) } ),
  evaluation=list( step2=function(x) c(mean=mean(x), max=max(x)) ),
  description=list( step1="This steps applies meth1 to x.",
                    step2="This steps applies meth2 to x.")

Running it with dummy data and functions:

datasets <- list( ds1=1:3, ds2=c(5,10,15))
alternatives <- list(meth1=c("log","sqrt"), meth2="cumsum")
tmpdir1 <- paste0(tempdir(),"/")
res <- runPipeline(datasets, alternatives, my_pip, output.prefix=tmpdir1)

Computing times can be accessed through res$elapsed; they can either be accessed as the pipeline total for each combination, or in a step-wise fashion. They can also be plotted using plotElapsed (see below). Evaluation results can be accessed through res$evaluation and plotted with evalHeatmap (see below).

Manipulating a PipelineDefinition

A number of generic methods are implemented for PipelineDefinition objects, including show, names, length, [, as.list. This means that, for instance, a step can be removed from a pipeline in the following way:


Steps can be added using the addPipelineStep function:

pip2 <- addPipelineStep(my_pip, name="newstep", after="step1")

Functions for the new step can be specified through the slots argument of addPipelineStep or afterwards through stepFn:

stepFn(pip2, "newstep", type="function") <- function(x, newparam){
  do_something(x, newparam)

Finally, the arguments() method can be used to extract the arguments for each step, and the defaultArguments methods can be used to get or set the default arguments.

Merging results of different runPipeline calls

The previous call to runPipeline produced one evaluation file for each dataset:

list.files(tmpdir1, pattern="evaluation\\.rds")

Their aggregation could be repeated manually by reading those files and calling aggregatePipelineResults:

ds <- list.files(tmpdir1, pattern="evaluation\\.rds", full.names = TRUE)
names(ds) <- basename(ds)
res1 <- readPipelineResults(resfiles=ds)
res <- aggregatePipelineResults(res1)

In addition, the results of different calls to runPipeline (with partially different datasets and/or parameter combinations) can be merged together, provided that the PipelineDefinition is the same. To illustrate this, we first make another runPipeline call using slightly different alternative parameter values:

alternatives <- list(meth1=c("log2","sqrt"), meth2="cumsum")
tmpdir2 <- paste0(tempdir(),"/")
res <- runPipeline(datasets, alternatives, my_pip, output.prefix=tmpdir2)

We then load the (non-aggregated) results of each run from the files, and merge them using mergePipelineResults:

res1 <- readPipelineResults(tmpdir1)
res2 <- readPipelineResults(tmpdir2)
res <- mergePipelineResults(res1,res2)

We can then aggregate the results as we do for a single run:

res <- aggregatePipelineResults(res)

Handling errors


Errors typically come from the method wrappers, and debugging them from within runPipeline can be complicated by multithreading. For this reason runPipeline wraps additional information around error messages. This would be an example of the error report:

Error in dataset `mydataset` with parameters:
filt=lenient, norm=norm.test, k=20
in step `normalization`, evaluating command:
`pipDef[["normalization"]](x = x, norm = "norm.test")`
Error in get(norm)(x): not enough cells!

The offending dataset, pipeline step and exact set of parameters is reported, enabling you to try to repeat the error outside, where it is easier to debug.

Skipping errors and fixing them afterwards

A useful behavior is for runPipeline to continue when encountering errors, so that successful results are not lost. This behavior can be enabled with the skipErrors=TRUE argument. We can illustrate this with our above dummy pipeline, by adding an alternative that is bound to give rise to an error: <- function(x){
  if(any(x>10)) stop("Too big!")
alternatives$meth1 <- c(alternatives$meth1, "")
res <- runPipeline(datasets, alternatives, my_pip, output.prefix=tmpdir1, skipErrors=TRUE)

As indicated in the error message, only the analyses successful across all datasets were aggregated:


However, all successful results (e.g. including '' on 'ds1') were saved in the non-aggregated files. This means that one can then correct the problematic function, re-run only the analyses that failed (i.e. only on 'ds2' in this case), and merge the results with the first batch (see ?mergePipelineResults).

Plotting results

r Rpackage("pipeComp") has two general plotting functions which can be used with any PipelineDefinition: evalHeatmap and plotElapsed. In addition, the package includes pipeline-specific plotting functions (see the pipeComp_scRNA and pipeComp_dea vignettes).

Running times

Running times can be plotted with the plotElapsed function:


In this case we show all tested combinations, but in most cases the argument will be used to aggregate the results by parameters of interest.

Evaluation results can be plotted through the evalHeatmap function:

evalHeatmap(res, what=c("mean", "max"))

By default, the heatmap prints the actual metric values, while the colors is mapped to scaled data. Specifically, pipeComp map colors to the signed square-root of the number of (matrix-wise) Median Absolute Deviations from the (column-wise) median. We found this mapping to be particularly useful because differences in color have comparable meaning across datasets, and the color-scale is not primarily capturing outliers or baseline differences between datasets. This behavior can be changed with the scale parameter of the function. For more complex examples of using evalHeatmap, see its usage in the pipeComp_scRNA vignette, or refer to the function's help.

For a more complex, 'real-life' example of the creation of a PipelineDefinition, see the pipeComp_dea vignette. For a complex example of evaluation outputs, see the pipeComp_scRNA vignette.

Try the pipeComp package in your browser

Any scripts or data that you put into this service are public.

pipeComp documentation built on Nov. 8, 2020, 7:35 p.m.