MALDIquant-parallel | R Documentation |
MALDIquant offers multi-core support using mclapply and mcmapply. This
approach is limited to unix-based platforms.
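Because of this platform limitation, scripts that should also run on Windows need to fall back to a single core. The following is a minimal sketch of such a guard, using only the parallel package that ships with R. The helper name safeCores is hypothetical and is not part of MALDIquant.

```r
library(parallel)

## Hypothetical helper (not part of MALDIquant): return the requested number
## of cores on unix-like platforms and fall back to one core elsewhere.
safeCores <- function(requested=2L) {
  if (.Platform$OS.type == "unix") {
    as.integer(requested)
  } else {
    1L
  }
}

cores <- safeCores(2L)

## mclapply accepts mc.cores=1 on every platform, so this call is portable.
squares <- mclapply(1:4, function(i) i^2, mc.cores=cores)
```

The resulting value could then be passed on as the mc.cores argument of the functions listed further down.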
Please note that not all functions benefit from parallelisation. Often the
overhead of creating and copying objects outweighs the time saved by running in
parallel. This is especially true for functions that are very fast to compute
(e.g. the sqrt-transformation). That is why the default value of the mc.cores
argument in all functions is 1L.
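The overhead effect can be observed without any mass spectrometry data. The sketch below compares lapply and mclapply for a very cheap operation on toy vectors. The data are random numbers standing in for intensity values (an assumption for illustration only), and the timings will differ between machines.

```r
library(parallel)

## Forking is only available on unix-like systems; use one core elsewhere.
cores <- if (.Platform$OS.type == "unix") 2L else 1L

## Toy data standing in for intensity vectors (not real spectra).
set.seed(1)
x <- replicate(100, runif(1000), simplify=FALSE)

## A very cheap per-element operation, comparable to a sqrt-transformation.
tSerial <- system.time(s1 <- lapply(x, sqrt))
tParallel <- system.time(s2 <- mclapply(x, sqrt, mc.cores=cores))

## Both approaches return the same result; only the run time differs.
stopifnot(isTRUE(all.equal(s1, s2)))
print(c(serial=tSerial[["elapsed"]], parallel=tParallel[["elapsed"]]))
```

For work this small the parallel run is typically no faster, and may be slower, because starting the worker processes and collecting their results costs more than the computation itself.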
Which steps benefit from parallelisation depends on the size of the dataset
(often only removeBaseline and detectPeaks do).
In general it is faster to encapsulate the complete workflow in a function
and parallelise it using mclapply instead of using the mc.cores argument of
each method. The reason is the reduced overhead for object management (only
one split/combine is needed instead of repeating these operations in every
function).
The following functions/methods support the mc.cores argument:
trim,list,numeric-method
transformIntensity,list-method
smoothIntensity,list-method
removeBaseline,list-method
calibrateIntensity,list-method
detectPeaks,list-method
alignSpectra
averageMassSpectra
mergeMassPeaks
See also: mclapply, mcmapply
## Not run:
## load package
library("MALDIquant")
## load example data
data("fiedler2009subset", package="MALDIquant")
## run single-core baseline correction
print(system.time(
  b1 <- removeBaseline(fiedler2009subset, method="SNIP")
))
if (.Platform$OS.type == "unix") {
  ## run multi-core baseline correction
  print(system.time(
    b2 <- removeBaseline(fiedler2009subset, method="SNIP", mc.cores=2)
  ))
  stopifnot(all.equal(b1, b2))
}
## parallelise complete workflow
workflow <- function(spectra, cores) {
  s <- transformIntensity(spectra, method="sqrt", mc.cores=cores)
  s <- smoothIntensity(s, method="SavitzkyGolay", halfWindowSize=10,
                       mc.cores=cores)
  s <- removeBaseline(s, method="SNIP", iterations=100, mc.cores=cores)
  s <- calibrateIntensity(s, method="TIC", mc.cores=cores)
  detectPeaks(s, method="MAD", halfWindowSize=20, SNR=2, mc.cores=cores)
}
if (.Platform$OS.type == "unix") {
  ## parallelising the complete workflow is often faster because the overhead
  ## is reduced
  print(system.time(
    p1 <- unlist(parallel::mclapply(fiedler2009subset,
                                    function(x) workflow(list(x), cores=1),
                                    mc.cores=2), use.names=FALSE)
  ))
  print(system.time(
    p2 <- workflow(fiedler2009subset, cores=2)
  ))
  stopifnot(all.equal(p1, p2))
}
## End(Not run)