processDatasetsInParallel: Process Datasets in Parallel

View source: R/processFile.R

processDatasetsInParallel    R Documentation

Process Datasets in Parallel

Description

Convenience function to run a simulation study in parallel on a single machine.

Usage

processDatasetsInParallel(
  datasets,
  path,
  baseFilename,
  fittingFunctions,
  chunkSize,
  saveFitted = FALSE,
  checkProcessed = FALSE,
  createMinimalSaveFile = FALSE,
  ncores = 1,
  clusterType = "PSOCK",
  ...
)

Arguments

datasets

dataset list generated by one of the generate functions.

path

path to save the datasets to.

baseFilename

filename to use, without extension.

fittingFunctions

vector of fitDatasets functions that should be applied to each dataset.

chunkSize

number of datasets to process together in a single job.

saveFitted

logical, if true, the raw fits are also stored.

checkProcessed

logical, if true, will check whether the contents of the processed output are reproduced for the first dataset. This is useful to ensure that everything is still working as expected without having to re-run the whole simulation study.

createMinimalSaveFile

logical, if true, will create a file with the processed results of the first three datasets. This is helpful if one wants to store only the final aggregated results but still wants to make sure that the full code works as expected.

ncores

number of cores to use in processing. If set to 1, datasets are processed in the current R session. Use detectCores to find out how many cores are available on your machine.

clusterType

type of cluster to be created, passed to makeCluster.

...

passed on to processFit. Use this to control what to save.
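
The interplay of chunkSize, ncores and clusterType can be illustrated with base R's parallel package alone. The following is a minimal sketch of the general pattern (split into chunks, process each chunk on a worker, merge), not the package's actual implementation. The names datasetIndex and processChunk are hypothetical stand-ins for the dataset list and for the fitting and processing of one chunk.

```r
library(parallel)

# Hypothetical stand-ins, not part of robustlmm:
# ten dataset indices to be processed in chunks of at most four.
datasetIndex <- seq_len(10)
chunkSize <- 4

# Split the indices into consecutive chunks of at most chunkSize elements.
chunks <- split(datasetIndex, ceiling(datasetIndex / chunkSize))

# Hypothetical placeholder for "fit and process all datasets in one chunk".
processChunk <- function(idx) sum(idx)

# Use at most two cores here; detectCores() may return NA on some platforms.
ncores <- min(2L, detectCores(), na.rm = TRUE)

if (ncores > 1) {
  # Create the cluster, run one job per chunk, and always stop the cluster.
  cl <- makeCluster(ncores, type = "PSOCK")
  chunkResults <- tryCatch(
    parLapply(cl, chunks, processChunk),
    finally = stopCluster(cl)
  )
} else {
  # With a single core, process the chunks in the current R session.
  chunkResults <- lapply(chunks, processChunk)
}

# Merge the per-chunk results into one object.
merged <- unlist(chunkResults, use.names = FALSE)
```

A larger chunkSize means fewer jobs with less scheduling overhead, while a smaller chunkSize spreads the work more evenly across workers.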

Details

The merged results are saved in a file named <path>/<baseFilename>-processed.Rdata. You can delete the intermediate result files, which have a number (the chunk index) in their name.
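
As a small illustration of the naming scheme, the location of the merged file can be reconstructed from path and baseFilename. The values used here are hypothetical, and the sketch assumes that the file can be read with load(); because the names of the objects stored in the file are not stated here, it loads into a separate environment to inspect them.

```r
# Hypothetical values for illustration only.
path <- tempdir()
baseFilename <- "study"

# File name of the merged results, following the pattern described above.
processedFile <- file.path(path, paste0(baseFilename, "-processed.Rdata"))

if (file.exists(processedFile)) {
  # Assumption: the file was written with save(), so load() can read it.
  # Loading into a fresh environment avoids overwriting existing objects.
  env <- new.env()
  loadedNames <- load(processedFile, envir = env)
  print(loadedNames)
}
```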

To run on multiple machines, use saveDatasets to save datasets into multiple files. Then call processFile on each of them on the designated machine. Finally, load and merge the results together using loadAndMergePartialResults.

Value

The list of all processed results merged together.

To help reproducibility, the output of toLatex(sessionInfo(), locale = FALSE) is stored in the sessionInfo attribute.
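
The stored session information can be read back with attr(). The sketch below mimics the attribute on a hypothetical stand-in object (an empty list) rather than on a real return value of this function, so only the attribute handling is shown.

```r
# Hypothetical stand-in for the returned list of processed results.
results <- list()

# Attach the session information in the same form as described above.
attr(results, "sessionInfo") <- toLatex(sessionInfo(), locale = FALSE)

# Retrieve the attribute; it is a character vector of LaTeX lines.
info <- attr(results, "sessionInfo")
cat(info, sep = "\n")
```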

Author(s)

Manuel Koller

See Also

saveDatasets, processFile


kollerma/robustlmm documentation built on Jan. 14, 2024, 2:18 a.m.