Applies a function to each file in the file set

Description

Applies a function to each file in the file set.

Usage

## S3 method for class 'GenericDataFileSet'
dsApply(ds, IDXS=NULL, DROP=is.null(IDXS), AS=as.list, FUN, ..., args=list(), skip=FALSE,
  verbose=FALSE, .parallel=c("none", "future", "BatchJobs", "BiocParallel::BatchJobs"),
  .control=list(dW = 1))

Arguments

ds, ds1, ds2

GenericDataFileSet objects.

IDXS

A (named) list, where each element contains a vector of data set indices, or an integer vector of individual elements. If NULL, then ... with names as of the data set.

DROP

If FALSE, the first argument passed to FUN is always a list of files. If TRUE, a single-index element is passed to FUN as a file instead of as a list containing a single file.

AS

(optional) A function coercing the first set/group object passed.

FUN

A function.

...

Arguments passed to FUN.

args

(optional) A list of additional arguments passed to FUN.

skip

If TRUE, already processed files are skipped.

verbose

See Verbose.

.parallel

A character string specifying what mechanism to use for performing parallel processing, if at all.

.control

(internal) A named list structure controlling the processing.
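
As a rough illustration of how arguments given via ... and via args can both reach FUN, here is a minimal base-R sketch. It is not the package's code: call_with_args and scale_size are hypothetical helpers, and the order in which dsApply itself combines the two sources is an assumption here.

```r
# Hypothetical helper; NOT the R.filesets implementation.
# Assumption: '...' and 'args' are concatenated and passed on to FUN.
call_with_args <- function(x, FUN, ..., args = list()) {
  do.call(FUN, c(list(x), list(...), args))
}

# Hypothetical function standing in for FUN
scale_size <- function(size, factor = 1, digits = 0) {
  round(size * factor, digits)
}

# 'factor' travels via '...', 'digits' via 'args'
out <- call_with_args(1536, FUN = scale_size, factor = 1/1024,
                      args = list(digits = 1))
print(out)
```

Passing settings through args can be convenient when they are already collected in a list, for example when they are built programmatically.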

Value

Returns a named list where the names are those of argument IDXS.
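
The interplay of IDXS, DROP and the names of the returned list can be illustrated with a minimal base-R sketch. It is not the package's implementation: apply_groups is a hypothetical helper, and a named character vector stands in for the file set.

```r
# Hypothetical stand-in for dsApply(); NOT the R.filesets implementation.
apply_groups <- function(items, IDXS = NULL, DROP = is.null(IDXS), FUN, ...) {
  # Evaluate the default of DROP before IDXS is modified below
  force(DROP)

  # With IDXS = NULL, process each element on its own, named as in 'items'
  if (is.null(IDXS)) {
    IDXS <- as.list(seq_along(items))
    names(IDXS) <- names(items)
  }

  lapply(IDXS, function(idxs) {
    group <- as.list(items[idxs])
    # DROP = TRUE unwraps a single-element group before calling FUN
    if (DROP && length(group) == 1L) group <- group[[1L]]
    FUN(group, ...)
  })
}

items <- c(a = "x.txt", b = "y.txt", c = "z.txt")

# Two groups: one with two elements, one with a single element
res <- apply_groups(items, IDXS = list(pair = 1:2, single = 3), DROP = TRUE,
                    FUN = function(x) if (is.list(x)) "list" else "element")
str(res)
```

In this sketch the result carries the names of IDXS ("pair" and "single"), and only the single-element group is unwrapped before FUN sees it.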

Author(s)

Henrik Bengtsson

See Also

The future, BiocParallel and BatchJobs packages are utilized for parallel/distributed processing, depending on settings.

Examples

## Not run: 
isPackageInstalled <- R.utils::isPackageInstalled

# - - - - - - - - - - - - - - - - - - - - - - - -
# Setting up a file set
# - - - - - - - - - - - - - - - - - - - - - - - -
path <- system.file(package="R.filesets")
ds <- GenericDataFileSet$byPath(path)


# - - - - - - - - - - - - - - - - - - - - - - - -
# Get the size of each file
# - - - - - - - - - - - - - - - - - - - - - - - -
# Alt 1.
res1 <- lapply(ds, FUN=getFileSize)
print(res1)

# Alt 2. (according to current settings; see package startup message)
res2 <- dsApply(ds, FUN=getFileSize)
print(res2)
stopifnot(identical(res2, res1))

# Alt 3. (via an internal loop)
res2 <- dsApply(ds, FUN=getFileSize, .parallel="none")
print(res2)
stopifnot(identical(res2, res1))

# Alt 4. (via BiocParallel + BatchJobs)
if (isPackageInstalled("BiocParallel") && isPackageInstalled("BatchJobs")) {
  res3 <- dsApply(ds, FUN=getFileSize, .parallel="BiocParallel::BatchJobs")
  print(res3)
  stopifnot(identical(res3, res1))
}


## End(Not run)
