H5Apply: Apply function to chunks of H5 data in ligerDataset object
In rliger: Linked Inference of Genomic Experimental Relationships

H5Apply

R Documentation

Apply function to chunks of H5 data in ligerDataset object

H5 calculation wrapper that runs the specified calculation on an on-disk matrix in chunks.

H5Apply(
  object,
  FUN,
  init = NULL,
  useData = c("rawData", "normData"),
  chunkSize = 1000,
  verbose = getOption("ligerVerbose"),
  ...
)

`object`	A ligerDataset object.
`FUN`	A function that is applied to each chunk. See the details below for restrictions.
`init`	Initial value of the result, if it needs to be updated iteratively. Default `NULL`.
`useData`	The slot name of the data to be processed. Choose from `"rawData"`, `"normData"`, `"scaleData"`. Default `"rawData"`.
`chunkSize`	Number of columns to be included in each chunk. Default `1000`.
`verbose`	Logical. Whether to show progress information. Default `getOption("ligerVerbose")`, which is `TRUE` if the user has not set it.
`...`	Other arguments to be passed to `FUN`.

The first four arguments of FUN must be, in this order:

chunk data: A sparse matrix (dgCMatrix-class) containing at most chunkSize columns.
x-vector index: The indices into the x slot vector of a dgCMatrix that point to the values in each chunk. Mostly used when a new sparse matrix needs to be written to an H5 file.
cell index: The column indices of each chunk within the whole original matrix.
initialized result: A user-defined object. The value passed to the init argument of H5Apply is passed here in the first iteration, and the value returned by FUN is passed here in each subsequent chunk iteration. It is therefore important to keep the structure of the returned value consistent with init.
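
The four-argument contract above can be illustrated with a minimal sketch. Everything named here is hypothetical: the function `sumGenes`, its argument names, and the small in-memory matrix `mat`. The `for` loop is only a stand-in that mimics the described chunk iteration so the example runs without an H5 file; it is not rliger's implementation.

```r
library(Matrix)

# Hypothetical FUN: accumulates per-gene sums and the number of cells seen.
# The argument names are illustrative; only their order matters.
sumGenes <- function(chunk, sparseXIdx, cellIdx, values) {
    values$geneSum <- values$geneSum + Matrix::rowSums(chunk)
    values$nCell <- values$nCell + length(cellIdx)
    values  # returned object keeps the same structure as init
}

# Stand-in for the chunk loop (an assumption, not the package's code),
# run on a small in-memory dgCMatrix instead of an on-disk matrix.
set.seed(1)
mat <- rsparsematrix(nrow = 5, ncol = 23, density = 0.3)
chunkSize <- 10
res <- list(geneSum = numeric(nrow(mat)), nCell = 0L)  # plays the role of init
for (start in seq(1, ncol(mat), by = chunkSize)) {
    end <- min(start + chunkSize - 1, ncol(mat))
    cellIdx <- start:end
    chunk <- mat[, cellIdx, drop = FALSE]
    # 1-based positions in mat@x of this chunk's values, derived from the
    # 0-based column pointer slot p
    sparseXIdx <- seq_len(mat@p[end + 1] - mat@p[start]) + mat@p[start]
    res <- sumGenes(chunk, sparseXIdx, cellIdx, res)
}

# With an H5-backed ligerDataset (here a hypothetical object `ld`), the same
# function would presumably be passed as:
# res <- H5Apply(ld, FUN = sumGenes,
#                init = list(geneSum = numeric(nrow(ld)), nCell = 0L),
#                useData = "rawData", chunkSize = 1000)
```

Because `sumGenes` returns a list with the same two elements it received, the result of one chunk can be fed directly into the next, which is the consistency requirement described for the fourth argument.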