hdf5_apply: Apply a mathematical operation to multiple HDF5 datasets

View source: R/S3_standalone.R

hdf5_applyR Documentation

Apply a mathematical operation to multiple HDF5 datasets

Description

Standalone function that applies a predefined algebraic or statistical operation to a list of datasets stored in an HDF5 group, writing results to outgroup. No open HDF5Matrix object is required.

This is not equivalent to base::apply(). It dispatches built-in C++ operations (QR, cross-product, Cholesky, etc.) to a batch of named datasets in the file.

Usage

hdf5_apply(
  filename,
  group,
  datasets,
  func,
  outgroup,
  b_group = NULL,
  b_datasets = NULL,
  overwrite = FALSE,
  transp_dataset = FALSE,
  transp_bdataset = FALSE,
  fullMatrix = FALSE,
  byrows = FALSE,
  threads = NULL
)

Arguments

filename

Path to the HDF5 file.

group

Group path containing datasets.

datasets

Character vector of dataset names to process.

func

Character. Operation to apply (see Details).

outgroup

Character. Output group path for results.

b_group

Character or NULL. Group of B datasets.

b_datasets

Character vector or NULL. Names of B datasets.

overwrite

Logical. Overwrite existing output datasets.

transp_dataset

Logical. Transpose A datasets before operation.

transp_bdataset

Logical. Transpose B datasets before operation.

fullMatrix

Logical. Return full matrix (not triangular).

byrows

Logical. Apply by rows (for normalize/sdmean).

threads

Integer or NULL. OpenMP threads.

Details

Use hdf5_apply() when you only have the file path and group name. If you already have an open HDF5Matrix, use apply_function(x, ...) instead — both produce the same result.

"CrossProd"

Compute A^T A for each dataset.

"tCrossProd"

Compute A A^T for each dataset.

"CrossProd_double"

Double-precision cross-product.

"tCrossProd_double"

Double-precision transposed cross-product.

"blockmult"

Block-wise A \times B (requires b_datasets).

"QR"

QR decomposition for each dataset.

"invChol"

Inverse via Cholesky for each dataset.

"solve"

Solve AX = B (requires b_datasets).

"normalize"

Column-wise normalization.

"sdmean"

Compute SD and mean.

"descChol"

Cholesky decomposition.

Value

Invisibly NULL. Results written to outgroup. Open them with hdf5_matrix().

See Also

apply_function for the S3 method on an open HDF5Matrix; hdf5_reduce for group-level reduction.

Examples


tmp <- tempfile(fileext = ".h5")
A <- hdf5_create_matrix(tmp, "inp/A", data = matrix(rnorm(25), 5, 5))
B <- hdf5_create_matrix(tmp, "inp/B", data = matrix(rnorm(25), 5, 5))
hdf5_apply(tmp, group = "inp", datasets = c("A", "B"), 
func = "CrossProd", outgroup = "out")
res <- list_datasets(tmp)
res
res_A <- hdf5_matrix(tmp, res[3])
dim(res_A)   # 5 x 5
close(res_A)
hdf5_close_all()
unlink(tmp)



BigDataStatMeth documentation built on June 8, 2026, 5:07 p.m.