split_dataset: Split an HDF5Matrix into multiple block datasets


split_dataset    R Documentation


Description

Splits an HDF5Matrix into equal-sized sub-matrices, each stored as a separate dataset in the same HDF5 file. Prefer this function when you want an explicit, unambiguous call: unlike split(), it does not carry the f and drop parameters inherited from base::split(), which have no meaning for HDF5Matrix objects.

Exactly one of n_blocks or block_size must be provided. Output datasets are named <out_group>/<out_dataset>.0, <out_group>/<out_dataset>.1, ..., where the numeric suffix is a 0-based block index.
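
The argument rule and the naming scheme above can be illustrated with a small base R sketch. It does not call the package: plan_blocks() is a hypothetical helper written only for this illustration, the base name "M" is a placeholder, and the handling of a dimension that does not divide evenly (a shorter last block) is an assumption, not documented behaviour of split_dataset().

```r
# Hypothetical helper (not part of the package): plans dataset names and
# 1-based index ranges for splitting a dimension of length `n`.
plan_blocks <- function(n, n_blocks = NULL, block_size = NULL,
                        out_group = "SPLIT", out_dataset = "M") {
  # Exactly one of n_blocks / block_size must be supplied.
  if (is.null(n_blocks) == is.null(block_size)) {
    stop("Provide exactly one of 'n_blocks' or 'block_size'.")
  }
  # Assumption: derive the block size as ceiling(n / n_blocks), so the last
  # block may be shorter when n is not evenly divisible. The real function
  # may distribute a remainder differently.
  if (is.null(block_size)) block_size <- ceiling(n / n_blocks)
  starts <- seq(1L, n, by = block_size)
  ends   <- pmin(starts + block_size - 1L, n)
  data.frame(
    # The suffix is a 0-based block index, as described above.
    name  = paste0(out_group, "/", out_dataset, ".", seq_along(starts) - 1L),
    start = starts,
    end   = ends,
    stringsAsFactors = FALSE
  )
}

# Six rows split into three blocks of two rows each.
plan <- plan_blocks(6L, n_blocks = 3L)
print(plan)
```

Calling plan_blocks(6L) with neither argument, or with both, stops with an error, which mirrors the "exactly one of" rule stated above.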

Usage

split_dataset(x, n_blocks = NULL, block_size = NULL, bycols = FALSE, ...)

## S3 method for class 'HDF5Matrix'
split_dataset(
  x,
  n_blocks = NULL,
  block_size = NULL,
  bycols = FALSE,
  out_group = "SPLIT",
  out_dataset = NULL,
  overwrite = FALSE,
  ...
)

Arguments

x

An HDF5Matrix.

n_blocks

Integer or NULL. Number of blocks to create.

block_size

Integer or NULL. Number of rows per block, or columns per block when bycols = TRUE.

bycols

Logical. If TRUE, split by columns; if FALSE (default), split by rows.

...

Ignored.

out_group

Character. Output HDF5 group (default "SPLIT").

out_dataset

Character or NULL. Base dataset name.

overwrite

Logical. Whether to overwrite existing block datasets (default FALSE).

Value

A named list of HDF5Matrix objects.

See Also

split.HDF5Matrix, the equivalent method dispatched through base::split(); hdf5_reduce, to recombine blocks.

Examples


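# Create a 6 x 10 matrix in a temporary HDF5 file
tmp  <- tempfile(fileext = ".h5")
M    <- hdf5_create_matrix(tmp, "data/M", data = matrix(1:60, 6, 10))

# Split by rows (the default) into 3 blocks
blks <- split_dataset(M, n_blocks = 3L)
length(blks)

# Close every block handle and the source matrix, then delete the file
invisible(lapply(blks, close))
close(M)
unlink(tmp)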



BigDataStatMeth documentation built on June 8, 2026, 5:07 p.m.