split.HDF5Matrix: Split an HDF5Matrix into a list of blocks

View source: R/S3_operations.R

split.HDF5Matrix    R Documentation

Split an HDF5Matrix into a list of blocks

Description

S3 method of base::split() for HDF5Matrix objects. Divides the matrix into blocks of roughly equal size along rows (default) or columns, storing each block as a separate dataset in the same HDF5 file.

Provide exactly ONE of n_blocks or block_size.
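
As a rough illustration of how the two arguments relate, the base-R sketch below computes the index ranges a split might cover along one dimension. The helper block_ranges() is hypothetical and not part of BigDataStatMeth, and the ceiling-based rule is an assumption; the package's actual partitioning may differ.

```r
# Hypothetical helper, not part of BigDataStatMeth: index ranges that a
# split into blocks would cover along one dimension of length n.
block_ranges <- function(n, n_blocks = -1L, block_size = -1L) {
  # Mirror the documented rule: exactly one of the two must be supplied.
  if ((n_blocks > 0L) == (block_size > 0L)) {
    stop("provide exactly one of n_blocks or block_size")
  }
  # Assumption: with n_blocks, each block holds at most ceiling(n / n_blocks) entries.
  if (n_blocks > 0L) block_size <- ceiling(n / n_blocks)
  start <- seq(1L, n, by = block_size)
  end   <- pmin(start + block_size - 1L, n)
  data.frame(start = start, end = end)
}

block_ranges(20, n_blocks = 4)     # ranges 1-5, 6-10, 11-15, 16-20
block_ranges(20, block_size = 6)   # ranges 1-6, 7-12, 13-18, 19-20
```

Under this assumed rule, the last block is shorter whenever the dimension is not a multiple of the block size.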

Usage

## S3 method for class 'HDF5Matrix'
split(
  x,
  f = NULL,
  drop = FALSE,
  n_blocks = -1L,
  block_size = -1L,
  bycols = FALSE,
  out_group = "SPLIT",
  out_dataset = NULL,
  overwrite = FALSE,
  ...
)

Arguments

x

An HDF5Matrix.

f

Ignored (kept for S3 signature compatibility).

drop

Ignored (S3 compatibility).

n_blocks

Integer. Number of (roughly equal) blocks; -1 = unused.

block_size

Integer. Max rows (or cols) per block; -1 = unused.

bycols

Logical. If TRUE, split by columns (default = by rows).

out_group

Character. HDF5 group for output blocks (default "SPLIT").

out_dataset

Character or NULL. Base dataset name.

overwrite

Logical. Overwrite existing blocks (default FALSE).

...

Ignored.

Details

Calling convention: load the package, then call split(x, n_blocks = 4). The qualified form BigDataStatMeth::split() produces an error because split is a base generic; this is normal R behaviour, and the same happens with BigDataStatMeth::cor() or BigDataStatMeth::svd(). S3 dispatch to the HDF5Matrix method happens automatically once the package is loaded.
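
The dispatch mechanism can be sketched in base R with a toy class. Everything here is illustrative: toy_matrix and its method are hypothetical stand-ins, not the package's implementation. The point is only that calling the generic split() reaches a class-specific method, and that the method keeps f and drop so its signature stays consistent with base::split().

```r
# Toy class standing in for HDF5Matrix; "toy_matrix" is a hypothetical name.
x <- structure(list(data = matrix(1:20, nrow = 4)), class = "toy_matrix")

# Method for the base generic split(). f and drop stay in the signature to
# match base::split(x, f, drop, ...), even though they are unused here.
split.toy_matrix <- function(x, f = NULL, drop = FALSE, n_blocks = 2L, ...) {
  rows <- seq_len(nrow(x$data))
  # Assign each row to a block number in 1..n_blocks.
  idx <- ceiling(rows * n_blocks / length(rows))
  # base::split() on a plain integer vector uses the default method.
  lapply(base::split(rows, idx), function(i) x$data[i, , drop = FALSE])
}

blocks <- split(x, n_blocks = 2)   # the generic dispatches on class(x)
length(blocks)
```

The call site never names the method: split() looks at class(x) and picks split.toy_matrix, which is the same way split(x, n_blocks = 4) reaches the HDF5Matrix method.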

split_dataset() is an alternative with a cleaner signature that omits the f and drop parameters inherited from base::split (which have no meaning for HDF5Matrix objects). Both functions produce identical results.

Value

Named list of HDF5Matrix objects: block_0, block_1, …
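
Because the block names start at 0 while R list positions start at 1, name-based access is the less error-prone option. The sketch below uses plain integers as stand-ins for HDF5Matrix objects and assumes the list is ordered by block index, which the naming suggests but this page does not state explicitly.

```r
# Stand-in for the returned list: integers instead of HDF5Matrix objects.
blocks <- setNames(as.list(1:4), paste0("block_", 0:3))

names(blocks)          # "block_0" "block_1" "block_2" "block_3"
blocks[["block_0"]]    # access by name
blocks[[1]]            # same element: position 1 holds block_0
```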

See Also

split_dataset for the equivalent with a cleaner signature; hdf5_reduce to recombine blocks after processing.

Examples


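fn     <- tempfile(fileext = ".h5")
# 20 x 100 matrix stored as dataset "data/X"
X      <- hdf5_create_matrix(fn, "data/X", data = matrix(rnorm(2000), 20, 100))
blocks <- split(X, n_blocks = 4)   # 4 row-blocks of 5 rows each
length(blocks)                     # 4
invisible(lapply(blocks, close))   # close each block; invisible() hides the printed list

hdf5_close_all()
unlink(fn)                         # remove the temporary file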



BigDataStatMeth documentation built on June 8, 2026, 5:07 p.m.