split_dataset: Split an HDF5Matrix into multiple block datasets

View source: R/S3_split.R

split_dataset    R Documentation

Description

Splits an HDF5Matrix into equal-sized sub-matrices stored as separate datasets in the same HDF5 file.

Output datasets are named <out_group>/<out_dataset>.0, <out_group>/<out_dataset>.1, ... (0-based index).
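The naming scheme can be illustrated with a short base-R sketch. The helper block_paths() is hypothetical and not part of the package, and the base name "M" is an assumed value for out_dataset; the sketch only reproduces the pattern described above.

```r
# Hypothetical helper, not part of BigDataStatMeth: builds the dataset
# paths <out_group>/<out_dataset>.<index> for a given number of blocks.
block_paths <- function(out_group, out_dataset, n_blocks) {
  idx <- seq_len(n_blocks) - 1L   # 0-based block indices
  paste0(out_group, "/", out_dataset, ".", idx)
}

# "M" is an assumed base name chosen for illustration.
block_paths("SPLIT", "M", 3L)
# "SPLIT/M.0" "SPLIT/M.1" "SPLIT/M.2"
```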

Exactly one of n_blocks or block_size must be provided.
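A minimal sketch of how such an "exactly one" rule might be checked is shown here. resolve_blocks() is a hypothetical helper, and rounding up with ceiling() when the dimension does not divide evenly is an assumption; the package's own validation and rounding may differ.

```r
# Hypothetical validation sketch; not the package's implementation.
# n is the length of the dimension being split (rows or columns).
resolve_blocks <- function(n, n_blocks = NULL, block_size = NULL) {
  # xor() is TRUE only when exactly one of the two arguments is NULL,
  # i.e. when exactly one of them was supplied.
  if (!xor(is.null(n_blocks), is.null(block_size))) {
    stop("Provide exactly one of 'n_blocks' or 'block_size'.")
  }
  # Derive the missing quantity; rounding up is an assumption.
  if (is.null(block_size)) block_size <- ceiling(n / n_blocks)
  if (is.null(n_blocks))   n_blocks   <- ceiling(n / block_size)
  list(n_blocks = as.integer(n_blocks), block_size = as.integer(block_size))
}

resolve_blocks(6, n_blocks = 3L)     # 3 blocks of 2
resolve_blocks(10, block_size = 4L)  # 3 blocks of at most 4
```

Under this sketch, calling resolve_blocks(6) or supplying both arguments raises an error.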

Usage

split_dataset(x, n_blocks = NULL, block_size = NULL, bycols = FALSE, ...)

## S3 method for class 'HDF5Matrix'
split_dataset(
  x,
  n_blocks = NULL,
  block_size = NULL,
  bycols = FALSE,
  out_group = "SPLIT",
  out_dataset = NULL,
  overwrite = FALSE,
  ...
)

Arguments

x

An HDF5Matrix.

n_blocks

Integer or NULL. Number of blocks.

block_size

Integer or NULL. Rows or columns per block.

bycols

Logical. Split by columns (TRUE) or rows (default FALSE).

...

Ignored.

out_group

Character. Output HDF5 group (default "SPLIT").

out_dataset

Character or NULL. Base dataset name.

overwrite

Logical. Overwrite existing blocks (default FALSE).
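The effect of bycols can be pictured with an in-memory analogue in base R. split_in_memory() is a hypothetical function working on an ordinary matrix; split_dataset() itself operates on HDF5-backed data, and how it handles a dimension that does not divide evenly is not stated here, so the shorter last block below is an assumption.

```r
# Base-R illustration only; not how the package is implemented.
split_in_memory <- function(m, n_blocks, bycols = FALSE) {
  n          <- if (bycols) ncol(m) else nrow(m)
  block_size <- ceiling(n / n_blocks)             # assumed rounding rule
  grp        <- ceiling(seq_len(n) / block_size)  # contiguous block id per index
  idx        <- split(seq_len(n), grp)            # list of index vectors
  lapply(idx, function(i) {
    # drop = FALSE keeps each block a matrix even if it has one row/column
    if (bycols) m[, i, drop = FALSE] else m[i, , drop = FALSE]
  })
}

m <- matrix(1:60, 6, 10)
by_rows <- split_in_memory(m, 3L)                 # three 2 x 10 blocks
by_cols <- split_in_memory(m, 3L, bycols = TRUE)  # 6 x 4, 6 x 4, 6 x 2
```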

Value

A named list of HDF5Matrix objects.

See Also

cbind.HDF5Matrix

Examples


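# Create a 6 x 10 matrix as dataset "data/M" in a temporary HDF5 file
tmp  <- tempfile(fileext = ".h5")
M    <- hdf5_create_matrix(tmp, "data/M", data = matrix(1:60, 6, 10))
# Split by rows (bycols defaults to FALSE) into 3 blocks
blks <- split_dataset(M, n_blocks = 3L)
# Number of blocks returned
length(blks)
# Close each block, then the source matrix, and delete the file
lapply(blks, close)
close(M)
unlink(tmp)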



BigDataStatMeth documentation built on May 15, 2026, 1:07 a.m.