View source: R/S3_operations.R
| split.HDF5Matrix | R Documentation |
S3 method of base::split() for HDF5Matrix objects.
Divides the matrix into equal-sized blocks along rows (default) or columns,
storing each block as a separate dataset in the same HDF5 file.
Provide exactly ONE of n_blocks or block_size.
## S3 method for class 'HDF5Matrix'
split(
x,
f = NULL,
drop = FALSE,
n_blocks = -1L,
block_size = -1L,
bycols = FALSE,
out_group = "SPLIT",
out_dataset = NULL,
overwrite = FALSE,
...
)
x | An HDF5Matrix object. |
f | Ignored (kept for S3 signature compatibility). |
drop | Ignored (S3 compatibility). |
n_blocks | Integer. Number of (roughly equal) blocks; -1 = unused. |
block_size | Integer. Maximum rows (or columns) per block; -1 = unused. |
bycols | Logical. If TRUE, split along columns instead of rows (default FALSE). |
out_group | Character. HDF5 group for output blocks (default "SPLIT"). |
out_dataset | Character or NULL. Base dataset name. |
overwrite | Logical. Overwrite existing blocks (default FALSE). |
... | Ignored. |
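The interplay between n_blocks and block_size can be illustrated with a small base-R sketch. This is not the package's implementation: block_ranges() is a hypothetical helper, and the way the remainder is spread across blocks is an assumption, since the documentation only says the blocks are "roughly equal". It shows one plausible way to turn either argument into index ranges and zero-based block names.

```r
# Hypothetical helper, NOT part of BigDataStatMeth: computes 1-based index
# ranges for blocks along one dimension of length n.
block_ranges <- function(n, n_blocks = -1L, block_size = -1L) {
  # Mirror the documented rule: exactly one of the two must be set (> 0).
  if ((n_blocks > 0L) == (block_size > 0L)) {
    stop("Provide exactly one of n_blocks or block_size")
  }
  if (n_blocks > 0L) {
    # Assumption: any remainder is spread over the first blocks.
    sizes <- rep(n %/% n_blocks, n_blocks) +
      (seq_len(n_blocks) <= n %% n_blocks)
    sizes <- sizes[sizes > 0L]          # drop empty blocks if n_blocks > n
    end <- cumsum(sizes)
    start <- end - sizes + 1L
  } else {
    # Fixed-size blocks; the last one may be shorter.
    start <- seq(1L, n, by = block_size)
    end <- pmin(start + block_size - 1L, n)
  }
  data.frame(
    name  = paste0("block_", seq_along(start) - 1L),
    start = start,
    end   = end
  )
}

block_ranges(20, n_blocks = 4)    # four blocks of 5 indices each
block_ranges(20, block_size = 6)  # three blocks of 6, then one of 2
```

Calling block_ranges() with both arguments, or with neither, stops with an error, which matches the "exactly ONE" requirement stated above.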
Calling convention: use split(x, n_blocks = 4) after loading
the package. The form BigDataStatMeth::split() produces an error
because split is a base generic: the package supplies an S3 method
for it rather than exporting a function of that name. This is normal
R behaviour, identical to BigDataStatMeth::cor() or BigDataStatMeth::svd().
S3 dispatch happens automatically once the package is loaded.
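The dispatch mechanism described here can be reproduced with a toy class in base R. The class name toy_matrix and its method are purely illustrative and have nothing to do with the HDF5Matrix implementation; the sketch only shows that calling the base generic split() reaches a class-specific method once that method is defined.

```r
# Toy class, purely illustrative; "toy_matrix" is not a BigDataStatMeth class.
split.toy_matrix <- function(x, f = NULL, drop = FALSE, n_blocks = 2L, ...) {
  # Assign each row to one of n_blocks contiguous groups, then split by rows.
  rows <- seq_len(nrow(x))
  groups <- cut(rows, breaks = n_blocks, labels = FALSE)
  lapply(base::split(rows, groups),
         function(i) unclass(x)[i, , drop = FALSE])
}

m <- structure(matrix(1:12, nrow = 6), class = "toy_matrix")

# The call goes through the base generic, which dispatches on class(m).
blocks <- split(m, n_blocks = 3)
length(blocks)
```

As in the method documented here, f and drop stay in the signature only to match the generic and are not used.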
split_dataset() is an alternative with a cleaner signature that
omits the f and drop parameters inherited from
base::split (which have no meaning for HDF5Matrix objects).
Both functions produce identical results.
Named list of HDF5Matrix objects:
block_0, block_1, …
split_dataset for the equivalent with a cleaner
signature; hdf5_reduce to recombine blocks after processing.
fn <- tempfile(fileext = ".h5")
X <- hdf5_create_matrix(fn, "data/X", data = matrix(rnorm(2000), 20, 100))
blocks <- split(X, n_blocks = 4) # 4 row-blocks of 5 rows each
length(blocks) # 4
lapply(blocks, close)
hdf5_close_all()
unlink(fn)