| hdf5matrix_options | R Documentation |
Configure global settings for parallelization, block processing and compression in HDF5Matrix operations. These settings affect all HDF5Matrix computations unless explicitly overridden in individual method calls.
hdf5matrix_options(
  paral = NULL,
  block_size = NULL,
  threads = NULL,
  compression = NULL
)
| paral | Logical or NULL. Enable OpenMP parallelization? |
| block_size | Integer or NULL. Number of elements per block for block-wise processing. |
| threads | Integer or NULL. Number of OpenMP threads to use. |
| compression | Integer (0-9) or NULL. gzip compression level for created datasets. |
BigDataStatMeth achieves high performance through two key mechanisms:
Block-wise processing:
Large matrices are processed in chunks that fit in memory. The block_size
parameter controls chunk size. Smaller blocks use less memory but require more
I/O operations. Larger blocks are faster but require more RAM.
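The trade-off described above can be illustrated with a minimal base R sketch. It is not the package's implementation: `blockwise_crossprod` and `rows_per_block` are hypothetical names, and the matrix lives in memory here rather than in an HDF5 file. It only shows that the result does not depend on the block size, while the number of passes (and the size of each slice) does.

```r
# Illustrative only: block-wise computation of t(X) %*% X in base R.
# `blockwise_crossprod` and `rows_per_block` are hypothetical names,
# not part of BigDataStatMeth.
blockwise_crossprod <- function(X, rows_per_block) {
  stopifnot(is.matrix(X), nrow(X) >= 1, rows_per_block >= 1)
  acc <- matrix(0, ncol(X), ncol(X))
  starts <- seq(1, nrow(X), by = rows_per_block)
  for (s in starts) {
    e <- min(s + rows_per_block - 1, nrow(X))
    block <- X[s:e, , drop = FALSE]  # only this slice is processed at once
    acc <- acc + crossprod(block)    # accumulate the partial result
  }
  acc
}

set.seed(1)
X <- matrix(rnorm(200 * 5), nrow = 200)

# Small blocks: more iterations, smaller slices.
small <- blockwise_crossprod(X, rows_per_block = 16)
# Large blocks: fewer iterations, larger slices.
large <- blockwise_crossprod(X, rows_per_block = 128)
```

Both calls should agree with `crossprod(X)` up to floating-point rounding; only the memory held per step and the number of steps differ.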
OpenMP parallelization:
Operations are distributed across CPU cores. The paral and threads
parameters control this. Parallelization can provide near-linear speedup for
compute-intensive operations.
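One way to pick an explicit thread count is sketched below, assuming the base `parallel` package is available. `choose_threads` and its `reserve` argument are hypothetical helpers, not part of BigDataStatMeth, and leaving one core free is only a common convention.

```r
# Hypothetical helper: choose a thread count that leaves `reserve` cores free.
library(parallel)

choose_threads <- function(reserve = 1L) {
  cores <- parallel::detectCores(logical = FALSE)
  if (is.na(cores)) cores <- 1L  # detectCores() can return NA
  max(1L, as.integer(cores) - as.integer(reserve))
}

n_threads <- choose_threads()

# With BigDataStatMeth loaded, the value could then be passed on:
# hdf5matrix_options(paral = TRUE, threads = n_threads)
```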
Compression:
Datasets are created with gzip compression (level 6 by default). This reduces
disk usage by 60-80%, at the cost of some gzip overhead.
For benchmarks or workflows where speed is critical, set compression = 0.
For long-term storage or large datasets, keep the default.
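The size side of this trade-off can be seen with a small base R sketch. It uses `gzfile()` on a temporary file, not HDF5, so the numbers are only indicative; real HDF5 ratios depend on the data and on chunking. `write_at_level` is a hypothetical helper, and the low-entropy test vector is an assumption chosen because it compresses well.

```r
# Illustrative only: compare on-disk size at gzip level 0 and level 6.
# Uses base R gzfile(), not the HDF5 writer of BigDataStatMeth.
write_at_level <- function(x, level) {
  path <- tempfile(fileext = ".gz")
  con <- gzfile(path, open = "wb", compression = level)
  on.exit(close(con))  # flush and close before the caller inspects the file
  writeBin(x, con)
  path
}

set.seed(1)
x <- as.numeric(sample(0:2, 1e5, replace = TRUE))  # low-entropy values

size0 <- file.size(write_at_level(x, 0))  # no compression
size6 <- file.size(write_at_level(x, 6))  # gzip level 6
ratio <- size6 / size0                    # fraction of the uncompressed size
```

A `ratio` well below 1 means level 6 saves disk space on this data; the price is extra CPU time spent compressing and decompressing.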
Priority:
Options set here serve as defaults. Individual method calls can override:
A$multiply(B, paral = TRUE, threads = 4, block_size = 2000)
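The precedence rule can be pictured as a simple fallback chain. The sketch below is an assumption about the logic, not the package's code: `resolve_option` and the `global` list are hypothetical stand-ins, with NULL meaning "not set" at each level.

```r
# Hypothetical resolver: per-call value > global option > auto-detected value.
resolve_option <- function(call_value, global_value, auto_value) {
  if (!is.null(call_value)) return(call_value)      # explicit argument wins
  if (!is.null(global_value)) return(global_value)  # then the global default
  auto_value                                        # otherwise auto-detect
}

# Stand-in for the global settings; block_size is left unset (NULL).
global <- list(threads = 8L, block_size = NULL)

from_call   <- resolve_option(4L,   global$threads,    2L)     # per-call override
from_global <- resolve_option(NULL, global$threads,    2L)     # global option
from_auto   <- resolve_option(NULL, global$block_size, 1000L)  # auto fallback
```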
Recommendations:
For interactive analysis: Leave defaults (NULL) - auto-detect works well
For scripts/HPC: Set explicitly based on your hardware and data size
For huge datasets (>10GB): Reduce block_size to fit in RAM
For many-core systems: Set threads explicitly (auto may be too aggressive)
For benchmarks: Set compression = 0 to eliminate gzip overhead
A list of all current options. When called with arguments, the list is returned invisibly; when called without arguments, it is returned visibly.
# View current options
hdf5matrix_options()
# Enable parallelization with 8 threads
hdf5matrix_options(paral = TRUE, threads = 8)
# Set block size to 1000 elements
hdf5matrix_options(block_size = 1000)
# Disable compression for benchmarking
hdf5matrix_options(compression = 0)
# Reset to defaults
hdf5matrix_options(paral = NULL, threads = NULL, block_size = NULL, compression = NULL)