
h5lite is the pain-free way to work with HDF5 files in R.
It is designed for data scientists who want to read/write objects and move on, and for package developers who need a reliable, dependency-free storage backend.
If you've struggled with complex HDF5 bindings in the past, h5lite offers a fresh approach:
It Just Works: No need to understand HDF5 dataspaces, hyperslabs, or property lists. h5lite maps R objects (numeric, character, factor, data.frame, and more) directly to their HDF5 equivalents.
Zero System Dependencies: h5lite bundles the HDF5 library (via hdf5lib). Users do not need to install HDF5 system libraries manually.
Smart Defaults, Full Control: It automatically selects the most efficient data types (e.g., saving space by storing small integers as int8), but gives you granular control when you need to conform to a strict spec.
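To make the "smart defaults" idea concrete, here is a minimal base-R sketch of range-based type selection. It is not h5lite's actual algorithm: `smallest_int_type()` is a hypothetical helper written for this illustration, and the candidate type list is an assumption. It only shows how a writer could pick the narrowest integer type that holds every value.

```r
# Hypothetical helper, NOT part of h5lite: pick the narrowest integer type
# whose range covers every value in x.
smallest_int_type <- function(x) {
  stopifnot(is.numeric(x), length(x) > 0, !anyNA(x), all(x == round(x)))
  lo <- min(x)
  hi <- max(x)
  if (lo >= 0) {
    # Unsigned candidates, narrowest first
    limits <- c(uint8 = 2^8 - 1, uint16 = 2^16 - 1, uint32 = 2^32 - 1)
    fits <- hi <= limits
  } else {
    # Signed candidates, narrowest first
    bits <- c(int8 = 8, int16 = 16, int32 = 32)
    fits <- lo >= -2^(bits - 1) & hi <= 2^(bits - 1) - 1
  }
  # No candidate is wide enough: signal that with NA
  if (!any(fits)) return(NA_character_)
  names(fits)[which(fits)[1]]
}

smallest_int_type(1:10)
#> [1] "uint8"
smallest_int_type(c(-5, 300))
#> [1] "int16"
smallest_int_type(70000)
#> [1] "uint32"
```

The real package may weigh other factors (missing values, compatibility with other readers), so treat this as a mental model rather than a specification.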
Install the released version from CRAN:
install.packages("h5lite")
Or the development version from GitHub:
# install.packages("pak")
pak::pak("cmmr/h5lite")
The API consists primarily of two functions: h5_write() and h5_read().
library(h5lite)
file <- tempfile(fileext = ".h5")
# 1. Write simple objects
h5_write(1:10, file, "my_vector")
h5_write(I(42), file, "my_vector", attr = "my_id")
h5_write(matrix(rnorm(9), 3, 3), file, "my_matrix")
# 2. Write a list (creates a group hierarchy)
config <- list(version = 1.0, params = list(a = 1, b = 2))
h5_write(config, file, "simulation_config")
# 3. Read it back
my_vec <- h5_read(file, "my_vector")
# 4. Inspect the file
h5_ls(file)
#> [1] "my_vector" "my_matrix"
#> [3] "simulation_config" "simulation_config/version"
#> [5] "simulation_config/params" "simulation_config/params/a"
#> [7] "simulation_config/params/b"
h5_str(file)
#> /
#> ├── my_vector <uint8 × 10>
#> │   └── @my_id <uint8 scalar>
#> ├── my_matrix <float64 × 3 × 3>
#> └── simulation_config/
#>     ├── version <uint8 × 1>
#>     └── params/
#>         ├── a <uint8 × 1>
#>         └── b <uint8 × 1>
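A quick way to gain confidence in the write/read pair is a round-trip check. The sketch below uses only `h5_write()` and `h5_read()` as shown above; it assumes that a double matrix written with default settings comes back with the same values and dimensions, which is the expected behavior but is not demonstrated in the output above. The file and dataset names are made up for this example.

```r
library(h5lite)

# Use a fresh temporary file so this check is independent of other examples
rt_file <- tempfile(fileext = ".h5")

m <- matrix(rnorm(9), 3, 3)
h5_write(m, rt_file, "m")
m_back <- h5_read(rt_file, "m")

# all.equal() compares values and dimensions with a numeric tolerance
round_trip_ok <- isTRUE(all.equal(m, m_back))
round_trip_ok

unlink(rt_file)
```

If `round_trip_ok` is `FALSE`, printing `all.equal(m, m_back)` on its own lists what differs (for example, dimensions or attributes).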
as Argument: Precise Control
Need to conform to a specific file specification? The as argument allows you to override automatic behavior and explicitly define on-disk types.
# Force specific numeric types
h5_write(1:10, file, "dataset_a", as = "int32")
h5_write(rnorm(10), file, "dataset_b", as = "float32")
# Control string lengths (e.g., fixed-length ASCII for compatibility)
h5_write(c("A", "B"), file, "fixed_strs", as = "ascii[10]")
h5_str(file)
#> ...
#> ├── dataset_a <int32 × 10>
#> ├── dataset_b <float32 × 10>
#> └── fixed_strs <ascii[10] × 2>
When writing data frames, you can map types for specific columns using a named vector.
df <- data.frame(
  id = 1:5,
  score = c(1.1, 2.2, 3.3, 4.4, 5.5),
  note = c("a", "b", "c", "d", "e")
)
# Store 'id' as an unsigned 16-bit integer, 'score' as a 32-bit float, and 'note' as ASCII
h5_write(df, file, "experiment_data",
         as = c(id = "uint16", score = "float32", note = "ascii"))
h5_str(file)
#> ...
#> └── experiment_data <compound[3] × 5>
#>     ├── $id <uint16>
#>     ├── $score <float32>
#>     └── $note <ascii>
h5lite bundles a suite of compression filters (including Blosc2, Zstandard, LZ4, and lossy ZFP). For simple use cases, pass a string configuration to the compress argument. For precise control over the pipeline (chunk sizing, bitshuffling, and scale-offset scaling), use the h5_compression() function.
# Simple setup: High-performance Blosc2 with Zstandard
h5_write(rnorm(1000), file, "data_blosc", compress = "blosc2-zstd")
# Advanced setup: LZ4 compression, optimal integer packing, and custom chunk sizing
cmp <- h5_compression("lz4-9", int_packing = TRUE, chunk_size = 512 * 1024)
h5_write(1:1000, file, "data_custom", compress = cmp)
For large datasets that exceed system RAM, h5lite provides partial reading via start and count parameters. It automatically targets the most logical dimension (e.g., rows in a matrix or elements in a vector).
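# Write a 1000 x 10 matrix so there is something to slice
h5_write(matrix(rnorm(10000), 1000, 10), file, "large_matrix")
# Read a 100-row slice starting from row 500
subset <- h5_read(file, "large_matrix", start = 500, count = 100)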
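Partial reads are most useful in a loop that processes a dataset one block at a time. The sketch below computes column sums in 250-row blocks using only the `start` and `count` arguments described above. It assumes that `start` is 1-based and that a partial read of a matrix returns a matrix of `count` rows; neither is spelled out here, so check the package documentation before relying on it. The matrix is kept in memory only so the result can be verified.

```r
library(h5lite)

big_file <- tempfile(fileext = ".h5")
n_rows <- 1000
m <- matrix(rnorm(n_rows * 4), n_rows, 4)
h5_write(m, big_file, "large_matrix")

chunk <- 250                       # rows per read; divides n_rows evenly here
totals <- numeric(ncol(m))
for (first in seq(1, n_rows, by = chunk)) {
  # Assumption: start is a 1-based row index and count is a number of rows
  block <- h5_read(big_file, "large_matrix", start = first, count = chunk)
  totals <- totals + colSums(block)
}

# Compare against the in-memory answer
chunked_ok <- isTRUE(all.equal(totals, colSums(m)))
chunked_ok

unlink(big_file)
```

With a dataset whose row count is not a multiple of the block size, the last `count` would need to be shortened to the number of rows remaining.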
| Feature | h5lite | rhdf5 / hdf5r |
| :-------------------- | :------------------------ | :----------------------------------------------------- |
| Philosophy | "Opinionated" & Simple | Comprehensive Wrapper |
| API Style | Native R (read/write) | Low-level (Files, Dataspaces, Memspaces) |
| HDF5 Installation | Bundled (Zero-config) | System Requirement (Manual install often required) |
| Data Typing | Automatic (safe defaults) | Manual (user specified) |
| Partial I/O | Supported (Simplified) | Supported (Manual hyperslabs) |
| Learning Curve | Low (Minutes) | High (Days) |
Use rhdf5 or hdf5r if you need to:
… h5lite (e.g., bitfields, references, variable-length nested arrays).
… start/count parameters.
Use h5lite if you want to:
… start and count.
… h5_open() handle for a streamlined workflow.