In HDF5, attributes are small pieces of metadata attached to groups or datasets. They are best used to store descriptive information (units, timestamps, descriptions, or experimental parameters) separately from the main data array.
This vignette covers how to write, read, and manage these attributes using h5lite, as well as important limitations regarding their structure.
library(h5lite)
file <- tempfile(fileext = ".h5")
There are two ways to write attributes in h5lite: explicitly (targeting an object) or implicitly (saving R attributes).
You can write an attribute to any existing group or dataset using the attr argument in h5_write(). This is useful for adding metadata after the data has been saved.
# First, write a dataset
h5_write(1:10, file, "measurements/temperature")

# Now, attach attributes to it
h5_write(I("Celsius"), file, "measurements/temperature", attr = "units")
h5_write(I("2023-10-27"), file, "measurements/temperature", attr = "date")
h5_write(I(0.1), file, "measurements/temperature", attr = "precision")
Note: If the attribute already exists, it will be overwritten.
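The overwrite behaviour described in the note can be checked with a short sketch. It uses its own temporary file (`f`, a name introduced here) so the running example is left untouched, and the value "Kelvin" is purely illustrative.

```r
library(h5lite)

# Separate scratch file so the vignette's `file` is not modified
f <- tempfile(fileext = ".h5")
h5_write(1:10, f, "measurements/temperature")

# Write the same attribute twice with different values
h5_write(I("Celsius"), f, "measurements/temperature", attr = "units")
h5_write(I("Kelvin"), f, "measurements/temperature", attr = "units")

# Read the attribute back and list the attribute names
units_after <- h5_read(f, "measurements/temperature", attr = "units")
names_after <- h5_attr_names(f, "measurements/temperature")
print(units_after)
print(names_after)

unlink(f)
```

If the note holds, only the second value should come back, and "units" should appear once in the list of names.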
h5lite automatically preserves custom R attributes attached to your objects. When you write an R object, any attributes (except for standard internal ones like dim, names, or class) are written as HDF5 attributes.
# Create a vector with custom R attributes
data <- rnorm(5)
attr(data, "description") <- I("Randomized control group")
attr(data, "valid") <- I(TRUE)

# Write the object
h5_write(data, file, "experiment/control")

# Check the file - the attributes are there
h5_attr_names(file, "experiment/control")
h5_str(file)
If you only need a specific piece of metadata without reading the full dataset, you can use h5_read(..., attr = "name").
# Read just the 'units' attribute
units <- h5_read(file, "measurements/temperature", attr = "units")
print(units)
When you read a dataset, h5lite automatically reads all attached attributes and re-attaches them to the resulting R object.
# Read the full dataset
temps <- h5_read(file, "measurements/temperature")

# The attributes are available in R
attributes(temps)
str(temps)
Use h5_attr_names() to list the names of all attributes attached to a specific object.
h5_attr_names(file, "measurements/temperature")
You can remove a specific attribute using h5_delete().
# Delete the 'precision' attribute
h5_delete(file, "measurements/temperature", attr = "precision")

# Verify removal
h5_attr_names(file, "measurements/temperature")
While attributes are powerful for storing metadata, they are fundamentally simpler structures than HDF5 Datasets. HDF5 enforces specific constraints that affect how h5lite can store complex R objects as attributes.
HDF5 Dimension Scales (the mechanism h5lite uses to store names, dimnames, and row.names) can only be attached to Datasets. They cannot be attached to attributes.
This means if you write a named vector, matrix, or array as an attribute, the names will be lost.
# A vector with names
named_vec <- c(a = 1, b = 2, c = 3)

# Write as a standard Dataset -> Names are preserved
h5_write(named_vec, file, "my_dataset")
h5_names(file, "my_dataset")

# Write as an Attribute -> Names are LOST
h5_write(named_vec, file, "measurements/temperature", attr = "meta_vec")
h5_names(file, "measurements/temperature", attr = "meta_vec")
Exception: Data Frames
There is one major exception: data.frame objects.
Because HDF5 stores data frames as Compound Types, the column names are baked into the type definition itself, not stored as side-loaded metadata. Therefore, column names are preserved even when writing a data frame as an attribute. However, row.names (which rely on dimension scales) will still be lost.
# A data frame with metadata
df <- data.frame(
  id = 1:3,
  status = c("ok", "fail", "ok")
)

# Write as attribute
h5_write(df, file, "measurements/temperature", attr = "log")

# Column names survive!
h5_names(file, "measurements/temperature", attr = "log")
In HDF5, you cannot attach attributes to other attributes. This hierarchy is strictly one level deep: Groups/Datasets can have attributes, but attributes cannot.
Consequently, you cannot treat an attribute as a "Group" or folder to store other items. If you need a hierarchical structure for your metadata, you should create a Group (e.g., /metadata) and store your metadata as Datasets inside it, rather than attaching them as attributes to another object.
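As a rough illustration of that recommendation, the sketch below stores metadata as datasets under a `metadata` group instead of as attributes. The file `f`, the dataset names, and all values are hypothetical and chosen only for this example.

```r
library(h5lite)

# Separate scratch file for this sketch
f <- tempfile(fileext = ".h5")
h5_write(1:10, f, "measurements/temperature")

# Store metadata as datasets inside a 'metadata' group
h5_write(I("Lab 3"), f, "metadata/site")
h5_write(I("thermocouple"), f, "metadata/sensor_type")

# Written as a dataset, a named vector keeps its names
h5_write(c(offset = 0.5, gain = 1.02), f, "metadata/calibration")

# Inspect the layout and read one entry back
h5_str(f)
site <- h5_read(f, "metadata/site")
calib_names <- h5_names(f, "metadata/calibration")

unlink(f)
```

Because each entry is an ordinary dataset, it can carry its own names and attributes, which is not possible when the same information is stored as an attribute.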
Attributes in HDF5 are typed just like datasets. h5lite allows you to control the storage type of attributes using the as argument in h5_write() or h5_read().
To target an attribute specifically, prefix the name with @ in the as vector.
# Write the control data again, but use a fixed-length string for 'description'
h5_write(data, file, "experiment/control", as = c("@description" = "ascii[]"))

# Store an attribute as a `uint8` instead of the default `int32`
h5_write(I(42), file, "measurements/temperature", attr = "sensor_id", as = "uint8")
You can also coerce attributes when reading them.
# Force the 'valid' attribute to be read as logical, even if stored as integer
meta <- h5_read(file, "experiment/control", attr = "valid", as = "logical")
You might notice that standard R attributes like dim are not visible in h5_attr_names().
This is because h5lite handles structural attributes implicitly. The dimensions of the attribute data itself are stored in the HDF5 Dataspace, not as a separate attribute. h5lite automatically restores the dim attribute on the R object when reading, ensuring matrices and arrays retain their shape.
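A small sketch of this behaviour, assuming a matrix is written as an attribute: the attribute name `calibration` and the scratch file `f` are made up for the example.

```r
library(h5lite)

# Separate scratch file for this sketch
f <- tempfile(fileext = ".h5")
h5_write(1:10, f, "measurements/temperature")

# Attach a 2 x 3 matrix as an attribute
m <- matrix(1:6, nrow = 2)
h5_write(m, f, "measurements/temperature", attr = "calibration")

# 'dim' is not expected to show up as a separate HDF5 attribute
attr_names <- h5_attr_names(f, "measurements/temperature")
print(attr_names)

# The shape should be restored on read
m_back <- h5_read(f, "measurements/temperature", attr = "calibration")
print(dim(m_back))

unlink(f)
```

The shape travels with the attribute's dataspace, so the listing of attribute names should stay free of structural entries such as `dim`.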
unlink(file)