knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
Data frames are the workhorse of data analysis in R. In HDF5, data frames are stored as Compound Datasets. This allows different columns to have different data types (e.g., integer, float, string) within the same dataset, much like a SQL table.
This vignette explains how h5lite handles data frames, including row names, factors, and missing values.
library(h5lite)
file <- tempfile(fileext = ".h5")
Writing a data frame is as simple as writing any other object. h5lite automatically maps each column to its appropriate HDF5 type.
# Create a standard data frame
df <- data.frame(
  id = 1:5,
  group = c("A", "A", "B", "B", "C"),
  score = c(10.5, 9.2, 8.4, 7.1, 6.0),
  passed = c(TRUE, TRUE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Write to HDF5
h5_write(df, file, "study_data/results")

# Fetch the column names
h5_names(file, "study_data/results")

# Read back
df_in <- h5_read(file, "study_data/results")
head(df_in)
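Before writing, it can help to check which R storage type each column has, since these are the types h5lite has to map to HDF5 types. The sketch below uses only base R and rebuilds the same data frame so it runs on its own. It does not show h5lite's actual mapping, which this vignette does not spell out.

```r
# Rebuild the example data frame so this block is self-contained
df <- data.frame(
  id = 1:5,
  group = c("A", "A", "B", "B", "C"),
  score = c(10.5, 9.2, 8.4, 7.1, 6.0),
  passed = c(TRUE, TRUE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# typeof() reports R's internal storage type for each column
col_types <- vapply(df, typeof, character(1))
print(col_types)
```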
Use the as argument to control the storage type of specific columns. Pass a named character vector whose names are column names and whose values are the storage types to use.
This is particularly useful for optimizing storage (e.g., saving space by storing small integers as int8 or single characters as ascii[1]).
df_small <- data.frame(
  id = 1:10,
  code = rep("A", 10)
)

# Force 'id' to be uint16 and 'code' to be an ascii string
h5_write(df_small, file, "custom_df", as = c(id = "uint16", code = "ascii[]"))
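When choosing a narrower type, it is worth confirming that the data actually fits. The following is a minimal base R sketch. The helper fits_range is hypothetical and not part of h5lite, and the bounds are the standard ranges for 8-bit signed and 16-bit unsigned integers. How h5lite behaves when values fall outside the requested type is not covered here.

```r
df_small <- data.frame(
  id = 1:10,
  code = rep("A", 10),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not an h5lite function):
# TRUE if every non-missing value lies within [lo, hi]
fits_range <- function(x, lo, hi) {
  all(x >= lo & x <= hi, na.rm = TRUE)
}

fits_int8   <- fits_range(df_small$id, -128, 127)   # signed 8-bit range
fits_uint16 <- fits_range(df_small$id, 0, 65535)    # unsigned 16-bit range

# Longest string in the column, useful when picking a fixed string length
max_len <- max(nchar(df_small$code))
```

If a check like this returns FALSE, a wider type is the safer choice for that column.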
Standard HDF5 Compound Datasets do not have a concept of "row names". However, h5lite preserves them using Dimension Scales.
When you write a data frame with row names, h5lite creates a separate dataset (usually named _rownames) and links it to the main table. When reading, h5lite automatically restores these as the row.names of the data frame.
mtcars_subset <- head(mtcars, 3)

h5_write(mtcars_subset, file, "cars")

h5_str(file)

# Read back
result <- h5_read(file, "cars")
print(row.names(result))
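Not every data frame carries meaningful row names: by default R assigns automatic integer row names. The base R sketch below shows one way to tell the two cases apart. The helper has_explicit_rownames is hypothetical, and whether h5lite uses the same criterion when deciding to store row names is an assumption not confirmed by this vignette.

```r
df_auto  <- data.frame(id = 1:3)   # automatic row names 1, 2, 3
df_named <- head(mtcars, 3)        # row names are car model names

# Hypothetical helper (not an h5lite function).
# .row_names_info() returns a negative count when row names are automatic.
has_explicit_rownames <- function(x) {
  .row_names_info(x) > 0
}

print(has_explicit_rownames(df_auto))
print(has_explicit_rownames(df_named))
```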
unlink(file)