h5 provides an interface to the HDF5 API through S4 classes. HDF5 is a binary data format designed for flexible and efficient I/O on high-volume and complex data. An HDF5 file can be structured hierarchically, storing data sets in groups, much like the folder structure of a file system. The package supports fast storage and retrieval of R objects such as vectors, matrices and arrays in binary files in a language-independent format (data.frames are currently not supported). It can therefore be used as an alternative to R's save/load mechanism. Because h5 can access subsets of stored data, it can also handle data sets that do not fit into memory.
h5 can currently only handle homogeneous data sets consisting of a single data type such as numeric, integer, character or logical. The creation of metadata through attributes is also supported.
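Because each data set must be homogeneous and data.frames are not supported, one possible workaround is to store every column of a data.frame as its own data set. The following is only a sketch under assumptions: the group name 'df', the temporary file path, and the sample data are invented for illustration, and it relies solely on h5file, the subsetting operators and h5close as used in the package example.

```r
library(h5)

# Hypothetical sample data; each column has one homogeneous type
df <- data.frame(id = 1:3,
                 score = c(0.5, 1.5, 2.5),
                 label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Assumption: a temporary path is used so the sketch leaves no files behind
path <- tempfile(fileext = ".h5")
f <- h5file(path, "a")

# Store every column as a separate data set in the (hypothetical) group '/df'
for (col in names(df)) {
  f[paste0("df/", col)] <- df[[col]]
}

# Read the columns back one by one and rebuild the data.frame
restored <- lapply(names(df), function(col) {
  dset <- f[paste0("df/", col)]
  values <- as.vector(dset[])  # read the entire data set
  h5close(dset)
  values
})
names(restored) <- names(df)
restored <- as.data.frame(restored, stringsAsFactors = FALSE)

h5close(f)
```

Column names and order are kept in R here; storing them in the file as well (for example as an attribute) would be needed for a fully self-describing layout.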
The following objects are supported by h5 and represented through S4 classes:
H5File holds the pointer to the binary HDF5 file, which can include various DataSets in a hierarchical structure defined by H5Groups.
H5Group can hold various HDF5 objects such as DataSets and other H5Groups.
DataSet stores homogeneous data such as vectors, matrices and arrays.
Attribute stores metadata about other HDF5 objects such as H5Group, H5File and DataSet.
DataSpace objects define selections on specified DataSets.
These classes share common functionality through the following base classes:
CommonFG implements common functionality for H5File and H5Group to create and access sub-H5Groups and DataSets.
H5Location is the base class of H5File, H5Group and DataSet and implements functions for Attribute creation and retrieval.
The example below shows some typical use cases handling data with HDF5:
1. Create/Open an HDF5 file using H5File, specifying the file access mode.
2. Create/Open groups and DataSets either implicitly using subsetting operators or explicitly using S4 methods like createGroup/openGroup or createDataSet/openDataSet, see also CommonFG, CommonFG-Group and CommonFG-DataSet.
3. Create/Open metadata for HDF5 objects using e.g. h5attr, see also H5Location-Attribute and Attribute.
4. Retrieve data from DataSets either implicitly using subsetting operators or explicitly with readDataSet, which requires a DataSpace object to specify the selection area, see also DataSet, DataSet-Subset and DataSpace.
5. Extend a DataSet using predefined functions like c for one-dimensional vectors or rbind/cbind for two-dimensional DataSets, see also DataSet-Extend.
6. Close H5File, H5Group, DataSet or DataSpace objects using h5close.
# 1. Create/Open file 'test.h5' (mode set to 'a'ppend)
file <- h5file("test.h5", 'a')
# 2. Store character vector in group '/test' and dataset 'testvec'
file["test/testvec"] <- LETTERS[1:9]
# Store integer matrix in group '/test/testmat' and dataset 'testmat'
mat <- matrix(1:9, nrow = 3)
rownames(mat) <- LETTERS[1:3]
colnames(mat) <- c("A", "BE", "BUU")
file["test/testmat/testmat"] <- mat
# Store numeric array in group '/test' and dataset 'testarray'
file["test/testarray"] <- array(as.numeric(1:45), dim = c(3, 3, 5))
# 3. Store rownames and column names of matrix as attributes
# Get created data set as object
dset <- file["test/testmat/testmat"]
# Store rownames in attribute 'dimnames_1'
h5attr(dset, "dimnames_1") <- rownames(mat)
# Store columnnames in attribute 'dimnames_2'
h5attr(dset, "dimnames_2") <- colnames(mat)
# 4. Read first 3 elements of testvec
testvec <- file["test/testvec"]
testvec[1:3]
# Read first 2 rows of testmat
testmat <- file["test/testmat/testmat"]
res <- testmat[1:2, ]
# Attach row and column names from the attributes stored in step 3
rownames(res) <- h5attr(testmat, "dimnames_1")[1:2]
colnames(res) <- h5attr(testmat, "dimnames_2")
# 5. Extend testvec
testvec <- c(testvec, LETTERS[10:26])
# Retrieve entire testvec
testvec[]
# 6. Close open handles
h5close(testvec)
h5close(testmat)
h5close(file)
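Since the package is described as an alternative to save/load, it may help to see data being read back after the file has been closed. The sketch below is self-contained, so it first writes a small matrix and then reopens the file; the temporary path and the read-only mode 'r' are assumptions not stated in the text above, while the data set path and attribute names mirror the example.

```r
library(h5)

# Assumption: a temporary file stands in for 'test.h5'
path <- tempfile(fileext = ".h5")

# Write phase: store a matrix and keep its dimnames as attributes
f <- h5file(path, "a")
mat <- matrix(1:9, nrow = 3)
rownames(mat) <- LETTERS[1:3]
colnames(mat) <- c("A", "BE", "BUU")
f["test/testmat/testmat"] <- mat
dset <- f["test/testmat/testmat"]
h5attr(dset, "dimnames_1") <- rownames(mat)
h5attr(dset, "dimnames_2") <- colnames(mat)
h5close(dset)
h5close(f)

# Read phase: reopen the file (assumed mode 'r' for read-only access)
f <- h5file(path, "r")
dset <- f["test/testmat/testmat"]

# Only the first two rows are read from disk
res <- dset[1:2, ]
rownames(res) <- h5attr(dset, "dimnames_1")[1:2]
colnames(res) <- h5attr(dset, "dimnames_2")

h5close(dset)
h5close(f)
```

Reading the dimnames through h5attr, rather than attr, matters here because the names live in the HDF5 file and not on the R object.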