GuessDType: Guess an HDF5 Datatype

View source: R/zzz.R

GuessDType R Documentation

Guess an HDF5 Datatype

Description

Wrapper around hdf5r::guess_dtype that allows customization of string types rather than defaulting to variable-length ASCII-encoded strings. Also encodes logicals as H5T_INTEGER instead of H5T_LOGICAL to ensure cross-language compatibility (controlled via package options).

Usage

GuessDType(x, stype = "utf8", ...)

Arguments

x

The object for which to guess the HDF5 datatype

stype

Type of string encoding to use; choose from:

utf8

Variable-width, UTF-8

ascii7

Fixed-width (7 bits), ASCII

...

Arguments passed on to hdf5r::guess_dtype

ds_dim

Explicitly sets the dimension of the dataset object. For a scalar, this is one. Otherwise, it can be used to represent a multi-dimensional object so that some of its dimensions are in the dataset and some are inside an H5T_ARRAY.

scalar

Should the datatype be created so that x can be represented as a scalar with that datatype? This determines whether a vector/array should be represented as an H5T_ARRAY.

string_len

If the R object contains strings, the length to which the corresponding HDF5 string type should be set. If it is a positive integer, the string is of that length. If it is Inf, the string is variable-length. If it is set to "estimate", the length of the longest string in x is used.
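
As a rough illustration of the string_len argument described above, the sketch below calls hdf5r::guess_dtype directly rather than going through GuessDType. The vector x and the variable names are made up for this example, and it assumes the hdf5r package is installed. No output is shown because the exact printed form depends on the hdf5r version.

```r
library(hdf5r)

# Hypothetical character vector, used only for illustration
x <- c("g1", "gene2", "g3")

# string_len = Inf requests a variable-length string type
dt_vlen <- guess_dtype(x, string_len = Inf)

# A positive integer requests a fixed-length string type of that length
dt_fixed <- guess_dtype(x, string_len = 10)

# "estimate" bases the length on the longest string in x
dt_est <- guess_dtype(x, string_len = "estimate")

# Inspect the resulting H5T objects
dt_vlen$is_vlen()    # is this a variable-length string type?
dt_fixed$get_size()  # size of the fixed-length type
dt_est$get_size()    # size of the estimated type
```

Because GuessDType passes ... on to hdf5r::guess_dtype, the same argument can in principle be supplied through the wrapper, although the wrapper may adjust the string type according to stype afterwards.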

Value

An object of class H5T

See Also

guess_dtype, BoolToInt, StringType

Examples


# Characters can either be variable-width UTF8-encoded or
# fixed-width ASCII-encoded
SeuratDisk:::GuessDType(x = 'hello')
SeuratDisk:::GuessDType(x = 'hello', stype = 'ascii7')

# Data frames are a compound type; character columns follow the same rules
# as character vectors
df <- data.frame(x = c('g1', 'g2', 'g3'), y = c(1, 2, 3), stringsAsFactors = FALSE)
SeuratDisk:::GuessDType(x = df)
SeuratDisk:::GuessDType(x = df, stype = 'ascii7')

# Logicals are turned into integers to ensure compatibility with Python
# TRUE evaluates to 1, FALSE to 0, and NA to 2
SeuratDisk:::GuessDType(x = c(TRUE, FALSE, NA))



mojaveazure/seurat-disk documentation built on Nov. 5, 2023, 9:40 a.m.