qserialize: qserialize

qserialize    R Documentation

qserialize

Description

Saves an object to a raw vector.

Usage

qserialize(x, preset = "high",
  algorithm = "zstd", compress_level = 4L,
  shuffle_control = 15L, check_hash = TRUE)

Arguments

x

The object to serialize.

preset

One of "fast", "balanced", "high" (default), "archive", "uncompressed" or "custom". See section Presets for details.

algorithm

Ignored unless preset = "custom". Compression algorithm used: "lz4", "zstd", "lz4hc", "zstd_stream" or "uncompressed".

compress_level

Ignored unless preset = "custom". The compression level used.

For lz4, this number must be >= 1 (higher is less compressed).

For zstd, a number between -50 and 22 (higher is more compressed). Due to the format of qs, there is very little benefit to compression levels above 5 or so.

shuffle_control

Ignored unless preset = "custom". An integer setting the use of byte shuffle compression. A value between 0 and 15 (default 15). See section Byte shuffling for details.

check_hash

Default TRUE. Compute a hash during serialization, which can later be used to verify the integrity of the serialized data.

Details

This function serializes and compresses R objects using block compression with the option of byte shuffling.

Value

A raw vector.
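
A minimal round-trip sketch, assuming the qs package is installed and using its companion function qdeserialize to read the raw vector back. The example object x and the variable names are illustrative only.

```r
library(qs)

# Illustrative object: a list of plain vectors
x <- list(
  id    = 1:1000,
  value = seq(0, 1, length.out = 1000),
  label = rep(c("a", "b"), 500)
)

# Serialize with the default preset ("high"); the result is a raw vector
raw_x <- qserialize(x)
is.raw(raw_x)

# Restore the object from the raw vector and compare with the original
restored <- qdeserialize(raw_x)
identical(restored, x)
```

Because the result lives in memory rather than on disk, it can be passed to anything that accepts a raw vector, such as a connection or a database blob column.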

Presets

There are lots of possible parameters. To simplify usage, there are four main presets that are performant over a large variety of data:

  • "fast" is a shortcut for algorithm = "lz4", compress_level = 100 and shuffle_control = 0.

  • "balanced" is a shortcut for algorithm = "lz4", compress_level = 1 and shuffle_control = 15.

  • "high" is a shortcut for algorithm = "zstd", compress_level = 4 and shuffle_control = 15.

  • "archive" is a shortcut for algorithm = "zstd_stream", compress_level = 14 and shuffle_control = 15. (zstd_stream is currently single-threaded only)

To gain more control over compression level and byte shuffling, set preset = "custom"; only then do the individual parameters algorithm, compress_level and shuffle_control take effect.
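
The following sketch compares the serialized size under each named preset and then spells out the settings of "high" through preset = "custom". It assumes the qs package is installed; the test data and variable names are made up for illustration, and the resulting sizes depend on the data, so none are shown.

```r
library(qs)

set.seed(1)
# Illustrative data: one ordered integer vector and one random numeric vector
x <- list(ints = 1:100000, nums = rnorm(100000))

# Serialized size in bytes for each of the four main presets
presets <- c("fast", "balanced", "high", "archive")
sizes <- vapply(
  presets,
  function(p) length(qserialize(x, preset = p)),
  integer(1)
)
sizes

# The same settings as "high", given explicitly; algorithm, compress_level
# and shuffle_control are only used because preset = "custom"
custom <- qserialize(
  x,
  preset = "custom",
  algorithm = "zstd",
  compress_level = 4L,
  shuffle_control = 15L
)

# The custom-compressed vector restores to the original object
identical(qdeserialize(custom), x)
```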

Byte shuffling

The parameter shuffle_control defines which numerical R object types are subject to byte shuffling. Generally speaking, the more ordered/sequential an object is (e.g., 1:1e7), the larger the potential benefit of byte shuffling; it is not uncommon to improve compression ratio or compression speed by several orders of magnitude. The more random an object is (e.g., rnorm(1e7)), the smaller the potential benefit, and shuffling can even make results worse. Integer vectors almost always benefit from byte shuffling, whereas the results for numeric vectors are mixed.

To control byte shuffling, add +1 to the parameter for logical vectors, +2 for integer vectors, +4 for numeric vectors and/or +8 for complex vectors. For example, 0 disables shuffling, 6 (2 + 4) shuffles integer and numeric vectors only, and the default 15 (1 + 2 + 4 + 8) shuffles all four types.
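
The sketch below uses base R only. The helper shuffle_flags is hypothetical (it is not part of qs) and simply adds up the flag values listed above. The second part illustrates the general idea of byte shuffling by grouping the bytes of each integer by position before compressing with gzip; it is not the implementation used by qs, and byte_shuffle and byte_unshuffle are likewise illustrative names.

```r
# Hypothetical helper (not part of qs): build a shuffle_control value
# from the additive flags 1 (logical), 2 (integer), 4 (numeric), 8 (complex)
shuffle_flags <- function(logical = FALSE, integer = FALSE,
                          numeric = FALSE, complex = FALSE) {
  sum(c(1L, 2L, 4L, 8L)[c(logical, integer, numeric, complex)])
}

int_and_num <- shuffle_flags(integer = TRUE, numeric = TRUE)  # 2 + 4
all_types   <- shuffle_flags(TRUE, TRUE, TRUE, TRUE)          # 1 + 2 + 4 + 8

# Illustration only: group byte 1 of every element, then byte 2, and so on
byte_shuffle <- function(bytes, width) {
  as.vector(t(matrix(bytes, nrow = width)))
}

# Inverse of byte_shuffle: restore the original element-by-element layout
byte_unshuffle <- function(bytes, width) {
  as.vector(t(matrix(bytes, ncol = width)))
}

# An ordered integer vector, written out as 4 bytes per element
ordered_bytes  <- writeBin(1:100000, raw())
shuffled_bytes <- byte_shuffle(ordered_bytes, width = 4L)

# Compress both layouts with gzip and compare the sizes in bytes
plain_size    <- length(memCompress(ordered_bytes, type = "gzip"))
shuffled_size <- length(memCompress(shuffled_bytes, type = "gzip"))
c(plain = plain_size, shuffled = shuffled_size)
```

For ordered data, shuffling places long runs of similar bytes next to each other, which is what lets the compressor find more redundancy. Replacing 1:100000 with random values is a quick way to see the benefit shrink.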


qs documentation built on Oct. 2, 2024, 1:07 a.m.