qsave: qsave
In qs: Quick Serialization of R Objects

Description

Saves (serializes) an object to disk.

Usage

qsave(x, file,
preset = "high", algorithm = "zstd", compress_level = 4L,
shuffle_control = 15L, check_hash=TRUE, nthreads = 1)

Arguments

`x`	The object to serialize.
`file`	The file name/path.
`preset`	One of `"fast"`, `"balanced"`, `"high"` (default), `"archive"`, `"uncompressed"` or `"custom"`. See section Presets for details.
`algorithm`	Ignored unless `preset = "custom"`. Compression algorithm used: `"lz4"`, `"zstd"`, `"lz4hc"`, `"zstd_stream"` or `"uncompressed"`.
`compress_level`	Ignored unless `preset = "custom"`. The compression level used. For lz4, this number must be > 1 (higher is less compressed). For zstd, a number between `-50` and `22` (higher is more compressed). Due to the format of qs, there is very little benefit to compression levels above about 5.
`shuffle_control`	Ignored unless `preset = "custom"`. An integer setting the use of byte shuffle compression. A value between `0` and `15` (default `15`). See section Byte shuffling for details.
`check_hash`	Default `TRUE`. Whether to compute a hash during serialization that can be used to verify file integrity.
`nthreads`	Number of threads to use. Default `1`.

Details

This function serializes and compresses R objects using block compression with the option of byte shuffling.
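
To illustrate why byte shuffling can help a block compressor, the following base R sketch regroups the bytes of an integer vector and compares compressed sizes. It is an illustration only and does not reproduce the qs implementation: gzip (via `memCompress`) stands in for lz4/zstd, and the helpers `shuffle_bytes` and `unshuffle_bytes` are hypothetical names introduced here.

```r
# Illustration only: NOT the qs implementation. gzip (base R memCompress)
# stands in for lz4/zstd, and the helper names below are hypothetical.

# Regroup bytes so that byte 1 of every element comes first, then byte 2, etc.
shuffle_bytes <- function(bytes, width) {
  stopifnot(is.raw(bytes), length(bytes) %% width == 0)
  as.vector(t(matrix(bytes, nrow = width)))
}

# Inverse of shuffle_bytes(): restore the element-by-element layout.
unshuffle_bytes <- function(bytes, width) {
  stopifnot(is.raw(bytes), length(bytes) %% width == 0)
  as.vector(t(matrix(bytes, ncol = width)))
}

x <- 1:100000                                   # ordered integer data
plain <- writeBin(x, raw(), endian = "little")  # 4 bytes per integer
shuffled <- shuffle_bytes(plain, width = 4L)

# Compressed size in bytes, with and without shuffling
sizes <- c(
  plain    = length(memCompress(plain, type = "gzip")),
  shuffled = length(memCompress(shuffled, type = "gzip"))
)
print(sizes)

# The transformation is lossless: unshuffling restores the original bytes.
roundtrip_ok <- identical(unshuffle_bytes(shuffled, width = 4L), plain)
print(roundtrip_ok)
```

For ordered data such as this, the high-order bytes of neighbouring elements are nearly identical, so grouping them together gives the compressor long repetitive runs to work with.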

Value

The total number of bytes written to the file (returned invisibly).

Presets

Many parameter combinations are possible. To simplify usage, there are four main presets that perform well over a large variety of data:

"fast" is a shortcut for algorithm = "lz4", compress_level = 100 and shuffle_control = 0.
"balanced" is a shortcut for algorithm = "lz4", compress_level = 1 and shuffle_control = 15.
"high" is a shortcut for algorithm = "zstd", compress_level = 4 and shuffle_control = 15.
"archive" is a shortcut for algorithm = "zstd_stream", compress_level = 14 and shuffle_control = 15. (zstd_stream is currently single-threaded only)

To gain more control over compression level and byte shuffling, set preset = "custom". Only then do the individual parameters algorithm, compress_level and shuffle_control take effect; with any other preset they are ignored.
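
As a minimal sketch of the difference between a named preset and preset = "custom", the example below saves the same object twice: once with "balanced" and once with the custom parameters that the list above gives for that preset. The object and variable names are made up for illustration, and the code assumes the qs package is installed.

```r
library(qs)

# Example data (hypothetical): one ordered integer column, one random numeric column
x <- data.frame(id = 1:1e5, value = rnorm(1e5))

f_preset <- tempfile()
f_custom <- tempfile()

# Named preset: algorithm, compress_level and shuffle_control are ignored
n_preset <- qsave(x, f_preset, preset = "balanced")

# Custom preset: the individual parameters now take effect
n_custom <- qsave(x, f_custom, preset = "custom",
                  algorithm = "lz4", compress_level = 1L,
                  shuffle_control = 15L)

# Bytes written by each call (qsave returns this value invisibly)
print(c(balanced = n_preset, custom = n_custom))

# Either file reads back to the original object
roundtrip_ok <- identical(qread(f_custom), x)
print(roundtrip_ok)
```

Starting from a preset's documented parameters and changing one value at a time is a simple way to see how each setting affects file size on your own data.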

Byte shuffling

The parameter shuffle_control defines which numerical R object types are subject to byte shuffling. Generally speaking, the more ordered/sequential an object is (e.g., 1:1e7), the larger the potential benefit of byte shuffling. It is not uncommon to improve compression ratio or compression speed by several orders of magnitude. The more random an object is (e.g., rnorm(1e7)), the less potential benefit there is, and shuffling can even make compression worse. Integer vectors almost always benefit from byte shuffling, whereas the results for numeric vectors are mixed. To control byte shuffling, add +1 to the parameter for logical vectors, +2 for integer vectors, +4 for numeric vectors and/or +8 for complex vectors. For example, 3 (1 + 2) shuffles only logical and integer vectors, and the default 15 (1 + 2 + 4 + 8) shuffles all four types.
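
Since shuffle_control is a sum of per-type flags, it can be built and decoded with a little integer arithmetic. The base R helpers below are hypothetical conveniences, not part of qs; they only encode the +1/+2/+4/+8 rule described above.

```r
# Hypothetical helpers, not part of qs: they encode the flag values
# documented for shuffle_control.
shuffle_flags <- c(logical = 1L, integer = 2L, numeric = 4L, complex = 8L)

# Build a shuffle_control value from the vector types to be shuffled
make_shuffle_control <- function(types) {
  stopifnot(all(types %in% names(shuffle_flags)))
  sum(shuffle_flags[unique(types)])
}

# List the vector types that a given shuffle_control value enables
decode_shuffle_control <- function(value) {
  names(shuffle_flags)[bitwAnd(as.integer(value), shuffle_flags) != 0L]
}

make_shuffle_control(c("logical", "integer"))  # 3
make_shuffle_control(names(shuffle_flags))     # 15
decode_shuffle_control(6L)                     # "integer" "numeric"
```

The resulting value only has an effect when passed to qsave together with preset = "custom".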

Examples

library(qs)

# Round trip of a small data frame (starnames is a dataset shipped with qs)
x <- data.frame(int = sample(1e3, replace = TRUE),
                num = rnorm(1e3),
                char = sample(starnames$`IAU Name`, 1e3, replace = TRUE),
                stringsAsFactors = FALSE)
myfile <- tempfile()
qsave(x, myfile)
x2 <- qread(myfile)
identical(x, x2) # TRUE

# qs supports multithreading
qsave(x, myfile, nthreads = 2)
x2 <- qread(myfile, nthreads = 2)
identical(x, x2) # TRUE

# Other examples
# An ordered integer vector
z <- 1:1e7
myfile <- tempfile()
qsave(z, myfile)
z2 <- qread(myfile)
identical(z, z2) # TRUE

# A list of one million length-one numeric vectors
w <- as.list(rnorm(1e6))
myfile <- tempfile()
qsave(w, myfile)
w2 <- qread(myfile)
identical(w, w2) # TRUE

qs documentation built on April 4, 2025, 5:20 a.m.