R/compress.R

#' Compress and decompress with Blosc
#' 
#' Use the Blosc library to compress or decompress data.
#' @param x In case of `blosc_decompress()`, `x` should always be `raw` data
#' to be decompressed. Use `...` arguments to convert decompressed data
#' to a specific data type.
#' 
#' In case of `blosc_compress()`, `x` should either be `raw` data or a
#' `vector` of data to be compressed. In the latter case, you need to specify
#' `dtype` (see `r_to_dtype()`) in order to convert the data to `raw` information
#' first. See `vignette("blosc-compression")` for more details.
#' @param compressor The compression algorithm to be used. Can be any of
#' `"blosclz"`, `"lz4"`, `"lz4hc"`, `"zlib"`, or `"zstd"`.
#' @param level An `integer` indicating the required level of compression.
#' Needs to be between `0` (no compression) and `9` (maximum compression).
#' @param shuffle A shuffle filter to be activated before compression.
#' Should be one of `"noshuffle"`, `"shuffle"`, or `"bitshuffle"`.
#' @param typesize Blosc compresses arrays of structured data. This argument
#' specifies the size (`integer`) of the data structure / type in bytes. Must be
#' between `1` and `255`. The default is `4L` bytes (i.e. 32 bits), which is
#' suitable for compressing 32-bit integers.
#' @param ... Arguments passed to `r_to_dtype()` (by `blosc_compress()`) or
#' to `dtype_to_r()` (by `blosc_decompress()`).
#' @returns `blosc_compress()` returns a vector of compressed `raw` data.
#' `blosc_decompress()` returns a vector of decompressed `raw` data or, if
#' `dtype` is specified (see `dtype_to_r()`), a vector of the specified type.
#' @examples
#' my_dat        <- as.raw(sample.int(2L, 10L*1024L, replace = TRUE) - 1L)
#' my_dat_out    <- blosc_compress(my_dat, typesize = 1L)
#' my_dat_decomp <- blosc_decompress(my_dat_out)
#' 
#' ## After compressing and decompressing, the data is the same as the original:
#' all(my_dat == my_dat_decomp)
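#' 
#' ## A hedged sketch of a typed round trip. It assumes the `"<i2"` dtype used
#' ## in the `blosc_info()` example and that `dtype_to_r()` returns a plain
#' ## vector. `typesize` has to equal the byte size of `dtype` (2 for `"<i2"`):
#' vol_comp   <- blosc_compress(volcano, typesize = 2L, dtype = "<i2")
#' vol_decomp <- blosc_decompress(vol_comp, dtype = "<i2")
#' all(c(volcano) == vol_decomp)
#' 
#' ## Compressed sizes in bytes for a few compressors; results depend on the data:
#' sapply(c("blosclz", "lz4", "zstd"), function(cmp)
#'   length(blosc_compress(volcano, compressor = cmp, typesize = 2L,
#'                         dtype = "<i2")))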
#' @rdname blosc
#' @export
blosc_compress <- function(x, compressor = "blosclz", level = 7L,
                           shuffle = "noshuffle", typesize = 4L, ...) {
  
  typesize <- as.integer(typesize)
  if (is.na(typesize) || typesize < 1L || typesize > 255L)
    stop("Argument 'typesize' out of range (1-255)")
  
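  # Non-raw input is first converted to raw using the `dtype` passed via `...`;
  # the byte size of that dtype has to agree with `typesize`.
  if (!inherits(x, "raw")) {
    dt <- dtype_to_list_(...)
    if (dt$byte_size != typesize)
      stop("Specified `dtype` does not match the provided `typesize`")
    x <- r_to_dtype(x, ...)
  }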
  
  compressor_args <- c("blosclz", "lz4", "lz4hc", "zlib", "zstd")
  compressor <- match.arg(compressor, compressor_args)
  
  shuffle_args <- c("noshuffle", "shuffle", "bitshuffle")
  shuffle <- match.arg(shuffle, shuffle_args)
  # Convert the shuffle name to its zero-based code
  # (0 = noshuffle, 1 = shuffle, 2 = bitshuffle)
  shuffle <- match(shuffle, shuffle_args) - 1
  
  level <- as.integer(level)
  if (is.na(level) || level < 0L || level > 9L)
    stop("Compression level should be between 0 (no compression) and 9 (max compression)")
  
  blosc_compress_dat(x, compressor, level, shuffle, typesize)
}

#' @export
#' @rdname blosc
blosc_decompress <- function(x, ...) {
  
  result <- blosc_decompress_dat(x)
  # Convert the raw result to an R type only when `dtype` is supplied via `...`
  args <- list(x = result, ...)
  if ("dtype" %in% names(args))
    result <- do.call(dtype_to_r, args)
  return(result)
}

#' Information about compressed data
#' 
#' Obtain information about raw data compressed with Blosc.
#' @param x Raw data compressed with `blosc_compress()`.
#' @param ... Ignored.
#' @returns A named list with information about the Blosc-compressed
#' data `x`.
#' @examples
#' data_compressed <-
#'   blosc_compress(volcano, typesize = 2, dtype = "<i2", compressor = "lz4",
#'                  shuffle = "bitshuffle")
#' 
#' blosc_info(data_compressed)
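#' 
#' ## A small sketch using base R only: compare the compressed size in bytes
#' ## with the uncompressed size (2 bytes per element for `"<i2"`):
#' length(data_compressed)
#' 2L * length(volcano)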
#' @export
blosc_info <- function(x, ...) {
  blosc_info_(x)
}

blosc documentation built on Sept. 11, 2025, 5:11 p.m.