zlib: zlib

zlib R Documentation

zlib

Description

What My Package Offers

This package provides several key features:

Robustness:

Built to handle even corrupted or incomplete gzip data efficiently without causing system failures.

Demonstration:
  compressed_data <- memCompress(charToRaw(paste0(rep("This is an example string. It contains more than just 'hello, world!'", 1000), collapse = ", ")))
  decompressor <- zlib$decompressobj(zlib$MAX_WBITS)
  rawToChar(c(decompressor$decompress(compressed_data[1:300]), decompressor$flush()))  # Still working
  
Compliance:

Strict adherence to the GZIP File Format Specification, ensuring compatibility across systems.

Demonstration:
  compressor <- zlib$compressobj(zlib$Z_DEFAULT_COMPRESSION, zlib$DEFLATED, zlib$MAX_WBITS + 16)
  c(compressor$compress(charToRaw("Hello World")), compressor$flush())  # Correct 31 wbits (or custom wbits you provide)
  # [1] 1f 8b 08 00 00 00 00 00 00 03 f3 48 cd c9 c9 57 08 cf 2f ca 49 01 00 56 b1 17 4a 0b 00 00 00
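
The header and trailer of the bytes shown above can be checked by hand against the gzip format (RFC 1952). The sketch below uses only base R. The byte vector is copied from the output above, and le_to_num is a hypothetical helper introduced here for illustration.

```r
# Bytes copied from the gzip output shown above for "Hello World"
gz <- as.raw(c(
  0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03,
  0xf3, 0x48, 0xcd, 0xc9, 0xc9, 0x57, 0x08, 0xcf, 0x2f, 0xca, 0x49, 0x01, 0x00,
  0x56, 0xb1, 0x17, 0x4a, 0x0b, 0x00, 0x00, 0x00
))

# Hypothetical helper: interpret raw bytes as a little-endian unsigned number
le_to_num <- function(bytes) {
  sum(as.numeric(bytes) * 256^(seq_along(bytes) - 1))
}

n <- length(gz)
header <- list(
  magic  = gz[1:2],                 # ID1, ID2: must be 1f 8b
  method = as.integer(gz[3]),       # CM: 8 means deflate
  flags  = as.integer(gz[4]),       # FLG: 0 means no optional fields
  mtime  = le_to_num(gz[5:8]),      # MTIME: 0 means no timestamp stored
  os     = as.integer(gz[10]),      # OS: 3 means Unix
  isize  = le_to_num(gz[(n - 3):n]) # ISIZE: uncompressed length modulo 2^32
)
```

Here header$method equals the DEFLATED constant (8), and header$isize is 11, the number of bytes in "Hello World".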
  
Flexibility:

Ability to manage gzip streams from REST APIs without the need for temporary files or other workarounds.

Demonstration:
    # Byte-Range Request and decompression in chunks

    # Initialize the decompressor
    decompressor <- zlib$decompressobj(zlib$MAX_WBITS + 16)

    # Define the URL and initial byte ranges
    url <- "https://example.com/api/data.gz"
    range_start <- 0
    range_increment <- 5000  # Adjust based on desired chunk size

    # Placeholder for the decompressed content
    decompressed_content <- character(0)

    # Loop to make multiple requests and decompress chunk by chunk
    for (i in 1:5) {  # Adjust the loop count based on the number of chunks you want to retrieve
      range_end <- range_start + range_increment - 1  # HTTP byte ranges are inclusive at both ends

      # Make a byte-range request
      response <- httr::GET(url, httr::add_headers(`Range` = paste0("bytes=", range_start, "-", range_end)))

      # Check if the request was successful
      if (httr::http_type(response) != "application/octet-stream" || httr::http_status(response)$category != "Success") {
        stop("Failed to retrieve data.")
      }

      # Decompress the received chunk
      compressed_data <- httr::content(response, "raw")
      decompressed_chunk <- decompressor$decompress(compressed_data)
      decompressed_content <- c(decompressed_content, rawToChar(decompressed_chunk))

      # Update the byte range for the next request
      range_start <- range_end + 1
    }

    # Flush the decompressor after all chunks have been processed
    final_data <- decompressor$flush()
    decompressed_content <- c(decompressed_content, rawToChar(final_data))
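
HTTP byte ranges are inclusive at both ends, so the Range values for consecutive chunks are easy to get off by one. As a minimal base-R sketch, the hypothetical helper byte_ranges below precomputes them. It assumes the total size is already known (for example from a Content-Length header), which the demonstration above does not show.

```r
# Hypothetical helper: build inclusive "bytes=start-end" Range header values
byte_ranges <- function(total_size, chunk_size) {
  starts <- seq(0, total_size - 1, by = chunk_size)
  ends <- pmin(starts + chunk_size - 1, total_size - 1)
  # sprintf with %.0f avoids scientific notation such as "1e+05"
  sprintf("bytes=%.0f-%.0f", starts, ends)
}

ranges <- byte_ranges(total_size = 12000, chunk_size = 5000)
print(ranges)
```

Each element can be passed as the value of the Range header. The final range is clipped to the last byte, so no request reaches past the end of the resource.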
  

In summary, while R’s built-in methods could someday catch up in functionality, the zlib package for now fills an important gap by providing a more robust and flexible way to handle compression and decompression tasks.

Usage

.onLoad(libname, pkgname)

Details

The following 'zlib' environment is generated by the .onLoad behavior for R packages.

The .onLoad function is called automatically when the package is loaded with library() or require(). It initializes an environment that can be reached from anywhere and is unique (i.e., cannot be overwritten), and it defines a variety of constants and methods related to the zlib compression library.

Specifically, the function assigns a new environment named "zlib" containing constants such as DEFLATED, DEF_BUF_SIZE, MAX_WBITS, and various flush and compression strategies like Z_FINISH, Z_BEST_COMPRESSION, etc.
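
A locked environment is one way to hold constants that cannot be overwritten. The following base-R sketch illustrates that idea only; it is not the package's actual .onLoad code, and demo_env is a stand-in name chosen for illustration.

```r
# Hypothetical stand-in for an environment of constants created at load time
demo_env <- new.env()
assign("MAX_WBITS", 15L, envir = demo_env)
assign("DEFLATED", 8L, envir = demo_env)

# Lock the environment and its bindings so the values cannot be changed
lockEnvironment(demo_env, bindings = TRUE)

# Attempting to overwrite a locked binding raises an error
result <- tryCatch({
  assign("MAX_WBITS", 99L, envir = demo_env)
  "overwritten"
}, error = function(e) "locked")

print(result)
print(demo_env$MAX_WBITS)
```

With the bindings locked, the assignment fails and the original value is still readable through the `$` operator, the same access pattern as zlib$MAX_WBITS.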

Value

No return value; called for its side effect of creating an environment that contains the zlib constants and methods when the package is loaded.

Methods

  • compressobj(...): Create a compression object.

  • decompressobj(...): Create a decompression object.

  • compress(data, ...): Compress data in a single step.

  • decompress(data, ...): Decompress data in a single step.

Constants

  • DEFLATED: The compression method, set to 8.

  • DEF_BUF_SIZE: The default buffer size, set to 16384.

  • DEF_MEM_LEVEL: Default memory level, set to 8.

  • MAX_WBITS: Maximum size of the history buffer, set to 15.

  • Z_BEST_COMPRESSION: Best compression level, set to 9.

  • Z_BEST_SPEED: Best speed for compression, set to 1.

  • Z_BLOCK: Block flush mode, set to 5.

  • Z_DEFAULT_COMPRESSION: Default compression level, set to -1.

  • Z_DEFAULT_STRATEGY: Default compression strategy, set to 0.

  • Z_FILTERED: Filtered compression strategy, set to 1.

  • Z_FINISH: Finish flush mode, set to 4.

  • Z_FULL_FLUSH: Full flush mode, set to 3.

  • Z_HUFFMAN_ONLY: Huffman-only compression strategy, set to 2.

  • Z_NO_COMPRESSION: No compression, set to 0.

  • Z_NO_FLUSH: No flush mode, set to 0.

  • Z_PARTIAL_FLUSH: Partial flush mode, set to 1.

  • Z_RLE: Run-length encoding compression strategy, set to 3.

  • Z_SYNC_FLUSH: Synchronized flush mode, set to 2.

  • Z_TREES: Tree block flush mode, set to 6.
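
The examples on this page pass MAX_WBITS for zlib-wrapped data and MAX_WBITS + 16 for gzip. As a rough guide to how the underlying zlib library interprets the window-bits argument, the hypothetical helper wbits_format below encodes the ranges given in the zlib manual. Whether this package accepts every one of these values is an assumption, not something stated on this page.

```r
# Hypothetical helper: describe the stream format implied by a wbits value,
# following the ranges in the zlib manual
wbits_format <- function(wbits) {
  if (wbits >= -15 && wbits <= -8) {
    "raw deflate (no header or trailer)"
  } else if (wbits >= 8 && wbits <= 15) {
    "zlib header and trailer"
  } else if (wbits >= 24 && wbits <= 31) {
    "gzip header and trailer"
  } else if (wbits >= 40 && wbits <= 47) {
    "automatic zlib/gzip detection (decompression only)"
  } else {
    "not a recognised wbits value"
  }
}

MAX_WBITS <- 15  # value taken from the constants list above
formats <- vapply(
  c(MAX_WBITS, MAX_WBITS + 16, -MAX_WBITS),
  wbits_format,
  character(1)
)
print(formats)
```

The three values printed correspond to the zlib, gzip, and raw deflate formats respectively.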

See Also

publicEval() for the method used to set up the public environment.

zlib_constants() for the method used to set up the constants in the environment. https://www.zlib.net/manual.html#Constants

Examples

# Load the package
library(zlib)
# Create a temporary file
temp_file <- tempfile(fileext = ".txt")

# Generate example data and write to the temp file
example_data <- "This is an example string. It contains more than just 'hello, world!'"
writeBin(charToRaw(example_data), temp_file)

# Read data from the temp file into a raw vector
file_con <- file(temp_file, "rb")
raw_data <- readBin(file_con, "raw", file.info(temp_file)$size)
close(file_con)
# Create a compressor object for gzip
compressor <- zlib$compressobj(zlib$Z_DEFAULT_COMPRESSION, zlib$DEFLATED, zlib$MAX_WBITS + 16)

# Initialize variables for chunked compression
chunk_size <- 1024
compressed_data <- raw(0)

# Compress the data in chunks
for (i in seq(1, length(raw_data), by = chunk_size)) {
   chunk <- raw_data[i:min(i + chunk_size - 1, length(raw_data))]
   compressed_chunk <- compressor$compress(chunk)
   compressed_data <- c(compressed_data, compressed_chunk)
}

# Flush the compressor buffer
compressed_data <- c(compressed_data, compressor$flush())


# Create a Decompressor object for gzip
decompressor <- zlib$decompressobj(zlib$MAX_WBITS + 16)

# Initialize variable for decompressed data
decompressed_data <- raw(0)

# Decompress the data in chunks
for (i in seq(1, length(compressed_data), by = chunk_size)) {
  chunk <- compressed_data[i:min(i + chunk_size - 1, length(compressed_data))]
  decompressed_chunk <- decompressor$decompress(chunk)
  decompressed_data <- c(decompressed_data, decompressed_chunk)
}

# Flush the decompressor buffer
decompressed_data <- c(decompressed_data, decompressor$flush())

# Compress / decompress data in a single step

original_data <- charToRaw("some data")
compressed_data <- zlib$compress(original_data,
                                 zlib$Z_DEFAULT_COMPRESSION,
                                 zlib$DEFLATED,
                                 zlib$MAX_WBITS + 16)
decompressed_data <- zlib$decompress(compressed_data, zlib$MAX_WBITS + 16)


zlib documentation built on Oct. 19, 2023, 1:13 a.m.
