seqRecompress: Recompress the GDS file

seqRecompress    R Documentation

Recompress the GDS file

Description

Recompress the SeqArray GDS file.

Usage

seqRecompress(gds.fn, compress=c("ZIP", "LZ4", "LZMA", "Ultra", "UltraMax", "none"),
    exclude=character(), optimize=TRUE, verbose=TRUE)

Arguments

gds.fn

the file name of the SeqArray GDS file

compress

the compression method, compress="ZIP" by default

exclude

a character vector of GDS node names to be excluded from recompression; see Details

optimize

if TRUE, optimize the access efficiency by calling cleanup.gds

verbose

if TRUE, show information

Details

This function requires gdsfmt (>= v1.17.2). seqVCF2GDS() usually uses a lot of memory when the compression method "LZMA_RA.max", "Ultra" or "UltraMax" is specified. Users can therefore call seqVCF2GDS(, storage.option="ZIP_RA") first, and then recompress the GDS file with a higher compression option, e.g., "UltraMax". seqRecompress() uses much less memory than seqVCF2GDS(), since it recompresses the data one GDS node at a time.
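The two-step workflow described above could look roughly like the following sketch. It assumes the example VCF file shipped with SeqArray and uses a hypothetical temporary output name; the file sizes are recorded only for a before/after comparison, and no particular compression ratio is implied.

```r
library(SeqArray)

# Example VCF file bundled with SeqArray
vcf.fn <- seqExampleFileName("vcf")
# Hypothetical output path, used for illustration only
out.fn <- file.path(tempdir(), "recompress_demo.gds")

# Step 1: convert with the lighter "ZIP_RA" storage option
seqVCF2GDS(vcf.fn, out.fn, storage.option="ZIP_RA", verbose=FALSE)
size.before <- file.size(out.fn)

# Step 2: recompress the existing GDS file with a stronger method
seqRecompress(out.fn, compress="UltraMax", verbose=FALSE)
size.after <- file.size(out.fn)

# Compare the file sizes in bytes
print(c(before=size.before, after=size.after))
```

For real data, the conversion in step 1 is where the memory saving matters; the example file is too small to show it.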

"UltraMax" is not necessarily better than "Ultra". Its behavior is similar to xz -9 --extreme: it uses a slower variant of the selected compression preset level (-9) in the hope of a slightly better compression ratio, but with bad luck the result can be worse.

ls.gdsn(gdsfile, include.hidden=TRUE, recursive=TRUE) returns the names of the GDS nodes to be recompressed; users can specify the nodes to skip in the argument exclude.
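As a sketch of how the node listing and exclude might be combined, the code below lists the nodes of a copy of the SeqArray example file and skips one of them. It assumes that entries in exclude are matched against the node paths returned by ls.gdsn(), and that the example file contains a node named "genotype/data"; the temporary file name is hypothetical.

```r
library(gdsfmt)
library(SeqArray)

# Work on a copy so the packaged example file is left untouched
demo.fn <- file.path(tempdir(), "exclude_demo.gds")
file.copy(seqExampleFileName("gds"), demo.fn, overwrite=TRUE)

# List all node paths, including hidden ones
f <- seqOpen(demo.fn)
nodes <- ls.gdsn(f, include.hidden=TRUE, recursive=TRUE)
seqClose(f)
print(nodes)

# Skip the genotype data if present (assumed node path)
skip <- intersect(nodes, "genotype/data")

# Recompress everything except the nodes named in 'skip'
seqRecompress(demo.fn, compress="LZMA", exclude=skip, verbose=FALSE)
```

The file is closed before seqRecompress() is called because the function takes a file name rather than an open GDS object.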

Value

None.

Author(s)

Xiuwen Zheng

See Also

seqVCF2GDS, seqStorageOption

Examples

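library(SeqArray)

# copy the example GDS file so that the original is not modified
gds.fn <- seqExampleFileName("gds")
file.copy(gds.fn, "tmp.gds", overwrite=TRUE)

# recompress the copy with LZMA
seqRecompress("tmp.gds", "LZMA")

# delete the temporary file
unlink("tmp.gds")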

zhengxwen/SeqArray documentation built on Jan. 10, 2025, 9:09 p.m.