seqMerge | R Documentation |
Merges multiple SeqArray GDS files.
seqMerge(gds.fn, out.fn, storage.option="LZMA_RA", info.var=NULL, fmt.var=NULL,
samp.var=NULL, optimize=TRUE, digest=TRUE, geno.pad=TRUE, verbose=TRUE)
gds.fn |
the file names of multiple GDS files |
out.fn |
the output file name |
storage.option |
specify the storage and compression option,
"ZIP_RA" ( |
info.var |
characters, the variable name(s) in the INFO field;
|
fmt.var |
characters, the variable name(s) in the FORMAT field;
|
samp.var |
characters, the variable name(s) in 'sample.annotation';
or |
optimize |
if |
digest |
a logical value (TRUE/FALSE) or a character ("md5", "sha1", "sha256", "sha384" or "sha512"); add md5 hash codes to the GDS file if TRUE or a digest algorithm is specified |
geno.pad |
if TRUE, pad a 2-bit genotype array to whole bytes to avoid recompressing genotypes where possible |
verbose |
if |
The function merges multiple SeqArray GDS files. Users can specify the compression method and level for the new GDS file. If gds.fn contains one file, users can change the storage type to create a new file.
WARNING: the functionality of seqMerge() is limited.
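As a minimal sketch of the single-file case described above, the following re-stores the example GDS file shipped with SeqArray under a different storage option. The output name "recompressed.gds" is a placeholder chosen for illustration, and the block is skipped when the SeqArray package is not installed.

```r
# Sketch only: runs when SeqArray is available
if (requireNamespace("SeqArray", quietly = TRUE)) {
    library(SeqArray)

    # example GDS file shipped with the package
    gds.fn <- seqExampleFileName("gds")

    # a single input file: write a new file with the requested storage option
    # ("recompressed.gds" is a placeholder name)
    out.path <- seqMerge(gds.fn, "recompressed.gds", storage.option = "ZIP_RA")

    # seqMerge() returns the output file name with an absolute path
    print(out.path)
    seqSummary(out.path)

    # delete the temporary file
    unlink("recompressed.gds", force = TRUE)
}
```

Capturing the return value is optional; it is kept here only so the absolute path can be passed on to seqSummary().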
Returns the name of the output GDS file with an absolute path.
Xiuwen Zheng
seqVCF2GDS, seqExport
# the VCF file
vcf.fn <- seqExampleFileName("vcf")
# the number of variants
total.count <- seqVCF_Header(vcf.fn, getnum=TRUE)$num.variant
split.cnt <- 5
start <- integer(split.cnt)
count <- integer(split.cnt)
s <- (total.count+1) / split.cnt
st <- 1L
for (i in 1:split.cnt)
{
    z <- round(s * i)
    start[i] <- st
    count[i] <- z - st
    st <- z
}
fn <- paste0("tmp", 1:split.cnt, ".gds")
# convert to 5 gds files
for (i in 1:split.cnt)
{
    seqVCF2GDS(vcf.fn, fn[i], storage.option="ZIP_RA",
        start=start[i], count=count[i])
}
# merge different variants
seqMerge(fn, "tmp.gds", storage.option="ZIP_RA")
seqSummary("tmp.gds")
#### merging different samples ####
# the example GDS file
gds.fn <- seqExampleFileName("gds")
file.copy(gds.fn, "test.gds", overwrite=TRUE)
# modify 'sample.id' so that the copy holds a different set of samples
seqAddValue("test.gds", "sample.id", paste0("S", 1:90), replace=TRUE)
# merging
seqMerge(c(gds.fn, "test.gds"), "output.gds", storage.option="ZIP_RA")
# delete the temporary files
unlink(c("tmp.gds", "test.gds", "output.gds"), force=TRUE)
unlink(fn, force=TRUE)
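
The start/count bookkeeping in the first example can also be factored into a small helper. The function below is a sketch in base R; split_ranges is a hypothetical name and not part of SeqArray, and the total of 1000 is an arbitrary illustrative value rather than the variant count of the example file. It follows the same rounding rule as the loop in the example and assumes the number of chunks does not exceed the number of variants.

```r
# hypothetical helper (not part of SeqArray):
# split 1..total into n contiguous chunks described by start and count
split_ranges <- function(total, n)
{
    # chunk boundaries: 1, round(s), round(2*s), ..., total + 1
    s <- (total + 1) / n
    bounds <- c(1L, as.integer(round(s * seq_len(n))))
    # each chunk starts at one boundary and ends just before the next
    data.frame(start = bounds[-length(bounds)], count = diff(bounds))
}

# arbitrary illustrative total, split into 5 chunks
r <- split_ranges(1000L, 5L)
print(r)
```

Each row of the result could then be passed to seqVCF2GDS() as start=r$start[i] and count=r$count[i], mirroring the loop in the example.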