Converts a SeqArray GDS file to a Variant Call Format (VCF) file.
seqGDS2VCF(gdsfile, vcf.fn, info.var=NULL, fmt.var=NULL, chr_prefix="",
    use_Rsamtools=TRUE, verbose=TRUE)
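gdsfile | a SeqVarGDSClass object, i.e., an open SeqArray GDS file such as the one returned by seqOpen
vcf.fn | the name of the output VCF file, or a connection object
info.var | a character vector of variable names in the INFO field, or NULL for using all variables; character() for no INFO variables
fmt.var | a character vector of variable names in the FORMAT field, or NULL for using all variables; character() for no FORMAT variables
chr_prefix | the prefix of chromosome, e.g., "chr"; no prefix by default
use_Rsamtools | if TRUE and the Rsamtools package is installed, compressed output is written in the bgzf format
verbose | if TRUE, show information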
seqSetFilter can be used to define a subset of data for the export.
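As a minimal sketch of exporting a subset, the code below sets a filter on both samples and variants before the conversion. The choice of the first 3 samples and first 100 variants, the output path under tempdir(), and the names out.fn, txt and n.rec are illustrative assumptions, not part of the original page.

```r
library(SeqArray)

# open the example GDS file shipped with SeqArray
f <- seqOpen(seqExampleFileName("gds"))

# select the first 3 samples and the first 100 variants (arbitrary choices)
samp.id <- seqGetData(f, "sample.id")
variant.id <- seqGetData(f, "variant.id")
seqSetFilter(f, sample.id=samp.id[1:3], variant.id=variant.id[1:100],
    verbose=FALSE)

# only the selected subset is exported; a plain ".vcf" name gives text output
out.fn <- seqGDS2VCF(f, file.path(tempdir(), "subset.vcf"), verbose=FALSE)

# count the data records, i.e., the lines that are not header lines
txt <- readLines(out.fn)
n.rec <- sum(!startsWith(txt, "#"))

# clean up
seqClose(f)
unlink(out.fn)
```

The filter stays active on the open file until seqResetFilter is called, so later exports from the same object are restricted in the same way.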
If the file name extension is "gz", the output is compressed with gzip. When the Rsamtools package is installed and use_Rsamtools=TRUE, the exported file uses the bgzf format (bgzip, a variant of the gzip format), which allows fast indexing.
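The following sketch illustrates one way to check which format was written and to build a tabix index when the output is bgzf. It assumes the standard bgzf block header (gzip magic bytes followed by an extra subfield tagged "BC" at bytes 13-14) and that the exported records are sorted by chromosome and position, which tabix requires. The names hdr, is.gzip, is.bgzf and idx.fn are hypothetical, and Rsamtools::indexTabix is used here only as an example of an indexing tool.

```r
library(SeqArray)

# export all samples and variants to a compressed VCF file
f <- seqOpen(seqExampleFileName("gds"))
out.fn <- seqGDS2VCF(f, file.path(tempdir(), "all.vcf.gz"),
    use_Rsamtools=TRUE, verbose=FALSE)
seqClose(f)

# read the first 14 bytes of the output file
hdr <- readBin(out.fn, what="raw", n=14)
# every gzip stream starts with the magic bytes 1f 8b
is.gzip <- identical(hdr[1:2], as.raw(c(0x1f, 0x8b)))
# bgzf blocks additionally carry a "BC" subfield at bytes 13-14
is.bgzf <- is.gzip && identical(hdr[13:14], charToRaw("BC"))

# build a tabix index only when the file is bgzf and Rsamtools is available
if (is.bgzf && requireNamespace("Rsamtools", quietly=TRUE)) {
    idx.fn <- Rsamtools::indexTabix(out.fn, format="vcf")
    unlink(idx.fn)
}
unlink(out.fn)
```

If is.bgzf is FALSE, the file is ordinary gzip output and would need to be recompressed with bgzip before it can be indexed.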
Returns the file name of the VCF file, as an absolute path.
Xiuwen Zheng
Danecek, P., Auton, A., Abecasis, G., Albers, C.A., Banks, E., DePristo, M.A., Handsaker, R.E., Lunter, G., Marth, G.T., Sherry, S.T., et al. (2011). The variant call format and VCFtools. Bioinformatics 27, 2156-2158.
# the GDS file
(gds.fn <- seqExampleFileName("gds"))
# display
(f <- seqOpen(gds.fn))
# output the first 5 samples
samp.id <- seqGetData(f, "sample.id")
seqSetFilter(f, sample.id=samp.id[1:5])
# convert
seqGDS2VCF(f, "tmp.vcf.gz")
# no INFO and FORMAT
seqGDS2VCF(f, "tmp1.vcf.gz", info.var=character(), fmt.var=character())
# output BN,GP,AA,DP,HM2 in INFO (the variables are in this order), no FORMAT
seqGDS2VCF(f, "tmp2.vcf.gz", info.var=c("BN","GP","AA","DP","HM2"),
fmt.var=character())
# read
(txt <- readLines("tmp.vcf.gz", n=20))
(txt <- readLines("tmp1.vcf.gz", n=20))
(txt <- readLines("tmp2.vcf.gz", n=20))
#########################################################################
# Users can compare the new VCF file with the original VCF file,
# e.g., by calling "diff" in Unix (a command line tool comparing files line by line)
# using all samples and variants
seqResetFilter(f)
# convert
seqGDS2VCF(f, "tmp.vcf.gz")
# file.copy(seqExampleFileName("vcf"), "old.vcf.gz", overwrite=TRUE)
# system("diff <(gunzip -c old.vcf.gz) <(gunzip -c tmp.vcf.gz)")
# 1a2,3
# > ##fileDate=20130309
# > ##source=SeqArray_RPackage_v1.0
# LOOKS GOOD! Only two header lines differ.
# delete temporary files
unlink(c("tmp.vcf.gz", "tmp1.vcf.gz", "tmp2.vcf.gz"))
# close the GDS file
seqClose(f)