View source: R/filter_preprocessbigVCF.R
| filterLargeVCF | R Documentation | 
Filter/extract one or multiple gene(s)/range(s) from a large
*.vcf/*.vcf.gz file.
filterLargeVCF(VCFin = VCFin, VCFout = VCFout,
                Chr = Chr,
                POS = NULL,
                start = start,
                end = end,
                override = TRUE)
VCFin | 
 Path of input   | 
VCFout | 
 Path(s) of output   | 
Chr | 
 a single CHROM name or a vector of CHROM names.  | 
POS, start, end | 
 the range(s) to be extracted from the original VCF.
  | 
override | 
 whether to overwrite an existing output file, default is TRUE  | 
This package imports VCF files with 'vcfR', which is efficient for
importing and manipulating VCF files in 'R'. However, importing a large VCF file
is time- and memory-consuming. It is suggested to first filter/extract the
variants in the target range with filterLargeVCF().
When filtering/extracting multiple genes/ranges, the parameters Chr and POS
must have equal length. Results are saved to a single file if the user
provides a single file path, or to multiple VCF files when a vector of file
paths of the same length is provided.
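The length rule above can be checked before starting a long-running extraction. The following is only a minimal sketch: `check_ranges()` is a hypothetical helper, not part of geneHapR, and it assumes that each range in POS is given as `c(start, end)`, as in the examples on this page. What filterLargeVCF() itself validates is not shown here.

```r
# Hypothetical helper (not part of geneHapR): checks that the arguments
# describe the same number of ranges before calling filterLargeVCF().
check_ranges <- function(Chr, POS, VCFout) {
  # A single range may be given as a plain numeric vector; wrap it in a list
  if (!is.list(POS)) POS <- list(POS)
  if (length(Chr) != length(POS))
    stop("Chr and POS must have equal length")
  # Assumption: every range consists of exactly two positions (start, end)
  if (!all(lengths(POS) == 2))
    stop("each element of POS must be c(start, end)")
  # Either one output path for all ranges, or one path per range
  if (!(length(VCFout) %in% c(1, length(Chr))))
    stop("VCFout must have length 1 or the same length as Chr")
  invisible(TRUE)
}

# Three ranges, three output files: passes silently
check_ranges(Chr = rep("scaffold_1", 3),
             POS = list(c(4300, 5000), c(5000, 6000), c(5000, 7000)),
             VCFout = c("filtered1.vcf.gz",
                        "filtered2.vcf.gz",
                        "filtered3.vcf.gz"))
```

Failing early on mismatched lengths avoids discovering the problem only after part of a large file has already been read.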
However, if you have hundreds of genes/ranges to extract from very large VCF file(s), it is preferable to process them with other Linux tools in a script on a server, such as 'vcftools' and 'bcftools'.
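One way to hand many ranges over to such a tool is to write them to a tab-delimited regions file from R. The sketch below is illustrative only: the file names `regions.tsv`, `input.vcf.gz` and `filtered.vcf.gz` are placeholders, and the shell command in the comment assumes that 'bcftools' is installed and that the input VCF is bgzip-compressed and indexed.

```r
# Ranges to extract, one row per range (values taken from the examples
# on this page; replace with your own genes/ranges)
regions <- data.frame(
  chrom = rep("scaffold_1", 3),
  start = c(4300L, 5000L, 5000L),
  end   = c(5000L, 6000L, 7000L)
)

# Write CHROM, START, END as tab-separated columns without a header
regions_file <- file.path(tempdir(), "regions.tsv")
write.table(regions, file = regions_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Then, in a shell on the server (placeholder file names):
#   bcftools view -R regions.tsv -O z -o filtered.vcf.gz input.vcf.gz
```

Note that this writes all ranges to one output file; producing one file per range would need one call per range in the script.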
No return value
 # Small VCF files should be filtered with `filter_vcf()`.
 # A mini VCF is used here only as an example and for testing.
 vcfPath <- system.file("extdata", "var.vcf.gz", package = "geneHapR")
 oldDir <- getwd()
 temp_dir <- tempdir()
 if(! dir.exists(temp_dir))
   dir.create(temp_dir)
 setwd(temp_dir)
 # extract a single gene/range from large vcf
 filterLargeVCF(VCFin = vcfPath, VCFout = "filtered.vcf.gz",
                Chr = "scaffold_1", POS = c(4300,5000), override = TRUE)
 # extract multi genes/ranges from large vcf
 filterLargeVCF(VCFin = vcfPath,
                VCFout = c("filtered1.vcf.gz",
                           "filtered2.vcf.gz",
                           "filtered3.vcf.gz"),
                Chr = rep("scaffold_1", 3),
                POS = list(c(4300, 5000),
                           c(5000, 6000),
                           c(5000, 7000)),
                override = TRUE)
setwd(oldDir)