filterVariantCalls: Filter variant calls

filterVariantCalls R Documentation

Filter variant calls

Description

Filter variant calls from a VCF file according to several criteria: bi-allelic status, single-nucleotide variant status, number or percentage of missing genotypes, variant-level depth, alternate allele frequency, and sample-level depth and genotype quality.

Usage

filterVariantCalls(
  vcf.file,
  genome = "",
  out.file,
  yieldSize = NA_integer_,
  dict.file = NULL,
  seq.id = NULL,
  seq.start = NULL,
  seq.end = NULL,
  variants.tokeep = NULL,
  is.snv = NULL,
  is.biall = NULL,
  min.var.dp = NULL,
  max.var.dp = NULL,
  min.alt.af = NULL,
  max.alt.af = NULL,
  min.spl.dp = NULL,
  min.perc.spl.dp = NULL,
  min.spl.gq = NULL,
  min.perc.spl.gq = NULL,
  max.var.nb.gt.na = NULL,
  max.var.perc.gt.na = NULL,
  verbose = 1
)

Arguments

vcf.file

path to the VCF file (if the bgzip index doesn't exist in the same directory, it will be created)

genome

genome identifier (e.g. "VITVI_12x2")

out.file

path to the output VCF file (a bgzip index will be created in the same directory)

yieldSize

number of records to yield each time the file is read (see ?TabixFile); used if seq.id is NULL

dict.file

path to the SAM dict file (see https://broadinstitute.github.io/picard/command-line-overview.html#CreateSequenceDictionary); used if seq.id is specified without seq.start and seq.end

seq.id

sequence identifier to work on (e.g. "chr2")

seq.start

start of the sequence to work on (if NULL, the whole sequence is used)

seq.end

end of the sequence to work on (if NULL, the whole sequence is used)

variants.tokeep

character vector of variant names to keep (e.g. c("chr1:35718_C/A","chr1:61125_A/G"))

is.snv

if TRUE (and not NULL), filter out the variants that are not SNVs

is.biall

if TRUE (and not NULL), filter out the variants with more than one alternative allele

min.var.dp

minimum variant-level DP below which variants are filtered out

max.var.dp

maximum variant-level DP above which variants are filtered out

min.alt.af

minimum variant-level AF below which variants are filtered out

max.alt.af

maximum variant-level AF above which variants are filtered out

min.spl.dp

minimum sample-level DP

min.perc.spl.dp

minimum percentage of samples with DP above the min.spl.dp threshold

min.spl.gq

minimum sample-level GQ

min.perc.spl.gq

minimum percentage of samples with GQ above the min.spl.gq threshold

max.var.nb.gt.na

maximum number of samples with missing GT

max.var.perc.gt.na

maximum percentage of samples with missing GT

verbose

verbosity level (0/1)
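
The sample-level arguments work in pairs: a per-sample threshold (min.spl.dp) and a minimum percentage of samples that must pass it (min.perc.spl.dp), alongside a cap on missing genotypes (max.var.perc.gt.na). The base-R sketch below illustrates that logic on a toy matrix. It is not the package's implementation: the toy data, the use of >= and <=, the 0-100 percentage scale, and counting a missing DP as failing are all assumptions made for illustration.

```r
# Toy data (hypothetical): 3 variants (rows) x 4 samples (columns).
variants <- c("chr1:100_A/G", "chr1:200_C/T", "chr1:300_G/A")
samples <- paste0("spl", 1:4)
dp <- matrix(c(12,  3, 20, 15,
                2,  1, NA,  4,
               30, 25, 18, 22),
             nrow = 3, byrow = TRUE, dimnames = list(variants, samples))
gt <- matrix(c("0/1", "0/0", "1/1", "0/1",
               "0/0", NA,    NA,    "0/1",
               "1/1", "0/1", "0/0", "0/0"),
             nrow = 3, byrow = TRUE, dimnames = list(variants, samples))

# Illustrative thresholds, named after the corresponding arguments.
min.spl.dp <- 10
min.perc.spl.dp <- 75         # assumed to be on a 0-100 scale
max.var.perc.gt.na <- 25      # assumed to be on a 0-100 scale

# Percentage of samples, per variant, whose DP reaches the threshold
# (assumption: ">=" comparison, and a missing DP counts as failing).
perc.spl.dp <- 100 * rowMeans(!is.na(dp) & dp >= min.spl.dp)

# Percentage of samples, per variant, with a missing genotype.
perc.gt.na <- 100 * rowMeans(is.na(gt))

# A variant is kept only if it passes both criteria.
keep <- perc.spl.dp >= min.perc.spl.dp & perc.gt.na <= max.var.perc.gt.na
print(data.frame(perc.spl.dp, perc.gt.na, keep))
```

In this toy case the second variant fails both criteria, because no sample reaches the DP threshold and half of its genotypes are missing.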

Value

the destination file path as a character(1)
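
A call might look like the following sketch. The file paths and threshold values are hypothetical, chosen only to show how the arguments fit together. The call is guarded so that it runs only when the package is installed and the input file exists.

```r
vcf.file <- "calls.vcf.gz"            # hypothetical input path
out.file <- "calls_filtered.vcf.gz"   # hypothetical output path
out <- NULL

if (requireNamespace("rutilstimflutre", quietly = TRUE) &&
    file.exists(vcf.file)) {
  out <- rutilstimflutre::filterVariantCalls(
    vcf.file = vcf.file,
    genome = "VITVI_12x2",     # genome identifier taken from the argument docs
    out.file = out.file,
    yieldSize = 10000,         # illustrative chunk size; seq.id is left NULL
    is.snv = TRUE,             # filter out variants that are not SNVs
    is.biall = TRUE,           # filter out variants with > 1 alternative allele
    min.var.dp = 10,           # illustrative variant-level DP threshold
    max.var.nb.gt.na = 5,      # illustrative cap on samples with missing GT
    verbose = 1
  )
  print(out)                   # destination file path, per the Value section
}
```

Criteria left at NULL (here the AF, sample-level DP and GQ arguments) are omitted from the call and keep their defaults.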

Author(s)

Timothee Flutre


timflutre/rutilstimflutre documentation built on Aug. 18, 2024, 7:43 p.m.