summaryVariant: Summary per variant

summaryVariant    R Documentation

Summary per variant

Description

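Compute the mean, standard deviation, minimum, first quartile (Q1), median, third quartile (Q3) and maximum of the selected genotype field(s) per variant (genotype quality, GQ, by default), also reporting the number of samples and the number of missing values.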
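
The per-variant statistics can be illustrated in base R. The following is a minimal sketch, not the package's implementation: the helper name summarise_rows is hypothetical, the input is assumed to be a numeric matrix with variants in rows and samples in columns, and "n" is assumed to count all samples (missing or not).

```r
# Hypothetical helper, NOT part of the documented package.
# Input: numeric matrix of one genotype field (e.g. GQ), variants in rows,
# samples in columns. Output: one row per variant, 9 summary columns.
summarise_rows <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x))
  one_row <- function(v) {
    ok <- v[!is.na(v)]
    if (length(ok) == 0) {
      # no usable value for this variant: only the counts are defined
      return(c(n = length(v), na = length(v), mean = NA_real_, sd = NA_real_,
               min = NA_real_, q1 = NA_real_, med = NA_real_, q3 = NA_real_,
               max = NA_real_))
    }
    q <- stats::quantile(ok, probs = c(0.25, 0.5, 0.75), names = FALSE)
    # assumption: "n" is the total number of samples, "na" the missing ones
    c(n = length(v), na = sum(is.na(v)),
      mean = mean(ok), sd = stats::sd(ok), min = min(ok),
      q1 = q[1], med = q[2], q3 = q[3], max = max(ok))
  }
  # apply() returns statistics in rows, so transpose to get variants in rows
  t(apply(x, 1, one_row))
}

# Invented toy data: 2 variants, 4 samples
gq <- matrix(c(10, 20, 30, 40,
               NA, 50, 50, NA),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("var1", "var2"), paste0("ind", 1:4)))
res <- summarise_rows(gq)
print(res)
```

Quartiles here use the default method of stats::quantile (type 7); whether the package uses the same method is not stated in this page.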

Usage

summaryVariant(
  vcf.file,
  genome = "",
  yieldSize = NA_integer_,
  dict.file = NULL,
  seq.id = NULL,
  seq.start = NULL,
  seq.end = NULL,
  fields = "GQ",
  verbose = 1
)

Arguments

vcf.file

path to the VCF file (if its tabix index doesn't exist in the same directory, it will be created)

genome

genome identifier (e.g. "VITVI_12x2")

yieldSize

number of records to yield each time the file is read (see ?TabixFile); used if seq.id is NULL

dict.file

path to the SAM sequence dictionary file (see https://broadinstitute.github.io/picard/command-line-overview.html#CreateSequenceDictionary); used if seq.id is specified without seq.start and seq.end

seq.id

sequence identifier to work on (e.g. "chr2")

seq.start

start of the sequence to work on

seq.end

end of the sequence to work on

fields

genotype field(s) of the VCF to parse ("DP"/"GQ"/c("DP","GQ"))

verbose

verbosity level (0/1)

Value

list of matrices (one per field) with one row per variant and 9 columns (n, na, mean, sd, min, q1, med, q3, max)
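
As an illustration of working with this return shape, the sketch below builds a mock object by hand and filters variants on it. The values are invented, the list is assumed to be named by field, and the thresholds (at most 50% missing, mean GQ of at least 20) are arbitrary choices for the example.

```r
# Mock object mimicking the documented return shape (one matrix per field,
# one row per variant, 9 columns). All values are invented.
stat_names <- c("n", "na", "mean", "sd", "min", "q1", "med", "q3", "max")
mock <- list(GQ = matrix(
  c(4, 0, 25, 12.9, 10, 17.5, 25, 32.5, 40,
    4, 3, 50,   NA, 50, 50,   50, 50,   50,
    4, 1, 12,    2, 10, 11,   12, 13,   14),
  ncol = 9, byrow = TRUE,
  dimnames = list(c("var1", "var2", "var3"), stat_names)))

# assumption: list elements are named after the requested fields
gq <- mock[["GQ"]]

# proportion of samples with a missing value, per variant
miss_rate <- gq[, "na"] / gq[, "n"]

# arbitrary example thresholds, to be adapted to the data at hand
keep <- miss_rate <= 0.5 & gq[, "mean"] >= 20
kept_variants <- rownames(gq)[keep]
print(kept_variants)
```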

Author(s)

Timothee Flutre

See Also

varqual2summary
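
Examples

A possible call, restricted to one region, might look like the sketch below. The file path and the region coordinates are hypothetical placeholders; the genome and sequence identifiers are the ones given as examples in the Arguments section. The call is only attempted if the package is installed and the file exists.

```r
# Hypothetical input path; replace with a real VCF file.
vcf.file <- "variants.vcf.gz"

# Only argument names listed in the Usage section are used here.
args <- list(vcf.file = vcf.file,
             genome = "VITVI_12x2",
             seq.id = "chr2",
             seq.start = 1,        # hypothetical region start
             seq.end = 100000,     # hypothetical region end
             fields = c("DP", "GQ"),
             verbose = 0)

if (requireNamespace("rutilstimflutre", quietly = TRUE) &&
    file.exists(vcf.file)) {
  out <- do.call(rutilstimflutre::summaryVariant, args)
  str(out)  # inspect the returned list of matrices
} else {
  message("skipping: package or VCF file not available")
}
```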


timflutre/rutilstimflutre documentation built on Feb. 7, 2024, 8:17 a.m.