read_vcf_cpp: Read VCF using CPP reader

View source: R/RcppExports.R

read_vcf_cpp    R Documentation

Read VCF using CPP reader

Description

For each VCF record, the information in the INFO field takes priority. If it is missing, the information is guessed from the REF/ALT sequences. If multiple alleles are defined in ALT, they are split and the allele count is extracted from the GT field.
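The allele splitting described above can be sketched in base R. This is only an illustration and not the package's C++ implementation: the helper count_alleles is hypothetical, it assumes a GT string such as "1/2" or "0|1", and the way the real reader handles hom ref and no-call records may differ.

```r
# Hypothetical helper (not part of sveval): split a multi-allelic ALT field
# and count how often each ALT allele appears in a GT string.
count_alleles <- function(alt, gt) {
  alts <- strsplit(alt, ",", fixed = TRUE)[[1]]
  # GT alleles are separated by "/" (unphased) or "|" (phased)
  calls <- strsplit(gt, "[/|]")[[1]]
  if (any(calls == ".")) {
    # Missing genotype: reported as -1 here, mirroring the 'ac=-1' convention
    ac <- rep(-1L, length(alts))
  } else {
    calls <- as.integer(calls)
    # Allele index i in GT refers to the i-th ALT allele (0 is REF)
    ac <- vapply(seq_along(alts), function(i) sum(calls == i), integer(1))
  }
  data.frame(alt = alts, ac = ac, stringsAsFactors = FALSE)
}

res <- count_alleles("A,AT", "1/2")         # one copy of each ALT allele
res_hom <- count_alleles("A,AT", "2|2")     # two copies of the second allele
res_missing <- count_alleles("A,AT", "./.") # no-call
```

Each row of the result corresponds to one ALT allele, which is the shape implied by "they are split" above.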

Usage

read_vcf_cpp(
  filename,
  use_gz,
  sample_name = "",
  min_sv_size = 10L,
  shorten_ref = TRUE,
  shorten_alt = TRUE,
  gq_field = "GQ",
  check_inv = FALSE,
  keep_nocalls = FALSE,
  other_fields = as.character(c())
)

Arguments

filename

the path to the VCF file (uncompressed or gzipped).

use_gz

is the VCF file gzipped?

sample_name

which sample to process. If not found, the first sample in the VCF file is used. If "*", sample selection is disabled.

min_sv_size

minimum variant size to keep in bp. Variants shorter than this will be skipped. Default is 10.

shorten_ref

should the REF sequence be shortened to the first 10 bp? Default is TRUE.

shorten_alt

should the ALT sequence be shortened to the first 10 bp? Default is TRUE.

gq_field

which field from FORMAT should be used as genotype quality. Default is "GQ". If not found, QUAL is used.

check_inv

guess whether a variant is an inversion by aligning REF with the reverse complement of ALT. If they are >80% similar (and both REF and ALT are longer than 10 bp), the variant is classified as INV. Default is FALSE.
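A rough base R sketch of this kind of check is given here for illustration. The helpers rev_comp and looks_like_inv are hypothetical, and similarity is approximated as 1 minus the edit distance divided by the longer sequence length, which is an assumption of this sketch; the alignment used by the actual reader is not documented here and may differ.

```r
# Reverse complement of a DNA sequence (A/C/G/T/N, upper or lower case)
rev_comp <- function(seq) {
  comp <- chartr("ACGTNacgtn", "TGCANtgcan", seq)
  paste(rev(strsplit(comp, NULL)[[1]]), collapse = "")
}

# Hypothetical check: compare REF with the reverse complement of ALT.
# Similarity = 1 - (edit distance / longer length) is an assumption.
looks_like_inv <- function(ref, alt, min_len = 10, min_sim = 0.8) {
  # Both sequences must be longer than min_len bp
  if (nchar(ref) <= min_len || nchar(alt) <= min_len) {
    return(FALSE)
  }
  d <- utils::adist(ref, rev_comp(alt))[1, 1]
  sim <- 1 - d / max(nchar(ref), nchar(alt))
  sim > min_sim
}

ref <- "ACGTTGCAAGGCTTAC"
alt <- rev_comp(ref)              # a perfect inversion of REF
is_inv <- looks_like_inv(ref, alt)
```

With these inputs REF matches the reverse complement of ALT exactly, so the similarity is 1 and the check passes.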

keep_nocalls

should variants/alleles with missing genotypes (e.g. "./.") be kept? Default is FALSE.

other_fields

name of another field from INFO to extract.

Details

Alleles are split and, for each, column 'ac' reports the allele count. Notable cases include 'ac=-1' for no/missing calls (e.g. './.'), and 'ac=0' on the first allele to report hom ref variants. These cases are often filtered later with 'ac>0' to keep only non-ref calls. If the VCF contains no samples or if no sample selection is forced (sample_name='*'), 'ac' will contain '-1' for all variants in the VCF.
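The 'ac>0' filter mentioned above could look like the following. The data.frame here is a mock stand-in for the function's output: only the 'ac' column name comes from this documentation, and the 'id' column is illustrative.

```r
# Mock data.frame standing in for the reader's output; only 'ac' is taken
# from the documentation, the 'id' column is illustrative.
svs <- data.frame(
  id = c("sv1", "sv2", "sv3", "sv4"),
  ac = c(1L, 0L, -1L, 2L),
  stringsAsFactors = FALSE
)

# Keep only non-ref calls: drops hom ref (ac = 0) and no-calls (ac = -1)
nonref <- svs[svs$ac > 0, ]
```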

Value

data.frame with variant and genotype information

Author(s)

Jean Monlong


jmonlong/sveval documentation built on July 31, 2023, 7:50 p.m.