loadVariants: Load annotated SVs file

loadVariants R Documentation

Load annotated SVs file

Description

After the annotation procedure, structural variants (SVs) are loaded and formatted for model training.

Usage

loadVariants(input)

Arguments

input

Input file produced by the annotation procedure.

Value

A formatted data.frame to be used for model training, including the following fields:

  • ChrA: first chromosome number;

  • Start: start position on the first chromosome;

  • ChrB: second chromosome number;

  • End: end position on the second chromosome;

  • SVLen: length of SV;

  • Germline: integer indicating germline artifact;

  • SVType: character value indicating the SV type, one of 'DEL', 'DUP', 'INV', 'BND';

  • PE: number of supporting paired-end reads;

  • SR: number of supporting split reads;

  • CNVMAP: 0/1 indicating whether the SV is a known SV according to the polymorphic CNV database (https://www.nature.com/articles/nrg3871);

  • CNVR: 0/1 indicating whether the SV is a known SV according to the 1000 Genomes Project database (https://www.nature.com/articles/nature15393);

  • avgL: average of strict mask L (depth of coverage is much lower than average). All strict mask averages are calculated using the start (100bp) and end (100bp) regions of the SV.

  • avgH: average of strict mask H (depth of coverage is much higher than average).

  • avgZ: average of strict mask Z (too many reads with zero mapping quality overlap this position).

  • avgQ: average of strict mask Q (average mapping quality is too low).
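
The field list above can be turned into a quick sanity check before model training. The following is a minimal sketch in base R, not part of DVboost: the helper name check_sv_fields, the toy values, and the column types (for example, chromosomes stored as character) are all assumptions made for illustration. Only the field names, the allowed SVType values, and the 0/1 coding of CNVMAP and CNVR come from the description above.

```r
# Field names as listed in the Value section.
expected_fields <- c("ChrA", "Start", "ChrB", "End", "SVLen", "Germline",
                     "SVType", "PE", "SR", "CNVMAP", "CNVR",
                     "avgL", "avgH", "avgZ", "avgQ")

# Hypothetical helper (not a DVboost function): stops with an error if the
# data.frame does not match the documented layout, otherwise returns TRUE
# invisibly.
check_sv_fields <- function(df) {
  missing <- setdiff(expected_fields, names(df))
  if (length(missing) > 0) {
    stop("missing fields: ", paste(missing, collapse = ", "))
  }
  bad_type <- setdiff(unique(df$SVType), c("DEL", "DUP", "INV", "BND"))
  if (length(bad_type) > 0) {
    stop("unexpected SVType values: ", paste(bad_type, collapse = ", "))
  }
  if (!all(df$CNVMAP %in% c(0, 1)) || !all(df$CNVR %in% c(0, 1))) {
    stop("CNVMAP and CNVR must be coded as 0/1")
  }
  invisible(TRUE)
}

# Toy rows with made-up values; column types are assumed, not documented.
toy <- data.frame(
  ChrA = c("1", "2"), Start = c(10000L, 50000L),
  ChrB = c("1", "2"), End = c(12000L, 70000L),
  SVLen = c(2000L, 20000L), Germline = c(0L, 1L),
  SVType = c("DEL", "INV"), PE = c(12L, 4L), SR = c(5L, 0L),
  CNVMAP = c(0, 1), CNVR = c(0, 0),
  avgL = c(0, 0.1), avgH = c(0, 0), avgZ = c(0.05, 0.3), avgQ = c(0, 0.2),
  stringsAsFactors = FALSE
)

check_sv_fields(toy)
```

The same check could be applied to the result of loadVariants(input) to catch a malformed annotation file early, assuming the returned data.frame uses exactly the field names listed above.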

Examples

data(ExampleData, package='DVboost')
### example data loaded and formatted with loadVariants
str(ExampleData)

Liuy12/DVboost documentation built on May 25, 2022, 6:17 a.m.