readThesaurusSet: Read a set of variants annotated using GeneticThesaurus


Description

The GeneticThesaurus software for annotating variants with thesaurus links typically creates three output files: a .vcf file with called variants, a .vtf file with links, and a baf.tsv file with allelic frequencies. This function reads all three files into a single R object, a list holding the contents of each file as a data frame.

Usage

readThesaurusSet(variantsfile, n = 4096, ignorelines = c("thesaurushard",
  "thesaurusmany"), getcolumns = 10, withindels = FALSE)

Arguments

variantsfile

filename of a vcf file annotated with the genetic thesaurus. The function assumes that variantsfile ends with vcf (or vcf.gz) and looks for files with a similar name and the extensions vtf (or vtf.gz) and baf.tsv (or baf.tsv.gz)

n

number of lines to read at a time (used by readVariantsFromFile and readLinksFromFile)

ignorelines

filter codes to remove from variant list

getcolumns

integer; this is passed on to function readVariantsFromFile

withindels

logical; determines whether variant list should include insertions/deletions
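
The file-name convention described for variantsfile can be illustrated with a short sketch. The helper thesaurusCompanionFiles below is hypothetical and is not part of the package; it also assumes that the companion files use the same compression as the vcf file, which the package itself may not require.

```r
# Hypothetical helper (not part of the package): derive the names of the
# vtf and baf.tsv files expected alongside an annotated vcf file.
# Assumption: companion files share the compression of the vcf file.
thesaurusCompanionFiles <- function(variantsfile) {
  # The name must end in "vcf" or "vcf.gz"
  if (!grepl("vcf(\\.gz)?$", variantsfile)) {
    stop("variantsfile should end with vcf or vcf.gz")
  }
  # Record whether the input is compressed, then strip the extension
  gz <- if (grepl("\\.gz$", variantsfile)) ".gz" else ""
  base <- sub("vcf(\\.gz)?$", "", variantsfile)
  c(vcf = variantsfile,
    vtf = paste0(base, "vtf", gz),
    baf = paste0(base, "baf.tsv", gz))
}

thesaurusCompanionFiles("sample.vcf.gz")
```

For the placeholder name sample.vcf.gz, the sketch returns sample.vcf.gz, sample.vtf.gz and sample.baf.tsv.gz.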

Details

Note: the individual files are often large, so this function may take up to a few minutes to complete.
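
A minimal usage sketch follows. It assumes that the package providing readThesaurusSet is already loaded and that sample.vcf.gz, a placeholder name, exists together with its vtf and baf.tsv companions. Because the names of the list components are not given here, the sketch inspects the result with str rather than assuming them.

```r
# "sample.vcf.gz" is a placeholder file name, not a file shipped with the package
variantsfile <- "sample.vcf.gz"

# Only attempt the read when the function and the file are available
if (exists("readThesaurusSet") && file.exists(variantsfile)) {
  # system.time reports how long the read takes on large files
  timing <- system.time(
    thesaurus <- readThesaurusSet(variantsfile, withindels = TRUE)
  )
  print(timing)
  # Show the top-level structure without assuming component names
  str(thesaurus, max.level = 1)
}
```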


tkonopka/RGeneticThesaurus documentation built on May 31, 2019, 3:44 p.m.