Read variants from disk and perform some pre-processing. In contrast to obtaining a data.frame using read.table(), this function does not crash when a vcf file does not have any valid rows (i.e. no variants detected). This function also automatically allows the user to discard some columns of the input table.
1 2 | readVariantsFromFile(ff, n = 4096, ignorelines = c("thesaurushard",
"thesaurusmany"), getcolumns = 10, withindels = FALSE)
|
ff |
filename |
n |
integer, determines how many lines are read from the input to scan for the vcf header. Set this to a moderatly large number, i.e. larger than the expected number of header lines in the vcf file. |
ignorelines |
vector of filter codes. Lines in the vcf that contain one of these filter codes will be eliminated, i.e. not reported in the output |
getcolumns |
number of columns to read/return. Use this to select only a few of the columns of the vcf table (ignoring the info or format fields can save memory) |
withindels |
logical. Set to TRUE to return all variants in input vcf, or set to FALSE to obtain only single-nucleotide substitutions, |
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.