Whole genome sequencing (WGS) has successfully been used to identify single-nucleotide variants (SNV), small insertions and deletions and, more recently, small copy number variants. However, due to utilization of short reads, it is not well suited for identification of structural variants (SV) and the majority of SV calling tools having high false positive and negative rates.Optical next-generation mapping (NGM) utilizes long fluorescently labeled native-state DNA molecules for de novo genome assembly to overcome the limitations of WGS. NGM allows for a significant increase in SV detection capability. However, accuracy of SV annotation is highly important for variant classification and filtration to determine pathogenicity.Here we create a new tool in R, for SV annotation, including population frequency and gene function and expression, using publicly available datasets. We use DGV (Database of Genomic Variants), to calculate the population frequency of the SVs identified by the Bionano SVCaller in the NGM dataset of a cohort of 50 undiagnosed patients with a variety of phenotypes. The new annotation tool, nanotatoR, also calculates the internal frequency of SVs, which could be beneficial in identification of potential false positive or common calls. The software creates a primary gene list (PG) from NCBI databases based on patient phenotype and compares the list to the set of genes overlapping the patient’s SVs generated by SVCaller, providing analysts with an easy way to identify variants affecting genes in the PG. The output is given in an Excel file format, which is subdivided into multiple sheets based on SV type. Users then have a choice to filter SVs using the provided annotation for identification of inherited, de novo or rare variants. nanotatoR provides integrated annotation and the expression patterns to enable users to identify potential pathogenic SVs with greater precision and faster turnaround times.
|Author||Surajit Bhattacharya,Hayk Barsheghyan, Emmanuele C Delot and Eric Vilain|
|Bioconductor views||GenomeAssembly Software VariantAnnotation WorkflowStep|
|Maintainer||Surajit Bhattacharya <[email protected]>|
|Package repository||View on Bioconductor|
Install the latest version of this package by entering the following in R:
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.