Databases in annovarR



BioInstaller is a dependency of annovarR. It provides the download service for all databases that annovarR supports.

# List all software, databases, and files supported by BioInstaller
BioInstaller::install.bioinfo(show.all.names = TRUE)
# annovarR includes only the entries defined in BioInstaller's db_annovar.toml
annovarR::download.database(show.all.names = TRUE)
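
The list of names can be long, so it may help to filter it by keyword. The following is a minimal sketch in base R, assuming that the call with show.all.names = TRUE returns a character vector of names. The vector db_names and its entries are illustrative placeholders, not the actual output.

```r
# Illustrative stand-in for the names returned by
# annovarR::download.database(show.all.names = TRUE).
# These entries are placeholders, not the actual list.
db_names <- c("db_annovar_avsnp", "db_annovar_cosmic",
              "db_annovar_gnomad", "db_annovar_1000g")

# Keep only the names that contain the keyword, ignoring case.
hits <- grep("cosmic", db_names, value = TRUE, ignore.case = TRUE)
print(hits)
```

In practice, db_names would be replaced by the result of the real call, provided that result is a character vector.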

This reference manual organizes the databases in annovarR and provides metadata about the databases that annovarR supports for annotation, as well as the other databases that BioInstaller supports for download only. Some of the descriptions and comments are also recorded in the download configuration file (BioInstaller package) and the annotation configuration file (annovarR).


Databases supported by annovarR are first made available for download from their original sites through BioInstaller (excluding those that require authentication). A portion of the BioInstaller-supported databases is then introduced into annovarR as candidate annotation databases, processed in one of three ways: left unchanged, reformatted, or re-analyzed.

Gene and Clinical Annotation

Gene annotation databases cover gene classification, gene function, and phenotype correlation. Examples include HGNC, OMIM, DoCM, CIViC, DisGeNET, ClinVar, and Gene Ontology (GO).

Variant Effect Prediction

Variant effect prediction databases are generated by algorithms that predict the effect of variants on protein or RNA structure. Examples include SIFT, PolyPhen2, PROVEAN, MutationTaster, MutationAssessor, and FATHMM.

Population Allele Frequency

Population allele frequency databases are based on genome sequencing data from population cohorts (mainly whole genome sequencing and whole exome sequencing). Examples include the 1000 Genomes Project, the NHLBI GO Exome Sequencing Project (ESP), gnomAD, and ExAC.

Cancer Somatic Mutation

Cancer somatic mutation databases are generated from case-control paired genomic sequencing data of cancer patients. Examples include COSMIC, Cancer Hotspots, IntOGen, and the Cancer Biomarkers database.

RNA-seq Variants

RNA-seq variant databases are built from variants called from RNA-seq data, including expressed alleles and RNA-editing events. annovarR provides an RNA-seq variant database, BRVar, built from the RNA-seq data of 1285 patients with B-cell acute lymphoblastic leukemia (B-ALL), to which four different variant detection methods were applied.

Expression Quantitative Trait Locus (eQTL)

eQTL databases contain candidate genomic loci that may affect gene expression levels. Examples include Genotype-Tissue Expression (GTEx) QTL, seeQTL, and PancanQTL.

Non-coding RNA Related

Non-coding RNA databases contain candidate biomarkers or transcriptional regulation regions targeted by non-coding RNA. Examples include the Cancer-Specific CircRNA Database and LNCediting.

TODO: Finish full document in the next release.


annovarR documentation built on Jan. 9, 2018, 5:05 p.m.