Traditional variant calling methods use variant allele frequency (VAF) cutoffs to call variants. These cutoffs are often set arbitrarily, and they become problematic when calling variants at low VAFs, where true biological variation is hard to distinguish from sequencing error. The 'Espresso' package employs a novel variant calling approach that models sequencing error distributions across 192 trinucleotide contexts and calls variants by comparing each putative variant to the error distribution of its context. This approach demonstrates superior sensitivity and specificity over existing variant calling methods and improves our ability to distinguish signal from noise at very low VAFs.
Some Espresso dependencies are hosted on Bioconductor rather than CRAN, so you may need to install these extra packages first:
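The figure of 192 contexts is consistent with counting every combination of 5' base, reference base, and 3' base (4 x 4 x 4 = 64 trinucleotides) together with the three possible alternate alleles at the central position. The sketch below enumerates that set under this assumption; the column names and the `A[C>T]G` label format are illustrative choices, not Espresso's internal representation.

```r
# Enumerate trinucleotide substitution contexts (illustrative, not Espresso's code).
bases <- c("A", "C", "G", "T")

# All combinations of flanking bases, reference base, and alternate base.
contexts <- expand.grid(
  five_prime  = bases,
  ref         = bases,
  three_prime = bases,
  alt         = bases,
  stringsAsFactors = FALSE
)

# A substitution needs an alternate base that differs from the reference.
contexts <- contexts[contexts$ref != contexts$alt, ]

# Human-readable label, e.g. "A[C>T]G" (hypothetical naming convention).
contexts$label <- with(
  contexts,
  paste0(five_prime, "[", ref, ">", alt, "]", three_prime)
)

n_contexts <- nrow(contexts)
```

Here `n_contexts` comes out as 192 (256 combinations minus the 64 where the alternate equals the reference).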
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("BiocGenerics", "BSgenome.Hsapiens.UCSC.hg19", "BSgenome.Hsapiens.UCSC.hg38", "VariantAnnotation", "GenomicScores", "maftools", "cellbaseR"))
To install Espresso, open R and install directly from GitHub using the following commands:
library(devtools)
install_github("abelson-lab/Espresso")
To annotate variants with minor allele frequencies (MAF), download the appropriate MafDb annotation package.
# gnomAD exomes release 2.1 - hg19
BiocManager::install("MafDb.gnomADex.r2.1.hs37d5")
# gnomAD exomes release 2.0.1 - hg38
BiocManager::install("MafDb.gnomADex.r2.0.1.GRCh38")
### For MAF annotation from other databases (1Kgenomes, ExAc, etc) or specific to the GRCh38 reference, see link above
If install_github fails (newer versions of devtools may have issues with the formatting of the DESCRIPTION file), try installing from the repository archive instead:
library(remotes)
install_url(url = "https://github.com/abelson-lab/Espresso/archive/master.zip", INSTALL_opts = "--no-multiarch")
Espresso takes as input the files generated by VarScan's pileup2cns command, which produces one pileup file per sample and includes every position that meets a minimum coverage. Because non-variant positions are retained, Espresso can use the miscalled alleles produced by sequencing error to build context-specific error models and call low-VAF variants with more confidence.
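To illustrate the general idea of testing a candidate against a context-specific error distribution, here is a minimal sketch. It is not Espresso's statistical model, which this document does not describe: the background VAFs are made-up toy values, `empirical_p_value` is a hypothetical helper, and an empirical tail fraction stands in for whatever distribution the package actually fits.

```r
# Toy background VAFs for one hypothetical context (e.g. A[C>T]G), standing in
# for non-reference allele fractions at positions assumed to carry only error.
background_vaf <- c(0.0000, 0.0002, 0.0004, 0.0005, 0.0007,
                    0.0008, 0.0010, 0.0011, 0.0013, 0.0015)

# Hypothetical helper: fraction of background positions whose VAF is at least
# as large as the candidate's, with a +1 pseudocount so the result is never 0.
empirical_p_value <- function(candidate_vaf, background_vaf) {
  (sum(background_vaf >= candidate_vaf) + 1) / (length(background_vaf) + 1)
}

# Candidate variant: 12 supporting reads out of 4000 (VAF = 0.003).
candidate_vaf <- 12 / 4000
p_candidate <- empirical_p_value(candidate_vaf, background_vaf)

# A candidate sitting inside the error range gets a much larger value.
p_noise <- empirical_p_value(0.0005, background_vaf)
```

With these toy numbers the candidate exceeds every background value, so `p_candidate` is 1/11, while a VAF of 0.0005 falls well within the error range and gives 8/11. A fixed VAF cutoff would treat every context identically; conditioning on the context's own background is what lets the threshold adapt to error-prone and error-poor contexts.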
Here is an R notebook outlining the Espresso variant calling workflow: Espresso Workflow