Shearwater ML

Shearwater ML

Inigo Martincorena and Moritz Gerstung

6 March 2015

ShearwaterML is a maximum-likelihood adaptation of the original Shearwater algorithm. Unlike the original algorithm, ShearwaterML does not use prior information and yields p-values, instead of Bayes factors, using a Likelihood-Ratio Test. This allows using standard multiple testing correction methods to obtain a list of significant variants with a controlled false discovery rate.

For a detailed description of the algorithm see:

Martincorena I, Roshan A, Gerstung M, et al. (2015). High burden and pervasive positive selection of somatic mutations in normal human skin. Science (2015).

Load data from deepSNV example

library(deepSNV)
regions <- GRanges("B.FR.83.HXB2_LAI_IIIB_BRU_K034", IRanges(start = 3120, end=3140))
files <- c(system.file("extdata", "test.bam", package="deepSNV"), system.file("extdata", "control.bam", package="deepSNV"))
counts <- loadAllData(files, regions, q=30)

ShearwaterML: "betabinLRT" calculates p-values for each possible mutation

pvals <- betabinLRT(counts, rho=1e-4, maxtruncate = 1)$pvals
qvals <- p.adjust(pvals, method="BH")
dim(qvals) = dim(pvals)
vcfML = qvals2Vcf(qvals, counts, regions, samples = files, mvcf = TRUE)

Original Shearwater: "bbb" computes the original Bayes factors

bf <- bbb(counts, model = "OR", rho=1e-4)
vcfBF <- bf2Vcf(bf, counts, regions, samples = files, prior = 0.5, mvcf = TRUE)

plot(pvals[1,,], bf[1,,]/(1+bf[1,,]), log="xy")


Try the deepSNV package in your browser

Any scripts or data that you put into this service are public.

deepSNV documentation built on Nov. 8, 2020, 8:01 p.m.