get_background_rate: Computes the background mismatch rate from a bam file

Description Usage Arguments Details Value See Also Examples

View source: R/get_background_rate.R

Description

Runs through the target regions base by base counting the mismatches. Then it divides sum(mismatches)/sum(depths) for all bases in the targets

Usage

1
2
3
get_background_rate(bam, targets, reference, vaf_threshold = 0.1,
  tag = "", black_list = NULL, substitution_specific = TRUE,
  min_base_quality = 20, max_depth = 1e+05, min_mapq = 30)

Arguments

bam

path to bam file

targets

a data frame with the target regions. Must have three columns: chr, start and end

reference

the reference genome in BSgenome format

vaf_threshold

the bases with higher than this VAF threshold will be ignored in the calculation (real mutations)

tag

the RG tag if the bam has more than one sample

black_list

a character vector of genomic loci of format chr_pos if substitution_specific is false, or chr_pos_ref_alt if substitution_specific is true. The background will be computed on the target regions after excluding black_list loci.

substitution_specific

logical, whether to have the loci of black_list by substitutions.

min_base_quality

minimum base quality for a read to be counted

max_depth

maximum depth above which sampling will happen

min_mapq

the minimum mapping quality for a read to be counted

Details

Computes the background rate of the input bam file for all bases in the specified targets. Substitutions-specific rates are also calculated.

Genomic positions having non-reference allele frequency higher than vaf_threshold will be excluded (to exclude SNPs and real mutations).

If a black_list is specified, the positions in the black_list (whether substitution_specific or not) will be excluded before computing the background rate.

Value

a list containing the general mismatch rate and substitution-specific rates

See Also

create_black_list test_ctDNA create_background_panel

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
## Load example data
data("targets", package = "ctDNAtools")
bamT1 <- system.file("extdata", "T1.bam", package = "ctDNAtools")
bamN1 <- system.file("extdata", "N1.bam", package = "ctDNAtools")
bamN2 <- system.file("extdata", "N2.bam", package = "ctDNAtools")
bamN3 <- system.file("extdata", "N3.bam", package = "ctDNAtools")

## Use human reference genome from BSgenome.Hsapiens.UCSC.hg19 library
suppressMessages(library(BSgenome.Hsapiens.UCSC.hg19))

## basic usage
get_background_rate(bamT1, targets, BSgenome.Hsapiens.UCSC.hg19)

## more options
get_background_rate(bamT1, targets, BSgenome.Hsapiens.UCSC.hg19,
  min_base_quality = 30, min_mapq = 40, vaf_threshold = 0.05
)

## with blacklist
bg_panel <- create_background_panel(
  bam_list = c(bamN1, bamN2, bamN3),
  targets = targets, reference = BSgenome.Hsapiens.UCSC.hg19,
  substitution_specific = TRUE
)

bl2 <- create_black_list(bg_panel,
  mean_vaf_quantile = 0.99,
  min_samples_one_read = 2, min_samples_two_reads = 1
)

get_background_rate(bamT1, targets, BSgenome.Hsapiens.UCSC.hg19,
  black_list = bl2
)

ctDNAtools documentation built on March 26, 2020, 7:39 p.m.