bin_fragment_size: Gets histogram of fragment lengths from a bam file


View source: R/bin_fragment_size.R

Description

The function first extracts fragment lengths from a bam file, then computes a histogram over the defined bins. If normalized is TRUE, the counts per bin are normalized to the total read count. Optionally, it can compute the histogram of fragment lengths only for mutated reads (confirmed ctDNA molecules).

Usage

bin_fragment_size(
  bam,
  mutations = NULL,
  targets = NULL,
  tag = "",
  bin_size = 2,
  custom_bins = NULL,
  normalized = FALSE,
  min_size = 1,
  max_size = 400,
  ...
)

Arguments

bam

Path to the bam file.

mutations

An optional data frame with mutations. Must have the columns CHROM, POS, REF, ALT.

targets

A data frame with the target regions used to restrict the reads in the bam. Must have three columns: chr, start and end.

tag

The RG tag, needed if the bam has more than one sample.

bin_size

The width of the histogram bins (the spacing between breaks).

custom_bins

A numeric vector of custom breaks used to bin the histogram of fragment lengths. Overrides bin_size.

normalized

A logical, whether to normalize the counts to the total number of reads.

min_size

Integer with the lowest fragment length.

max_size

Integer with the highest fragment length.

...

Other parameters passed to get_fragment_size.

Details

Fragment lengths are extracted from the bam file according to the parameters passed to get_fragment_size, and histogram counts (optionally normalized to the total count) are computed. Both equal-width bins via bin_size and manually customized bins via custom_bins are supported.
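The difference between equal-width and custom breaks can be illustrated in base R. This is a minimal sketch under stated assumptions, not the package's implementation: the fragment lengths are made up, count_in_bins is a hypothetical helper, and the way breaks are combined with min_size and max_size here is an assumption.

```r
# Made-up fragment lengths standing in for values read from a bam file
fragment_lengths <- c(95, 120, 150, 151, 166, 167, 168, 180, 210, 320, 340)

min_size <- 1
max_size <- 400

# Equal-width breaks, analogous to what bin_size controls (assumed construction)
bin_size <- 50
equal_breaks <- unique(c(seq(min_size, max_size, by = bin_size), max_size))

# Custom breaks, analogous to custom_bins (assumed to be padded with the size limits)
custom_breaks <- sort(unique(c(min_size, 100, 150, 200, max_size)))

# Hypothetical helper: count lengths per bin, using right-closed intervals
count_in_bins <- function(x, breaks) {
  x <- x[x >= min(breaks) & x <= max(breaks)]
  h <- hist(x, breaks = breaks, plot = FALSE, include.lowest = TRUE)
  data.frame(
    bin_start = breaks[-length(breaks)],
    bin_end = breaks[-1],
    counts = h$counts
  )
}

equal_hist <- count_in_bins(fragment_lengths, equal_breaks)
custom_hist <- count_in_bins(fragment_lengths, custom_breaks)
print(custom_hist)
```

Custom breaks are useful when the regions of interest are unevenly spaced, for example a fine resolution around the mononucleosomal peak and coarse bins elsewhere.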

When mutations is supplied, the function separately bins the reads that support variant alleles, the reads that support reference alleles, and all other reads.
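Binning reads separately by category can be sketched in base R as follows. The data frame, its column names, and the category labels are hypothetical and chosen only for illustration; the package assigns categories by inspecting the reads in the bam file, which is not reproduced here.

```r
# Hypothetical reads: fragment size plus an already assigned category
reads <- data.frame(
  size = c(120, 145, 150, 166, 167, 170, 182, 210, 330),
  category = c("variant", "variant", "reference", "reference", "other",
               "variant", "other", "reference", "other")
)

# Assumed breaks; intervals are right-closed, with the lowest value included
breaks <- c(1, 150, 200, 400)
reads$bin <- cut(reads$size, breaks = breaks, include.lowest = TRUE)

# Cross-tabulate bins against categories: one count column per category
categories <- factor(reads$category, levels = c("variant", "reference", "other"))
binned <- as.data.frame.matrix(table(reads$bin, categories))
print(binned)
```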

Value

A data frame with one column for the breaks used and one column for the histogram counts (normalized if requested). If mutations is supplied, the output has one breaks column and three count columns corresponding to variant allele reads, reference allele reads, and other reads. Each row holds the number of fragments whose length falls within the bin, optionally normalized by the total number of reads.
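Normalization to the total count can be sketched in base R. The data frame below is a hypothetical stand-in for the returned value (its column names are assumptions), and dividing by the sum of the counts is the assumed reading of "normalized to the total number of reads".

```r
# Hypothetical binned counts; column names are illustrative only
result <- data.frame(
  bin = c("1-100", "100-150", "150-200", "200-400"),
  counts = c(1, 2, 5, 3)
)

# Divide each bin by the total so the values become fractions of all reads
result$normalized <- result$counts / sum(result$counts)
print(result)
```

Normalized values sum to 1, which makes histograms from samples with different sequencing depths comparable.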

See Also

get_fragment_size, analyze_fragmentation, summarize_fragment_size

Examples

data("targets", package = "ctDNAtools")
data("mutations", package = "ctDNAtools")
bamT1 <- system.file("extdata", "T1.bam", package = "ctDNAtools")

## basic usage
bin_fragment_size(bam = bamT1)

## with normalization
bin_fragment_size(bam = bamT1, normalized = TRUE)


## binning reads by mutation category (ref, alt and other)
bin_fragment_size(bam = bamT1, mutations = mutations)

## restrict to reads within targets
bin_fragment_size(bam = bamT1, targets = targets)

alkodsi/ctDNAtools documentation built on Feb. 22, 2022, 9:40 a.m.