summarize_fragment_size: Summarizes fragment size in defined genomic regions

View source: R/summarize_fragment_size.R

Description

Summarizes fragment size in defined genomic regions

Usage

summarize_fragment_size(bam, regions, tag = "",
  summary_functions = list(Mean = mean, Median = median), ...)

Arguments

bam

path to the input BAM file.

regions

data frame containing the genomic regions. Must have the columns chr, start and end.

tag

the RG (read group) tag to use if the BAM file contains more than one sample.

summary_functions

a named list containing the R functions used for summarization, e.g. mean, sd.

...

Other parameters passed to get_fragment_size

Details

Only reads that pass the following filters contribute a fragment size: the read is paired (optionally properly paired), both the read and its mate are mapped, the alignment is neither secondary nor supplementary, the read is not marked as a duplicate, it passed quality control, and it meets the mapq threshold. Fragments that overlap the specified regions are summarized with the functions given in summary_functions. For the overlap, a fragment spans the leftmost to the rightmost coordinate of the read and its mate. The minimum and maximum fragment size bounds are applied before summarization.
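The size bounds and overlap rule described above can be sketched in base R on a toy table of fragments. This is only an illustration under stated assumptions, not the package's implementation: the fragments data frame, the min_size and max_size values, the Region column name and the summarize_region helper are all hypothetical, and the read-level filters (mapq, duplicates and so on) are assumed to have been applied already.

```r
# Toy fragments: one row per fragment, spanning the leftmost to the
# rightmost coordinate of a read pair (hypothetical data).
fragments <- data.frame(
  chr = c("chr1", "chr1", "chr1", "chr1"),
  start = c(100, 120, 400, 130),
  end = c(266, 290, 1200, 300),
  stringsAsFactors = FALSE
)
fragments$size <- fragments$end - fragments$start + 1

regions <- data.frame(
  chr = c("chr1", "chr1"),
  start = c(90, 380),
  end = c(150, 450),
  stringsAsFactors = FALSE
)

# Apply the size bounds before summarization (values are assumptions).
min_size <- 1
max_size <- 400
kept <- fragments[fragments$size >= min_size & fragments$size <= max_size, ]

summary_functions <- list(Mean = mean, Median = median)

# Hypothetical helper: summarize the sizes of all kept fragments that
# overlap region i. Two intervals overlap when each one starts before
# the other one ends.
summarize_region <- function(i) {
  r <- regions[i, ]
  hit <- kept$chr == r$chr & kept$start <= r$end & kept$end >= r$start
  vapply(
    summary_functions,
    function(f) as.numeric(f(kept$size[hit])),
    numeric(1)
  )
}

rows <- lapply(seq_len(nrow(regions)), summarize_region)
toy_summary <- data.frame(
  Region = paste0(regions$chr, ":", regions$start, "-", regions$end),
  do.call(rbind, rows),
  stringsAsFactors = FALSE
)
print(toy_summary)
```

In this toy input the 801 bp fragment is dropped by the size bounds, so the second region has no overlapping fragments and its summaries are missing values. How the package itself reports regions without fragments is not stated here.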

Value

a data frame whose first column holds the regions in the format chr:start-end, followed by one column per function in summary_functions.
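Because the regions come back as chr:start-end strings, downstream code often needs them split into separate columns again. A minimal sketch, assuming the first column is named Region (the name is an assumption, as are all values in the toy data frame below):

```r
# Toy result in the documented shape; the column name "Region" and all
# values are made up for illustration.
toy <- data.frame(
  Region = c("chr14:106327-106377", "chr14:106357-106407"),
  Mean = c(165.2, 170.8),
  Median = c(166, 171),
  stringsAsFactors = FALSE
)

# Anchor the pattern at the end of the string so that chromosome names
# containing ":" or "-" are still captured whole by the first group.
pattern <- "^(.*):([0-9]+)-([0-9]+)$"

parsed <- data.frame(
  chr = sub(pattern, "\\1", toy$Region),
  start = as.integer(sub(pattern, "\\2", toy$Region)),
  end = as.integer(sub(pattern, "\\3", toy$Region)),
  toy[, -1, drop = FALSE],
  stringsAsFactors = FALSE
)
print(parsed)
```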

See Also

get_fragment_size, bin_fragment_size, analyze_fragmentation

Examples

data("targets", package = "ctDNAtools")
bamT1 <- system.file("extdata", "T1.bam", package = "ctDNAtools")

## bin the target in an arbitrary way
## Note that regions don't need to be bins;
## they can be any regions in the genome
regions <- data.frame(
  chr = targets$chr,
  start = seq(from = targets$start - 200, to = targets$end + 200, by = 30),
  stringsAsFactors = FALSE
)
regions$end <- regions$start + 50

## basic usage
sfs <- summarize_fragment_size(bam = bamT1, regions = regions)

## different summary functions
sfs <- summarize_fragment_size(
  bam = bamT1, regions = regions,
  summary_functions = list(
    Var = var, SD = sd,
    meanSD = function(x) mean(x) / sd(x)
  )
)

ctDNAtools documentation built on March 26, 2020, 7:39 p.m.