count.everted.reads: Count the number of everted reads for a set of BAM files.
In ExomeDepth: Calls Copy Number Variants from Targeted Sequence Data

count.everted.reads

R Documentation

Count the number of everted reads for a set of BAM files.

Description

This is the high-level ExomeDepth function that takes a set of target regions (supplied as a data frame or a BED file) and a list of sorted, indexed BAM files, and computes the number of everted reads in each of the defined bins.

Usage

count.everted.reads(
  bed.frame = NULL,
  bed.file = NULL,
  bam.files,
  index.files = bam.files,
  min.mapq = 20,
  include.chr = FALSE
)

Arguments

`bed.frame`	`data.frame` containing the definition of the regions. The first three columns must be chromosome, start, end.
`bed.file`	`character` file name. Target BED file with the definition of the regions. This file will only be used if no bed.frame argument is provided. No headers are assumed so remove them if they exist. Either a bed.file or a bed.frame must be provided for this function to run.
`bam.files`	`character`, list of BAM files to extract read count data from.
`index.files`	Optional `character` argument with the list of indexes for the BAM files, without the '.bai' suffix. If the indexes are simply obtained by adding .bai to the BAM files, this argument does not need to be specified.
`min.mapq`	`numeric`, minimum mapping quality to include a read.
`include.chr`	`logical`, if set to TRUE, this function will add the string 'chr' to the chromosome names of the target BED file.
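
As a sketch of the inputs described above, the following base R snippet builds a minimal `bed.frame` and shows the kind of renaming that `include.chr = TRUE` implies. The coordinates, the `name` values and the BAM file name are made up for illustration. The `paste0` calls only mimic the documented behaviour; they are assumptions about the effect, not the package's internal code.

```r
# Hypothetical regions; the first three columns must be chromosome, start, end
bed.frame <- data.frame(
  chromosome = c("1", "1"),
  start      = c(25630000, 25640000),
  end        = c(25635000, 25650000),
  name       = c("regionA", "regionB"),
  stringsAsFactors = FALSE
)

# Effect described for include.chr = TRUE: prefix chromosome names with 'chr'
chr_names <- paste0("chr", bed.frame$chromosome)
print(chr_names)

# Default index naming: the index is found by appending '.bai' to the BAM path
bam.files <- "sample1.bam"   # hypothetical file name
index.files <- bam.files     # same default as in the Usage section
print(paste0(index.files, ".bai"))
```

If the index files are stored elsewhere, `index.files` would instead hold their paths without the '.bai' suffix, as stated in the argument description.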

Details

Everted reads are characteristic of the presence of duplications in a BAM file. This routine parses each BAM file, and the suggested use is to provide relatively large bins (for example gene-based bins; ExomeDepth provides a genes.hg19 object that is appropriate for this) to flag the genes that contain reads suggestive of a duplication. A manual check of the data using IGV is recommended to confirm that these reads are all located in the same DNA region, which would confirm the presence of a copy number variant.
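
To make the notion of an everted pair concrete, the sketch below classifies read pairs from their mapped positions and strands using base R only. It is an illustration under stated assumptions, not ExomeDepth's internal implementation: the helper `is_everted` and the columns of the toy data frame are hypothetical, and a real analysis would also filter on mapping quality, as `min.mapq` does. In a standard paired-end library the forward-strand read maps upstream of its reverse-strand mate, so the reads point inward. In an everted pair the reverse-strand read maps upstream, so the reads point outward, which is the pattern expected across a tandem duplication junction.

```r
# Toy illustration (hypothetical helper, not part of ExomeDepth):
# flag read pairs whose orientation is everted (outward facing).
is_everted <- function(pos, strand, mate_pos, mate_strand) {
  # The two mates must map to opposite strands
  opposite <- strand != mate_strand
  # Strand of whichever mate has the leftmost mapping position
  leftmost_strand <- ifelse(pos <= mate_pos, strand, mate_strand)
  # Leftmost read on the reverse strand means the pair points outward
  opposite & leftmost_strand == "-"
}

pairs <- data.frame(
  pos         = c(1000, 5000, 2000),
  strand      = c("+", "-", "+"),
  mate_pos    = c(1300, 5400, 2250),
  mate_strand = c("-", "+", "+"),
  stringsAsFactors = FALSE
)
pairs$everted <- is_everted(pairs$pos, pairs$strand,
                            pairs$mate_pos, pairs$mate_strand)
print(pairs)
```

Only the second pair is flagged: its reverse-strand read lies upstream of the forward-strand mate. The first pair has the usual inward orientation and the third has both mates on the same strand.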

Value

A data frame that contains the region and the number of identified reads in each bin.

Note

This function calls a lower-level function called XXX that processes each BAM file individually.

References

Medvedev et al (2009) <https://doi.org/10.1038/nmeth.1374> "Computational methods for discovering structural variation with next-generation sequencing"

Examples


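library(ExomeDepth)
# Gene-level bins shipped with ExomeDepth (hg19 coordinates)
data(genes.hg19)
# Small example BAM file distributed with the package
bam_file <- system.file('extdata/minimum_1_25630000_25650000.bam',
                        package = 'ExomeDepth')
# Restrict the bins to the genes whose name starts with TTC34
genes.hg19.TTC <- subset(genes.hg19, grepl(pattern = '^TTC34', genes.hg19[['name']]))
# Count everted reads with min.mapq = 0, then requiring a mapping quality of at least 35
print(count.everted.reads(bed.frame = genes.hg19.TTC, bam.files = bam_file, min.mapq = 0))
print(count.everted.reads(bed.frame = genes.hg19.TTC, bam.files = bam_file, min.mapq = 35))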

ExomeDepth documentation built on Nov. 3, 2022, 5:05 p.m.