reportBadRegionsSummary: Sums up the coverage quality


View source: R/reports.R

Description

The function reportBadRegionsSummary creates a summary report containing all regions of interest, their coverage quality and the corresponding gene (name and geneID).

Usage

reportBadRegionsSummary(threshold1, threshold2, percentage1, percentage2, 
                        coverage_indicators, mart, output)

Arguments

threshold1

Integer. Minimum number of reads that must be registered for a sample for its coverage to be classified as acceptable.

threshold2

Integer. Minimum number of reads that must be registered for a sample for its coverage to be classified as good.

percentage1

Float. Fraction of samples (e.g. 0.80) that must have a coverage of at least threshold1 for the position to be classified as acceptably covered.

percentage2

Float. Fraction of samples (e.g. 0.90) that must have a coverage of at least threshold2 for the position to be classified as well covered.

coverage_indicators

List object, return value of function determineCoverageQuality or determineRegionsOfInterest.

mart

Mart as defined in the manual for the package 'biomaRt'. If the human genome (hg19) is to be used, an empty string may be provided and the mart is generated automatically.

output

The folder to write the output file into. If output is just an empty string, no output file is written out.
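The interplay of the threshold and percentage arguments can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper classify_position, the labels "bad", "acceptable" and "good", and the toy coverage vectors are assumptions made for illustration only.

```r
# Hypothetical helper (not part of BadRegionFinder): classify one position
# from the per-sample read counts observed at that position.
classify_position <- function(cov, threshold1, threshold2,
                              percentage1, percentage2) {
  # Fraction of samples reaching each threshold at this position
  frac_acceptable <- mean(cov >= threshold1)
  frac_good <- mean(cov >= threshold2)
  if (frac_good >= percentage2) {
    "good"
  } else if (frac_acceptable >= percentage1) {
    "acceptable"
  } else {
    "bad"
  }
}

# Toy read counts for five samples at three different positions
pos_a <- c(150, 120, 110, 130, 105)
pos_b <- c(25, 30, 150, 40, 120)
pos_c <- c(5, 10, 150, 3, 8)

class_a <- classify_position(pos_a, 20, 100, 0.80, 0.90)
class_b <- classify_position(pos_b, 20, 100, 0.80, 0.90)
class_c <- classify_position(pos_c, 20, 100, 0.80, 0.90)
```

With the thresholds from the Examples section, all samples at pos_a reach 100 reads, all samples at pos_b reach 20 reads but only two of five reach 100, and only one sample at pos_c reaches 20 reads.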

Details

To gain an overview of the coverage quality, a summary file may be created by the function reportBadRegionsSummary. The function may either take information on the whole genome (output from determineCoverage with TRonly=FALSE, processed using determineCoverageQuality) as an input, or information on the target regions (output from determineCoverage with TRonly=TRUE, processed using determineCoverageQuality), or information on a selection of regions of interest (output from determineRegionsOfInterest).

Wherever consecutive bases feature the same coverage quality, they are merged into a single region. Although it is not directly reported whether a region contains on-target or off-target bases, this information can be derived from the coverage quality: all off-target bases have an even number characterizing the coverage quality; all on-target bases have an odd number.
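The merging of consecutive bases with equal coverage quality, and the odd/even reading of the quality codes, can be sketched in base R using run-length encoding. This is an illustration under stated assumptions, not the package's code: the vector quality holds made-up codes for ten consecutive positions on a single chromosome, and the data frame regions is a stand-in for the GRanges object the function actually returns.

```r
# Toy per-base coverage quality codes for positions 1..10 (illustrative values)
quality <- c(1, 1, 1, 3, 3, 0, 0, 0, 0, 5)

# Collapse runs of identical codes into regions
runs <- rle(quality)
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

regions <- data.frame(
  start = starts,
  end = ends,
  quality = runs$values,
  # Odd code: on target; even code: off target
  on_target = runs$values %% 2 == 1
)
```

Here the ten bases collapse into four regions (1-3, 4-5, 6-9 and 10), of which only the third is off target.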

For each summed up region the gene that is most likely to be targeted by the original experiment gets reported using biomaRt. If no gene can be found, "NA" is saved for the corresponding region. If not all bases in the summed up region cover a gene, the gene gets reported for the whole region nonetheless.

The output file is saved as: "BadCoverageSummarythreshold1;percentage1;threshold2;percentage2.txt". The output file may be visualized using plotSummary.

Value

A GRanges object is returned. It summarizes those adjacent regions that feature the same coverage quality. The metadata columns contain the coverage quality of the region as well as the name and the geneID of the gene located in the corresponding region.

Author(s)

Sarah Sandmann <sarah.sandmann@uni-muenster.de>

References

More information on the R/Bioconductor package 'biomaRt' may be found at:

http://www.bioconductor.org/packages/release/bioc/html/biomaRt.html

See Also

BadRegionFinder, determineCoverage, determineCoverageQuality, determineRegionsOfInterest, reportBadRegionsDetailed, reportBadRegionsGenes, plotSummary, plotDetailed, plotSummaryGenes, determineQuantiles

Examples

# Load the package itself and the hg19 reference genome
library("BadRegionFinder")
library("BSgenome.Hsapiens.UCSC.hg19")

threshold1 <- 20
threshold2 <- 100
percentage1 <- 0.80
percentage2 <- 0.90
sample_file <- system.file("extdata", "SampleNames.txt", 
                           package = "BadRegionFinder")
samples <- read.table(sample_file)
bam_input <- system.file("extdata", package = "BadRegionFinder")
output <- system.file("extdata", package = "BadRegionFinder")
target_regions <- system.file("extdata", "targetRegions.bed",
                              package = "BadRegionFinder")
targetRegions <- read.table(target_regions, header = FALSE,
                            stringsAsFactors = FALSE)

coverage_summary <- determineCoverage(samples, bam_input, targetRegions, output,
                                      TRonly = TRUE)
coverage_indicators <- determineCoverageQuality(threshold1, threshold2,
                                                percentage1, percentage2,
                                                coverage_summary)
# An empty string for mart lets the function generate the hg19 mart itself
badCoverageSummary <- reportBadRegionsSummary(threshold1, threshold2, percentage1,
                                              percentage2, coverage_indicators,
                                              "", output)

BadRegionFinder documentation built on Nov. 8, 2020, 5:24 p.m.