reportBadRegionsGenes: Sums up the coverage quality on a gene basis

Description Usage Arguments Details Value Author(s) See Also Examples

View source: R/reports.R

Description

The function reportBadRegionsGenes creates a summary report considering the coverage quality on a genewise level. Taking the output of reportBadRegionsSummary as an input, the coverage quality for every previously identified gene is reported.

Usage

1
2
reportBadRegionsGenes(threshold1, threshold2, percentage1, percentage2, 
                      badCoverageSummary, output)

Arguments

threshold1

Integer, threshold defining the number of reads that have to be registered for a sample that its coverage is classified as acceptable.

threshold2

Integer, threshold defining the number of reads that have to be registered for a sample that its coverage is classified as good.

percentage1

Float, defining the percentage of samples that have to feature a coverage of at least threshold1 so that the position is classified as acceptably covered.

percentage2

Float, defining the percentage of samples that have to feature a coverage of at least threshold2 so that the position is classified as well covered.

badCoverageSummary

Data frame object, return value of function reportBadRegionsSummary.

output

The folder to write the output file into. If output is just an empty string, no output file is written out.

Details

To gain an overview of the coverage quality of each targeted/covered gene, a summary file may be created by the function reportBadRegionsGenes. The function takes the output of reportBadRegionsSummary as an input.

All regions covering the same gene are summed up in the following way:

The number of bases falling into each quality category is summed up. Thereby, regions which were orignially targeted may easily be separated from those which were not, as targeted regions always feature an uneven number characterizing their coverage quality. If a region is broader than the detected gene, but the quality category is the same for the whole region, the whole region is assigned to the gene.

If no gene is reported in the input file, the coverage quality is summed up for a gene named "NA".

The output file is saved as: "BadCoverageGenesthreshold1;percentage1;threshold2;percentage2.txt". The output file may be visualized using plotSummaryGenes.

Value

A data frame object is returned. The first column contains the name and the geneID of the gene. The following columns contain the percentage of bases falling into the following categories: bad region off traget, bad region on target, acceptable region off target, acceptable region on target, good region off target, good region on target.

Author(s)

Sarah Sandmann <sarah.sandmann@uni-muenster.de>

See Also

BadRegionFinder, determineCoverage, determineCoverageQuality, determineRegionsOfInterest, reportBadRegionsSummary, reportBadRegionsDetailed, plotSummary, plotDetailed, plotSummaryGenes, determineQuantiles

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
library("BSgenome.Hsapiens.UCSC.hg19")

threshold1 <- 20
threshold2 <- 100
percentage1 <- 0.80
percentage2 <- 0.90
sample_file <- system.file("extdata", "SampleNames.txt", 
                           package = "BadRegionFinder")
samples <- read.table(sample_file)
bam_input <- system.file("extdata", package = "BadRegionFinder")
output <- system.file("extdata", package = "BadRegionFinder")
target_regions <- system.file("extdata", "targetRegions.bed",
                              package = "BadRegionFinder")
targetRegions <- read.table(target_regions, header = FALSE,
                            stringsAsFactors = FALSE)

coverage_summary <- determineCoverage(samples, bam_input, targetRegions, output,
                                      TRonly = TRUE)
coverage_indicators <- determineCoverageQuality(threshold1, threshold2,
                                                percentage1, percentage2,
                                                coverage_summary)
badCoverageSummary <- reportBadRegionsSummary(threshold1, threshold2, 
                                              percentage1, percentage2,
                                              coverage_indicators, "",output)
badCoverageGenes <- reportBadRegionsGenes(threshold1, threshold2, percentage1,
                                          percentage2, badCoverageSummary, output)

BadRegionFinder documentation built on Nov. 8, 2020, 5:24 p.m.