determineCoverageQuality: Classifies the determined coverage

Description

The coverage previously determined for all samples (using determineCoverage with TRonly = TRUE or TRonly = FALSE) is combined and classified into six categories: bad coverage off target, bad coverage on target, acceptable coverage off target, acceptable coverage on target, good coverage off target, good coverage on target. The thresholds that separate these categories are user-defined.

Usage

determineCoverageQuality(threshold1, threshold2, percentage1, percentage2, 
                         coverage_summary)

Arguments

threshold1

Integer. Minimum number of reads that a sample must have for its coverage to be classified as acceptable.

threshold2

Integer. Minimum number of reads that a sample must have for its coverage to be classified as good. To obtain useful results, threshold2 has to be greater than threshold1.

percentage1

Float. Percentage of samples that must have a coverage of at least threshold1 for the position to be classified as acceptably covered.

percentage2

Float. Percentage of samples that must have a coverage of at least threshold2 for the position to be classified as well covered. To obtain useful results, percentage2 should be greater than zero.

coverage_summary

GRangesList object, return value of function determineCoverage.

Details

Every chromosome is analyzed individually. First, the coverage of each sample is categorized according to threshold1 and threshold2 into three different categories:

bad coverage: less than threshold1 reads

acceptable coverage: at least threshold1, but less than threshold2 reads

good coverage: at least threshold2 reads
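
The three per-sample categories above can be illustrated with a minimal sketch. The helper classify_sample_coverage is hypothetical and not part of BadRegionFinder; it only mirrors the interval boundaries described here, using base R's cut with left-closed intervals.

```r
# Hypothetical helper (not part of BadRegionFinder): assigns each read count
# to one of the three per-sample categories described above.
classify_sample_coverage <- function(reads, threshold1, threshold2) {
  stopifnot(threshold1 < threshold2)
  cut(reads,
      breaks = c(-Inf, threshold1, threshold2, Inf),
      labels = c("bad", "acceptable", "good"),
      right = FALSE)  # intervals are closed on the left: [lower, upper)
}

# Made-up read counts for six samples at one position
reads <- c(0, 19, 20, 99, 100, 250)
sample_classes <- classify_sample_coverage(reads, threshold1 = 20,
                                           threshold2 = 100)
as.character(sample_classes)
# "bad" "bad" "acceptable" "acceptable" "good" "good"
```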

Subsequently, this information is combined with the defined percentages to obtain a numerically coded quality value for each region stored in the previously created list object coverage_summary:

0: off target; fewer than percentage1 percent of all samples have a good or acceptable coverage (bad region)

1: on target; fewer than percentage1 percent of all samples have a good or acceptable coverage (bad region)

2: off target; at least percentage1 percent of all samples have a good or acceptable coverage, but less than percentage2 percent of all samples have a good coverage (acceptable region)

3: on target; at least percentage1 percent of all samples have a good or acceptable coverage, but less than percentage2 percent of all samples have a good coverage (acceptable region)

4: off target; at least percentage2 percent of all samples have a good coverage (good region)

5: on target; at least percentage2 percent of all samples have a good coverage (good region)
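
As a rough illustration of the coding scheme above, the following sketch computes the indicator for a single position. The function quality_indicator is hypothetical and is not the package's implementation. It assumes that percentage1 and percentage2 are given as fractions between 0 and 1 (as in the Examples section) and that a position meeting the "good" criterion is coded as good regardless of the percentage1 check.

```r
# Hypothetical sketch (not the package's code): derive the 0-5 indicator for
# one position from the read counts of all samples at that position.
# Assumption: percentage1 and percentage2 are fractions in [0, 1].
quality_indicator <- function(reads, on_target, threshold1, threshold2,
                              percentage1, percentage2) {
  frac_acceptable <- mean(reads >= threshold1)  # good or acceptable samples
  frac_good <- mean(reads >= threshold2)        # good samples only
  level <- if (frac_good >= percentage2) {
    2L  # good region
  } else if (frac_acceptable >= percentage1) {
    1L  # acceptable region
  } else {
    0L  # bad region
  }
  # Even codes are off target, odd codes are on target
  2L * level + as.integer(on_target)
}

# Made-up read counts for ten samples: 9 of 10 reach 20 reads, 8 of 10 reach 100
reads <- c(150, 120, 110, 105, 130, 101, 100, 140, 30, 10)
code_acceptable <- quality_indicator(reads, on_target = TRUE, 20, 100, 0.80, 0.90)
code_good <- quality_indicator(rep(200, 10), on_target = FALSE, 20, 100, 0.80, 0.90)
code_bad <- quality_indicator(rep(5, 10), on_target = TRUE, 20, 100, 0.80, 0.90)
c(code_acceptable, code_good, code_bad)
# 3 4 1
```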

Value

A list is returned. Every component contains the coverage information of one chromosome as a GRanges object. The metadata columns contain the actual coverage of each sample at a specific position. In addition, the column 'TargetBases' indicates whether the considered region or position contains target bases (value 1) or not (value 0). The column 'indicator' contains the coverage quality code of the corresponding region/position.

If a chromosome is neither covered nor targeted and TRonly=FALSE, the GRanges object contains only a single line that spans the whole chromosome. If TRonly=TRUE, the corresponding component is "NA".
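
A returned list of this shape could be filtered as sketched below. The object toy_result is a hand-built, hypothetical stand-in for the return value (the sample coverage columns are omitted, and the coordinates and values are made up); only the column names 'TargetBases' and 'indicator' are taken from the description above. The sketch assumes the GenomicRanges package is installed.

```r
library(GenomicRanges)

# Hypothetical stand-in for the return value: one GRanges per chromosome,
# plus an NA component as described for an uncovered, untargeted chromosome
# when TRonly = TRUE.
toy_result <- list(
  GRanges(seqnames = "2",
          ranges = IRanges(start = c(1, 101, 201), end = c(100, 200, 300)),
          TargetBases = c(0, 1, 1),
          indicator = c(0, 1, 5)),
  NA
)

# Keep only the components that actually are GRanges objects
is_granges <- vapply(toy_result, function(x) is(x, "GRanges"), logical(1))
covered <- toy_result[is_granges]

# Per chromosome, extract the on-target regions with bad coverage (code 1)
bad_on_target <- lapply(covered, function(gr) gr[gr$indicator == 1])
bad_on_target[[1]]
```

The same subsetting works for any other code, for example gr$indicator %in% c(0, 1) to collect all bad regions regardless of target status.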

Author(s)

Sarah Sandmann <sarah.sandmann@uni-muenster.de>

See Also

BadRegionFinder, determineCoverage, determineRegionsOfInterest, reportBadRegionsSummary, reportBadRegionsDetailed, reportBadRegionsGenes, plotSummary, plotDetailed, plotSummaryGenes, determineQuantiles

Examples

library(BadRegionFinder)

# Read-count thresholds for "acceptable" and "good" coverage
threshold1 <- 20
threshold2 <- 100
# Share of samples that must reach threshold1 and threshold2, respectively
percentage1 <- 0.80
percentage2 <- 0.90

# Sample names, BAM input folder, and output folder shipped with the package
sample_file <- system.file("extdata", "SampleNames.txt", 
                           package = "BadRegionFinder")
samples <- read.table(sample_file)
bam_input <- system.file("extdata", package = "BadRegionFinder")
output <- system.file("extdata", package = "BadRegionFinder")

# Target regions in BED format
target_regions <- system.file("extdata", "targetRegions.bed",
                              package = "BadRegionFinder")
targetRegions <- read.table(target_regions, header = FALSE,
                            stringsAsFactors = FALSE)

# Determine the coverage of all samples, then classify its quality
coverage_summary <- determineCoverage(samples, bam_input, targetRegions, output,
                                      TRonly = FALSE)
coverage_indicators <- determineCoverageQuality(threshold1, threshold2,
                                                percentage1, percentage2,
                                                coverage_summary)

BadRegionFinder documentation built on Nov. 8, 2020, 5:24 p.m.