determineCoverage: Determines the coverage (recommended for whole-genome...


View source: R/BadRegionFinder.R

Description

BadRegionFinder performs a coverage analysis of several samples at once. The first, essential step of the analysis pipeline, the coverage determination, is performed by the function determineCoverage. The whole genome is scanned, and wherever a covered base or an originally targeted base is found, detailed information about that position is written out.

Usage

determineCoverage(samples, bam_input, targetRegions, output, TRonly)

Arguments

samples

Data frame object containing the names of the samples to be analyzed (in one column).

bam_input

Folder containing the alignment data in BAM and BAI format, or a BamFileList object.

targetRegions

Data frame or GRanges object containing the target regions to be analyzed (first column: chromosome, second column: start position, third column: end position).

output

The folder to write the output files into. If an empty string is provided, no files are written out.

TRonly

Boolean, indicating whether the coverage of the whole genome should be analyzed and reported (FALSE) or the coverage of the target regions only (TRUE).
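As a minimal sketch of the input shapes described above, the samples and targetRegions arguments can also be built by hand as plain data frames. The sample names, chromosome labels and coordinates below are hypothetical and serve only as illustration; the chromosome naming ("2" versus "chr2") is an assumption and has to match the naming used in the BAM files.

```r
# Hypothetical sample names: one column, one row per sample
samples <- data.frame(V1 = c("sampleA", "sampleB"),
                      stringsAsFactors = FALSE)

# Hypothetical target regions: chromosome, start position, end position
targetRegions <- data.frame(
  chr   = c("2", "2", "17"),
  start = c(198266400L, 198267200L, 7577000L),
  end   = c(198266900L, 198267800L, 7577600L),
  stringsAsFactors = FALSE
)

# Basic sanity checks before passing the objects to determineCoverage
stopifnot(ncol(samples) == 1,
          ncol(targetRegions) == 3,
          all(targetRegions$start <= targetRegions$end))
```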

Details

The coverage determination performed by the function determineCoverage consists of several steps:

For every sample defined in samples, the coverage is determined using the function coverage ("Determine Coverage"). To combine the coverage information with information on whether a set of bases was originally targeted by a sequencing experiment, the targetRegions are processed ("Determine target bases"). Finally, the information is combined ("Combine information"): positions where no sample shows any coverage and no target base is registered are collapsed into regions. All other positions are reported basewise.
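The "Combine information" step can be illustrated with a toy example in base R. This is only a sketch of the idea (collapse uncovered, untargeted stretches; report everything else per base) and not the package's actual implementation; the vectors cov_a, cov_b and target are hypothetical data on a 12-base "chromosome".

```r
# Toy per-base coverage for two hypothetical samples
cov_a  <- c(0, 0, 0, 5, 6, 0, 0, 0, 0, 0, 2, 0)
cov_b  <- c(0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 3, 0)
target <- c(0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0)  # 1 = originally targeted base

# A position is informative if any sample covers it or it is a target base
informative <- (cov_a > 0) | (cov_b > 0) | (target == 1)

# Run-length encode to locate consecutive stretches of the same kind
runs   <- rle(informative)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

rows <- lapply(seq_along(runs$values), function(i) {
  if (runs$values[i]) {
    # Informative stretch: one row per base
    pos <- starts[i]:ends[i]
    data.frame(start = pos, end = pos,
               cov_a = cov_a[pos], cov_b = cov_b[pos],
               TargetBases = target[pos])
  } else {
    # Uncovered, untargeted stretch: a single collapsed row
    data.frame(start = starts[i], end = ends[i],
               cov_a = 0, cov_b = 0, TargetBases = 0)
  }
})
combined <- do.call(rbind, rows)
print(combined)
```

Each uncovered, untargeted stretch ends up as one row spanning the whole stretch, while covered or targeted positions keep one row per base.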

Files get written out in the form: "Summary_chr<chromosomename>.txt".

As sequencing often does not mean whole-genome or whole-exome sequencing but targeted sequencing, the function determineCoverage provides a switch: TRonly. If misaligned reads in a targeted sequencing experiment are to be analyzed, it is advisable to set TRonly to FALSE. If only the coverage of the targeted regions is of interest, it is advisable to set TRonly to TRUE.

Value

A GRangesList is returned. Every GRanges object contains the coverage information of one chromosome. The metadata columns contain the coverage of each sample at a specific position. In addition, the column 'TargetBases' indicates whether the considered region or position contains target bases (value 1) or not (value 0). A region cannot contain both, as two separate regions would be defined in that case.

If a chromosome is neither covered nor targeted, the GRanges object contains only a single line. If TRonly=FALSE, this line covers the whole chromosome. If TRonly=TRUE, the start and end positions of the corresponding chromosome are set to zero.

Author(s)

Sarah Sandmann <sarah.sandmann@uni-muenster.de>

See Also

BadRegionFinder, determineCoverageQuality, determineRegionsOfInterest, reportBadRegionsSummary, reportBadRegionsDetailed, reportBadRegionsGenes, plotSummary, plotDetailed, plotSummaryGenes, determineQuantiles

Examples

# Sample names shipped with the package (one name per row)
sample_file <- system.file("extdata", "SampleNames.txt",
                           package = "BadRegionFinder")
samples <- read.table(sample_file)

# Folder with the example BAM/BAI files; output is written to the same folder
bam_input <- system.file("extdata", package = "BadRegionFinder")
output <- system.file("extdata", package = "BadRegionFinder")

# Target regions: chromosome, start position, end position
target_regions <- system.file("extdata", "targetRegions.bed",
                              package = "BadRegionFinder")
targetRegions <- read.table(target_regions, header = FALSE,
                            stringsAsFactors = FALSE)

# Analyze the whole genome, not only the target regions
coverage_summary <- determineCoverage(samples, bam_input, targetRegions, output,
                                      TRonly = FALSE)

BadRegionFinder documentation built on Nov. 8, 2020, 5:24 p.m.