BadRegionFinder performs a coverage analysis of several samples at a time. The first, essential step of the analysis pipeline, the coverage determination, is performed by the function determineCoverage. The whole genome is scanned, and wherever a covered base is registered or an originally targeted base is detected, detailed information on this position is written out.
determineCoverage(samples, bam_input, targetRegions, output, TRonly)
samples: Data frame object containing the names of the samples to be analyzed (in one column).
bam_input: Folder containing the alignment data in bam and bai format, or a BamFileList.
targetRegions: Data frame or GRanges object containing the target regions to be analyzed (chromosome: first column, start position: second column, end position: third column).
output: The folder to write the output files into. If an empty string is provided, no files are written out.
TRonly: Boolean, indicating whether the coverage of the whole genome should be analyzed and reported (TRonly=FALSE) or only the coverage of the target regions (TRonly=TRUE).
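The input objects described above can also be built by hand. The following is a minimal sketch in base R; the sample names, chromosome names and coordinates are made up for illustration, and it is an assumption here that the chromosome names have to match those used in the bam files.

```r
# Hypothetical sample names: one name per row, in a single column.
samples <- data.frame(V1 = c("sampleA", "sampleB"), stringsAsFactors = FALSE)

# Hypothetical target regions in the documented column order:
# chromosome, start position, end position.
targetRegions <- data.frame(
  chr   = c("2", "2", "17"),
  start = c(1000L, 5000L, 7500L),
  end   = c(1200L, 5300L, 7800L),
  stringsAsFactors = FALSE
)

# Basic sanity checks before passing the objects to determineCoverage().
stopifnot(ncol(samples) == 1)
stopifnot(ncol(targetRegions) >= 3)
stopifnot(all(targetRegions$start <= targetRegions$end))
```

Reading both tables from files with read.table, as in the package example, yields objects of the same shape.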
The coverage determination performed by the function determineCoverage comprises several steps:
For every sample defined in samples, the coverage is determined using the function coverage ("Determine Coverage"). To combine the coverage information with information on whether bases were originally targeted by a sequencing experiment, the targetRegions are processed ("Determine target bases"). Finally, the information is combined ("Combine information"): positions where no sample shows any coverage and no target base is registered are summed up. All other positions are reported basewise.
Files get written out in the form: "Summary_chr<chromosomename>.txt".
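To illustrate the idea behind the "Combine information" step, here is a toy sketch in base R. It is not the package's implementation: the coverage vectors, the sample names and the 12-base "chromosome" are invented, and the result is a plain data frame rather than a GRanges object. Runs of positions without coverage and without target bases are collapsed into a single region; all other positions are kept basewise.

```r
# Toy per-base coverage of two hypothetical samples on a 12-base chromosome.
cov_a <- c(0, 0, 0, 5, 6, 0, 0, 0, 0, 2, 0, 0)
cov_b <- c(0, 0, 0, 4, 0, 0, 0, 0, 0, 3, 0, 0)

# Target flag per base: 1 inside a target region (here bases 4 to 6), else 0.
target <- numeric(12)
target[4:6] <- 1

# A position is "empty" if no sample covers it and it is not targeted.
empty <- cov_a == 0 & cov_b == 0 & target == 0

# Determine runs of empty and non-empty positions.
runs   <- rle(empty)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

pieces <- lapply(seq_along(runs$values), function(i) {
  if (runs$values[i]) {
    # Collapse a run of empty positions into one region.
    data.frame(start = starts[i], end = ends[i],
               sampleA = 0, sampleB = 0, TargetBases = 0)
  } else {
    # Report covered or targeted positions base by base.
    pos <- starts[i]:ends[i]
    data.frame(start = pos, end = pos,
               sampleA = cov_a[pos], sampleB = cov_b[pos],
               TargetBases = target[pos])
  }
})
summary_toy <- do.call(rbind, pieces)
print(summary_toy)
```

Note that base 6 is reported individually although no sample covers it, because it lies inside the target region.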
As sequencing often means targeted sequencing rather than whole-genome or whole-exome sequencing, the function determineCoverage contains a switch: TRonly. If misaligned reads in a targeted sequencing experiment are to be analyzed, it is advisable to set TRonly=FALSE. If only the coverage of the targeted regions is of interest, it is advisable to set TRonly=TRUE.
A GRangesList is returned. Every GRanges object contains the coverage information of one chromosome. The metadata columns contain information on the concrete coverage of each sample at a specific position. Furthermore, the column 'TargetBases' contains information on whether the considered region or position contains target bases (value 1) or not (value 0). A region cannot contain both as two regions would be defined in this case.
If a chromosome is neither covered nor targeted, the GRanges object contains only a single line covering the whole chromosome. If TRonly=TRUE, the start and end position of the corresponding chromosome are set to zero.
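The structure of the return value can be explored with the usual GenomicRanges accessors. The sketch below builds a small stand-in object by hand instead of calling determineCoverage; the sample column names (sampleA, sampleB) and the list element name (chr2) are assumptions made for illustration, only the column 'TargetBases' is taken from the description above.

```r
library(GenomicRanges)

# Toy stand-in for one chromosome of the returned GRangesList.
# sampleA and sampleB are hypothetical per-sample coverage columns.
gr_chr2 <- GRanges(
  seqnames    = "2",
  ranges      = IRanges(start = c(1, 4, 5, 6, 7), end = c(3, 4, 5, 6, 12)),
  sampleA     = c(0, 5, 6, 0, 0),
  sampleB     = c(0, 4, 0, 0, 0),
  TargetBases = c(0, 1, 1, 1, 0)
)
coverage_toy <- GRangesList(chr2 = gr_chr2)

# Select one chromosome and keep only the targeted positions.
chr2     <- coverage_toy[["chr2"]]
targeted <- chr2[chr2$TargetBases == 1]

# Targeted positions that are not covered by any sample.
uncovered <- targeted[targeted$sampleA == 0 & targeted$sampleB == 0]
print(uncovered)
```

The same subsetting should carry over to the object returned by determineCoverage once the actual metadata column names have been checked with mcols().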
Sarah Sandmann <firstname.lastname@example.org>
sample_file <- system.file("extdata", "SampleNames.txt", package = "BadRegionFinder")
samples <- read.table(sample_file)
bam_input <- system.file("extdata", package = "BadRegionFinder")
output <- system.file("extdata", package = "BadRegionFinder")
target_regions <- system.file("extdata", "targetRegions.bed", package = "BadRegionFinder")
targetRegions <- read.table(target_regions, header = FALSE, stringsAsFactors = FALSE)

coverage_summary <- determineCoverage(samples, bam_input, targetRegions, output, TRonly = FALSE)