The function reportBadRegionsDetailed creates a detailed, basewise report containing all regions of interest, the coverage of each sample at the corresponding positions, an indicator of whether the bases were originally targeted, their coverage quality, and the corresponding gene (name and geneID).
reportBadRegionsDetailed(threshold1, threshold2, percentage1, percentage2,
                         coverage_indicators, mart, samples, output)
threshold1 |
Integer, threshold defining the number of reads that must be registered for a sample for its coverage to be classified as acceptable. |
threshold2 |
Integer, threshold defining the number of reads that must be registered for a sample for its coverage to be classified as good. |
percentage1 |
Float, defining the percentage of samples that have to feature a coverage of at least |
percentage2 |
Float, defining the percentage of samples that have to feature a coverage of at least |
coverage_indicators |
List object, return value of function |
mart |
mart as defined in the manual for package 'biomaRt'. If the human genome (hg19) is to be used, an empty string may be provided and the mart is generated automatically. |
samples |
Data frame object containing the names of the samples to be analyzed (in one column). |
output |
The folder to write the output files into. If |
To provide more detailed information on the coverage quality, the function reportBadRegionsDetailed may create a file for every chromosome to be analyzed. The function accepts as input either information on the whole genome (output from determineCoverage with TRonly=FALSE, processed using determineCoverageQuality), information on the target regions (output from determineCoverage with TRonly=TRUE, processed using determineCoverageQuality), or information on a selection of regions of interest (output from determineRegionsOfInterest).
Unlike the summed-up variant reportBadRegionsSummary, this function reports information on every single base of interest (except for completely uncovered and untargeted regions, which are summed up). For every base, the report contains its position, the coverage of each sample, whether the base was originally targeted (value 1) or not (value 0), the coverage quality, and the most likely gene (name and geneID) that was targeted by the original experiment. The gene names and geneIDs are obtained from biomaRt. If no gene can be found at a position, "NA" is reported for the corresponding base.
The output files are saved as: "BadCoverageChromosome<chromosomename>;threshold1;percentage1;threshold2;percentage2.txt". The output file may be visualized using plotDetailed.
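Since one report file may be written per chromosome, it can be handy to collect them again after a run. The following is only a minimal sketch in base R: it relies on nothing but the "BadCoverageChromosome" prefix and the ".txt" extension stated above. The temporary folder and the dummy file names are hypothetical stand-ins for a real output folder, not names produced by the package.

```r
# Hypothetical stand-in for the 'output' folder, filled with empty dummy files.
# The "_demo" file names are illustrative only, not real package output.
output <- file.path(tempdir(), "brf_output_demo")
dir.create(output, showWarnings = FALSE)
invisible(file.create(file.path(output, c("BadCoverageChromosome1_demo.txt",
                                          "BadCoverageChromosomeX_demo.txt",
                                          "unrelated.txt"))))

# Collect every file that starts with the documented prefix and ends in .txt
report_files <- list.files(output,
                           pattern = "^BadCoverageChromosome.*\\.txt$",
                           full.names = TRUE)
basename(report_files)
```

In a real analysis, `output` would be the folder passed to reportBadRegionsDetailed, and each collected file could then be inspected or handed on for plotting.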
A list is returned. Every component contains the coverage information of one chromosome as a GRanges object. The metadata columns contain the coverage of each sample at a specific position. Furthermore, the column 'TargetBases' indicates whether the considered region or position contains target bases (value 1) or not (value 0). The column 'indicator' describes the coverage quality of the corresponding region/position (0: bad region off target; 1: bad region on target; 2: acceptable region off target; 3: acceptable region on target; 4: good region off target; 5: good region on target). In addition, the name and the geneID of the gene located at the corresponding position are saved.
If a chromosome is neither covered nor targeted, the component is "NA".
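The 'indicator' codes can be read as a combination of a quality level and the on/off-target flag. The sketch below, in base R, decodes the six documented codes into their labels and illustrates how such a code could be derived for a single position. Both helpers (decode_indicator, classify_position) are hypothetical and not part of BadRegionFinder. The classification rule is an assumption inferred from the argument descriptions: "good" if at least percentage2 of the samples reach threshold2, "acceptable" if at least percentage1 reach threshold1, "bad" otherwise. The package's own implementation may differ.

```r
# Labels for the documented 'indicator' codes 0 to 5
indicator_labels <- c(
  "bad region off target",
  "bad region on target",
  "acceptable region off target",
  "acceptable region on target",
  "good region off target",
  "good region on target"
)

# Hypothetical helper: translate indicator codes into readable labels
decode_indicator <- function(indicator) {
  stopifnot(all(indicator %in% 0:5))
  indicator_labels[indicator + 1]
}

# Hypothetical helper: derive an indicator for one position.
# ASSUMPTION: the rule below is inferred from the argument descriptions and
# is not taken from the package source.
classify_position <- function(coverage, targeted,
                              threshold1, threshold2,
                              percentage1, percentage2) {
  share_acceptable <- mean(coverage >= threshold1)  # fraction of samples
  share_good <- mean(coverage >= threshold2)
  quality <- if (share_good >= percentage2) {
    2
  } else if (share_acceptable >= percentage1) {
    1
  } else {
    0
  }
  # Even codes are off target, odd codes are on target
  2 * quality + as.integer(targeted)
}

# Five samples at one targeted position
code <- classify_position(c(150, 120, 30, 110, 105), targeted = 1,
                          threshold1 = 20, threshold2 = 100,
                          percentage1 = 0.80, percentage2 = 0.90)
decode_indicator(code)
```

Here all five samples reach 20 reads, but only four of five reach 100, which is below the 90% requirement, so under the assumed rule the position is acceptable rather than good.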
Sarah Sandmann <sarah.sandmann@uni-muenster.de>
More information on the R/Bioconductor package 'biomaRt' may be found at:
http://www.bioconductor.org/packages/release/bioc/html/biomaRt.html
BadRegionFinder, determineCoverage, determineCoverageQuality, determineRegionsOfInterest, reportBadRegionsSummary, reportBadRegionsGenes, plotSummary, plotDetailed, plotSummaryGenes, determineQuantiles
library("BSgenome.Hsapiens.UCSC.hg19")
threshold1 <- 20
threshold2 <- 100
percentage1 <- 0.80
percentage2 <- 0.90
sample_file <- system.file("extdata", "SampleNames.txt",
                           package = "BadRegionFinder")
samples <- read.table(sample_file)
bam_input <- system.file("extdata", package = "BadRegionFinder")
output <- system.file("extdata", package = "BadRegionFinder")
target_regions <- system.file("extdata", "targetRegions.bed",
                              package = "BadRegionFinder")
targetRegions <- read.table(target_regions, header = FALSE,
                            stringsAsFactors = FALSE)
coverage_summary <- determineCoverage(samples, bam_input, targetRegions, output,
                                      TRonly = TRUE)
coverage_indicators <- determineCoverageQuality(threshold1, threshold2,
                                                percentage1, percentage2,
                                                coverage_summary)
coverage_indicators_temp <- reportBadRegionsDetailed(threshold1, threshold2,
                                                     percentage1, percentage2,
                                                     coverage_indicators, "",
                                                     samples, output)