featureCoverage | R Documentation |
Computes read coverage along single- and multi-component features based on
genomic alignments. The coverage segments of component features are spliced into
continuous ranges, such as exons into transcripts or CDSs into ORFs. The results
can be obtained at single-nucleotide resolution (e.g. around start and stop
codons) or as mean coverage over relative bins, such as 100 bins for each
feature. The latter allows comparison of coverage trends among transcripts of
variable length. The results can be obtained for a single feature or for many
features (e.g. any number of transcripts) at once. Visualization of the coverage
results is facilitated by the downstream plotfeatureCoverage function.
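The splicing and relative binning described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by featureCoverage: the helper names splice_coverage and bin_coverage are hypothetical, the per-nucleotide coverage of each component is assumed to be available as a plain numeric vector in genomic order, and each feature is assumed to be at least Nbins positions long.

```r
# Hypothetical helpers for illustration; not part of systemPipeR.

# Concatenate per-component coverage vectors (e.g. exons) into one
# continuous vector. Components are assumed to be given in genomic order;
# minus-strand features are reversed so position 1 is the 5' end.
splice_coverage <- function(segments, strand = "+") {
  spliced <- unlist(segments, use.names = FALSE)
  if (strand == "-") rev(spliced) else spliced
}

# Summarize a coverage vector into Nbins segments of near-equal relative
# size, so features of different lengths become comparable.
bin_coverage <- function(cov, Nbins = 20, method = mean) {
  stopifnot(length(cov) >= Nbins)
  # Map each position to a bin index between 1 and Nbins
  bin_id <- ceiling(seq_along(cov) * Nbins / length(cov))
  unname(vapply(split(cov, bin_id), method, numeric(1)))
}

# Toy transcript with two exons of four nucleotides each
exons <- list(c(0, 2, 4, 4), c(6, 6, 2, 0))
tx_cov <- splice_coverage(exons, strand = "+")
binned <- bin_coverage(tx_cov, Nbins = 4, method = mean)
binned
```

With the toy input, each of the four bins summarizes two adjacent positions, so a transcript of any length would be reduced to the same number of values.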
featureCoverage(bfl, grl, resizereads = NULL, readlengthrange = NULL, Nbins = 20,
method = mean, fixedmatrix, resizefeatures, upstream, downstream,
outfile, overwrite = FALSE)
bfl |
Paths to BAM files provided as |
grl |
Genomic ranges provided as |
resizereads |
Positive integer defining the length alignments should be resized to prior to the
coverage calculation. |
readlengthrange |
Positive integer vector of length 2 defining the read length range to use for
the coverage calculation. Reads falling outside of the specified length range
will be excluded from the coverage calculation. For instance,
|
Nbins |
Single positive integer defining the number of segments the coverage of each feature should be binned into in order to obtain coverage summaries of constant length, e.g. for plotting purposes. |
method |
Defines the summary statistics to use for binning. The default is |
fixedmatrix |
If set to |
resizefeatures |
Needs to be set to |
upstream |
Single positive integer specifying the upstream extension length relative to the orientation of each feature in the genome. More details are given above. |
downstream |
Single positive integer specifying the downstream extension length relative to the orientation of each feature in the genome. More details are given above. |
outfile |
Default |
overwrite |
If set to |
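The effect of readlengthrange and resizereads on the input alignments can be pictured with a small base R sketch. It is not the function's implementation: the helper filter_and_resize and the data frame layout (start, end, strand) are hypothetical, treating the length range as inclusive is an assumption, and anchoring the resize at the 5' end of each read is also an assumption.

```r
# Hypothetical helper for illustration; not part of systemPipeR.
filter_and_resize <- function(reads, readlengthrange = NULL, resizereads = NULL) {
  width <- reads$end - reads$start + 1L
  if (!is.null(readlengthrange)) {
    stopifnot(length(readlengthrange) == 2)
    # Assumption: both ends of the length range are inclusive
    keep <- width >= readlengthrange[1] & width <= readlengthrange[2]
    reads <- reads[keep, , drop = FALSE]
  }
  if (!is.null(resizereads)) {
    plus <- reads$strand == "+"
    # Assumption: the 5' end of each read stays fixed during resizing
    reads$end[plus] <- reads$start[plus] + resizereads - 1L
    reads$start[!plus] <- reads$end[!plus] - resizereads + 1L
  }
  reads
}

# Three toy reads of length 30, 26 and 28
reads <- data.frame(start = c(1L, 10L, 50L),
                    end = c(30L, 35L, 77L),
                    strand = c("+", "-", "+"))
kept <- filter_and_resize(reads, readlengthrange = c(27, 30), resizereads = 1)
kept
```

In this toy case the 26 nt read is dropped and the two remaining reads are reduced to their first nucleotide, which is the kind of preprocessing that sharpens positional signals before coverage is computed.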
The function can return the following four distinct outputs. The settings that produce each of them are illustrated in the example section below.
(A) |
|
(B) |
|
(C) |
|
(D) |
|
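As a quick reference, the combinations of Nbins and fixedmatrix used for outputs (A) to (D) in the example section can be summarized in a small lookup. The helper output_type is hypothetical and only restates the settings shown in those examples; it does not call featureCoverage.

```r
# Hypothetical helper for illustration; mirrors the settings used in the
# examples on this page and is not part of systemPipeR.
output_type <- function(Nbins = 20, fixedmatrix = FALSE) {
  binned <- !is.null(Nbins)
  if (binned && !fixedmatrix) {
    "A"  # binned coverage
  } else if (!binned && fixedmatrix) {
    "B"  # coverage matrix upstream and downstream of start/stop codons
  } else if (binned && fixedmatrix) {
    "C"  # combined matrix for both binned and start/stop codon coverage
  } else {
    "D"  # Rle coverage objects, one for each query feature
  }
}

output_type(Nbins = 20, fixedmatrix = FALSE)
output_type(Nbins = NULL, fixedmatrix = TRUE)
```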
Thomas Girke
plotfeatureCoverage
## Construct SYSargs2 object from param and targets files
targetspath <- system.file("extdata", "targets.txt", package="systemPipeR")
dir_path <- system.file("extdata/cwl", package="systemPipeR")
args <- loadWorkflow(targets=targetspath, wf_file="hisat2/hisat2-mapping-se.cwl",
input_file="hisat2/hisat2-mapping-se.yml", dir_path=dir_path)
args <- renderWF(args, inputvars=c(FileName="_FASTQ_PATH1_", SampleName="_SampleName_"))
args
## Not run:
## Features from sample data of systemPipeRdata package
library(txdbmaker)
file <- system.file("extdata/annotation", "tair10.gff", package="systemPipeRdata")
txdb <- makeTxDbFromGFF(file=file, format="gff3", organism="Arabidopsis")
targetspath <- system.file("extdata", "targets.txt", package="systemPipeR")
dir_path <- system.file("extdata/cwl", package="systemPipeR")
args <- loadWorkflow(targets=targetspath, wf_file="hisat2/hisat2-mapping-se.cwl",
input_file="hisat2/hisat2-mapping-se.yml", dir_path=dir_path)
args <- renderWF(args, inputvars=c(FileName="_FASTQ_PATH1_", SampleName="_SampleName_"))
args <- runCommandline(args, make_bam = TRUE, dir = TRUE)
outpaths <- subsetWF(args , slot="output", subset=1, index=1)
file.exists(outpaths)
## (A) Generate binned coverage for two BAM files and 4 transcripts
grl <- cdsBy(txdb, "tx", use.names=TRUE)
fcov <- featureCoverage(bfl=BamFileList(outpaths[1:2]), grl=grl[1:4], resizereads=NULL,
readlengthrange=NULL, Nbins=20, method=mean, fixedmatrix=FALSE,
resizefeatures=TRUE, upstream=20, downstream=20,
outfile="results/featureCoverage.xls", overwrite=TRUE)
plotfeatureCoverage(covMA=fcov, method=mean, scales="fixed", scale_count_val=10^6)
## (B) Coverage matrix upstream and downstream of start/stop codons
fcov <- featureCoverage(bfl=BamFileList(outpaths[1:2]), grl=grl[1:4], resizereads=NULL,
readlengthrange=NULL, Nbins=NULL, method=mean, fixedmatrix=TRUE,
resizefeatures=TRUE, upstream=20, downstream=20,
outfile="results/featureCoverage_up_down.xls", overwrite=TRUE)
plotfeatureCoverage(covMA=fcov, method=mean, scales="fixed", scale_count_val=10^6)
## (C) Combined matrix for both binned and start/stop codon
fcov <- featureCoverage(bfl=BamFileList(outpaths[1:2]), grl=grl[1:4], resizereads=NULL,
readlengthrange=NULL, Nbins=20, method=mean, fixedmatrix=TRUE,
resizefeatures=TRUE, upstream=20, downstream=20,
outfile="results/featureCoverage_binned.xls", overwrite=TRUE)
plotfeatureCoverage(covMA=fcov, method=mean, scales="fixed", scale_count_val=10^6)
## (D) Rle coverage objects one for each query feature
fcov <- featureCoverage(bfl=BamFileList(outpaths[1:2]), grl=grl[1:4], resizereads=NULL,
readlengthrange=NULL, Nbins=NULL, method=mean, fixedmatrix=FALSE,
resizefeatures=TRUE, upstream=20, downstream=20,
outfile="results/featureCoverage_query.xls", overwrite=TRUE)
## End(Not run)