countRNA | R Documentation |
The FRASER package provides multiple functions to extract and count both split and non-spliced reads from bam files. See Detail and Functions for more information.
countRNAData(
fds,
NcpuPerSample = 1,
minAnchor = 5,
recount = FALSE,
BPPARAM = bpparam(),
genome = NULL,
junctionMap = NULL,
filter = TRUE,
minExpressionInOneSample = 20,
keepNonStandardChromosomes = TRUE,
countDir = file.path(workingDir(fds), "savedObjects", nameNoSpace(name(fds))),
...
)
getSplitReadCountsForAllSamples(
fds,
NcpuPerSample = 1,
junctionMap = NULL,
recount = FALSE,
BPPARAM = bpparam(),
genome = NULL,
countFiles = NULL,
keepNonStandardChromosomes = TRUE,
outDir = file.path(workingDir(fds), "savedObjects", nameNoSpace(name(fds)),
"splitCounts")
)
getNonSplitReadCountsForAllSamples(
fds,
splitCountRanges,
NcpuPerSample = 1,
minAnchor = 5,
recount = FALSE,
BPPARAM = bpparam(),
longRead = FALSE,
outDir = file.path(workingDir(fds), "savedObjects", nameNoSpace(name(fds)),
"nonSplitCounts")
)
addCountsToFraserDataSet(fds, splitCounts, nonSplitCounts)
countSplitReads(
sampleID,
fds,
NcpuPerSample = 1,
genome = NULL,
recount = FALSE,
keepNonStandardChromosomes = TRUE,
bamfile = bamFile(fds[, sampleID]),
pairedend = pairedEnd(fds[, sampleID]),
strandmode = strandSpecific(fds[, sampleID]),
cacheFile = getSplitCountCacheFile(sampleID, fds),
scanbamparam = scanBamParam(fds),
coldata = colData(fds)
)
mergeCounts(
countList,
fds,
junctionMap = NULL,
assumeEqual = FALSE,
spliceSiteCoords = NULL,
BPPARAM = SerialParam()
)
countNonSplicedReads(
sampleID,
splitCountRanges,
fds,
NcpuPerSample = 1,
minAnchor = 5,
recount = FALSE,
spliceSiteCoords = NULL,
longRead = FALSE
)
fds |
A |
NcpuPerSample |
A BiocParallel param object or a positive integer to configure the parallel backend of the internal loop per sample |
minAnchor |
Minimum overlap around the Donor/Acceptor for non spliced reads. Default to 5 |
recount |
if TRUE the cache is ignored and the bam file is recounted. |
BPPARAM |
the BiocParallel parameters for the parallelization |
genome |
NULL (default) or a character vector specifying the names of
the reference genomes that were used to align the reads for each sample.
The names have to be in a way accepted by the
|
junctionMap |
A object or file containing a map of all junctions of interest across all samples |
filter |
If TRUE, splice sites of introns with low read support in all samples are not considered when calculating the non-split reads. This helps to speed up the subsequent steps. |
minExpressionInOneSample |
The minimal split read count in at least one sample that is required for an intron to pass the filter. |
keepNonStandardChromosomes |
Logical value indicating if non standard
chromosomes should also be counted. Defaults to |
countDir |
The directory in which the tsv containing the position and counts of the junctions should be placed. |
... |
Further parameters passed on to Rsubread::featureCounts. |
countFiles |
If specified, the split read counts for all samples are read from the specified files. Should be a vector of paths to files containing the split read counts for the individual samples. Reading from files is only supported for tsv(.gz) or RDS files containing GRranges objects. The order of the individual sample files should correspond to the order of the samples in the fds. |
outDir |
The full path to the output folder containing the merged counts. If the given folder already exists and stores a SummarizedExperiment object, the counts from this folder will be read in and used in the following (i.e. the reads are not recounted), unless the option recount=TRUE is used. If this folder doesn't exist or if recount=TRUE, then it will be created after counting has finished. |
splitCountRanges |
The merged GRanges object containing the positions of all the introns in the dataset over all samples. |
longRead |
If TRUE, then the isLongRead option of Rsubread::featureCounts is used when counting the non spliced reads overlapping splice sites. |
splitCounts |
The SummarizedExperiment object containing the position and counts of all the introns in the dataset for all samples. |
nonSplitCounts |
The SummarizedExperiment object containing the position and non split read counts of all splice sites present in the dataset for all samples. |
sampleID |
The ID of the sample to be counted. |
bamfile |
The BAM file to be used to extract the counts. Defaults to
the BAM file defined in the |
pairedend |
|
strandmode |
0 (no, default), 1 (stranded), or 2 (revers) to specify the used protocol for the RNA-seq experiment. |
cacheFile |
File path to the cache, where counts are stored. |
scanbamparam |
The ScanBamParam object which is used for loading the
reads from the BAM file before counting. Defaults to the params
stored in the |
coldata |
The colData as given by the |
countList |
A list of GRanges objects containing the counts that should be merged into one object. |
assumeEqual |
Logical indicating whether all objects in
|
spliceSiteCoords |
A GRanges object containing the positions of the splice sites. If it is NULL, then splice sites coordinates are calculated first based on the positions of the junctions defined from the split reads. |
The functions described in this file extract and count both the split and the non-spliced reads from bam files.
countRNAData
is the main function that takes care of all
counting steps and returns a FraserDataSet containing the counts for all
samples in the fds.
getSplitReadCountsForAllSamples
counts split reads for all
samples and getNonSplitReadCountsForAllSamples
counts non
split reads overlapping splice sites for all samples.
addCountsToFraserDataSet
adds these counts to an existing fds.
countSplitReads
calculates the split read counts for a single
sample. countNonSplicedReads
counts the non split reads
overlapping with splice sites for a single sample.
mergeCounts
merges the counts from different samples into a
single count object, where the counts for junctions that are not present in
a sample are set to zero.
countRNAData
returns a FraserDataSet.
getSplitReadCountsForAllSamples
returns a GRanges
object.
getNonSplitReadCountsForAllSamples
returns a
GRanges object.
addCountsToFraserDataSet
returns a FraserDataSet.
countSplitReads
returns a GRanges object.
mergeCounts
returns a SummarizedExperiment object.
countNonSplicedReads
returns a GRanges object.
countRNAData()
: This method extracts and counts the split reads and
non spliced reads from RNA bam files.
getSplitReadCountsForAllSamples()
: This method creates a GRanges
object containing the split read counts from all
specified samples.
getNonSplitReadCountsForAllSamples()
: This method creates a GRanges
object containing the non split read counts at the
exon-intron boundaries inferred from the GRanges object
containing the positions of all the introns in this dataset.
addCountsToFraserDataSet()
: This method adds the split read and
non split read counts to a existing FraserDataSet
containing the settings.
countSplitReads()
: This method counts all split reads in a
bam file for a single sample.
mergeCounts()
: This method merges counts for multiple
samples into one SummarizedExperiment object.
countNonSplicedReads()
: This method counts non spliced reads based
on the given target (acceptor/donor) regions for a single sample.
# On Windows SNOW is the default for the parallele backend, which can be
# very slow for many but small tasks. Therefore, we will use
# for the example the SerialParam() backend.
if(.Platform$OS.type != "unix") {
register(SerialParam())
}
fds <- countRNAData(createTestFraserSettings())
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.