View source: R/rearrangement-utils.R
findCandidates2 | R Documentation |
This function identifies clusters of improper reads that are linked by the mate information in paired read sequencing platforms such as Illumina HiSeq.
findCandidates2(preprocess, rp = RearrangementParams())
preprocess |
A list of preprocessing data as constructed by |
rp |
A |
All reads from improper read pairs where the mates are separated by at least 10kb and both reads in the pair are mapped are read from the AlignmentViews object. A cluster of reads (all involved in improper pairs) is defined as follows:
genomic intervals demarcating improper read clusters are obtained by applying reduce to a GRanges representation of all improper reads
genomic intervals must be at least 115bp and no larger than 5000bp (default settings)
each cluster must contain at least 5 reads
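The three criteria above can be sketched with GenomicRanges. This is an illustration only, not the code used inside findCandidates2: the read coordinates are made up, and the variable names min_size, max_size, and min_reads are hypothetical stand-ins for the default settings quoted above.

```r
library(GenomicRanges)

## Toy improper reads on one chromosome (coordinates are made up)
starts <- c(1000, 1050, 1080, 1100, 1150, 1190,  # six overlapping reads
            9000, 9040,                           # only two reads
            20000)                                # a single read
reads <- GRanges("chr1", IRanges(start = starts, width = 100))

## 1. Merge overlapping reads into candidate cluster intervals
clusters <- reduce(reads)

## 2. Keep intervals within the size limits (hypothetical variable names)
min_size <- 115
max_size <- 5000
clusters <- clusters[width(clusters) >= min_size & width(clusters) <= max_size]

## 3. Keep clusters that contain at least min_reads reads
min_reads <- 5
n_reads <- countOverlaps(clusters, reads)
clusters <- clusters[n_reads >= min_reads]
clusters
```

In this toy data only the first group of reads survives: the single read is narrower than 115bp, and the two-read interval fails the read-count filter.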
Non-overlapping clusters that are linked by multiple improper read pairs are suggestive of a rearrangement. Linked tag clusters are identified by the function seqJunctionsInferredByPairedTags.
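The idea of linking clusters through shared read pairs can be illustrated in base R. This is a minimal sketch, not the package's linking function: the pairs table, its column names, and the min_pairs threshold are all hypothetical, since the text only says that links are supported by "multiple" pairs.

```r
## Hypothetical table: the cluster index hit by each read of an improper pair
pairs <- data.frame(first_cluster = c(1, 1, 1, 2, 1),
                    last_cluster  = c(3, 3, 3, 4, 2))

## Count the read pairs supporting each (cluster, cluster) link
link_counts <- aggregate(n ~ first_cluster + last_cluster,
                         data = transform(pairs, n = 1), FUN = sum)

## Keep links supported by at least min_pairs read pairs (assumed threshold)
min_pairs <- 2
linked <- link_counts[link_counts$n >= min_pairs, ]
linked
```

Here only the link between clusters 1 and 3 is retained, because it is the only one supported by more than one read pair.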
The genomic intervals defined by the linked tag clusters (also referred to as linked bins) are represented as a GRanges object with a variable called linked.to in mcols. The linked.to column is also a GRanges object. The GRanges object of the linked clusters, the improper read pairs supporting the link, and the set of all tags that map to either linked genomic interval are encapsulated in a Rearrangement object. Statistics calculated on each Rearrangement object include the fraction of all reads that link the two clusters (fractionLinkingTags), the types of rearrangements supported (rearrangementType), the modal rearrangement, and the percent of read pairs supporting the modal rearrangement. The collection of all linked clusters for a given sample is represented as a RearrangementList.
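The summary statistics described above amount to simple counting. The base R sketch below shows one plausible reading of them; it does not call the package's fractionLinkingTags or rearrangementType functions, and the type labels and tag counts are invented for illustration.

```r
## Hypothetical rearrangement type assigned to each supporting read pair
types <- c("typeA", "typeA", "typeB", "typeA", "typeC")

## Modal rearrangement and the percent of read pairs supporting it
type_counts <- table(types)
modal_type <- names(type_counts)[which.max(type_counts)]
percent_modal <- 100 * max(type_counts) / length(types)

## Fraction of tags linking the two clusters, read here as
## (tags in linking pairs) / (all tags mapped to either interval);
## both counts are made up
n_linking_tags <- 10
n_all_tags <- 40
fraction_linking <- n_linking_tags / n_all_tags
```

With these inputs the modal type is "typeA", supported by 60 percent of the read pairs, and a quarter of the tags link the two clusters.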
See seqJunctionsInferredByPairedTags2 for additional details regarding the clustering of tags from improper pairs and the identification of linked tag clusters. See rearrangementType for the type of rearrangement supported by each read pair. See preprocessData for constructing a list of elements obtained from preprocessing.
## Load list of preprocessed data (see preprocessData)
data(pdata, package="trellis")
## Parameters for finding candidate rearrangements
rparam <- RearrangementParams(min_number_tags_per_cluster=5,
                              rp_separation=10e3)
## List of candidate rearrangements
rlist <- findCandidates2(pdata, rparam)
rlist