Strand invaders | R Documentation |
findStrandInvaders detects strand invasion artefacts in the CTSS data.
removeStrandInvaders removes them.
Strand invaders are artefacts produced by template switching reactions
used in methods such as nanoCAGE and its derivatives (C1 CAGE, ...).
They are described in detail in Tang et al., 2013. Briefly, these
artefacts create CAGE-like signal downstream of genome sequences highly
similar to the tail of template-switching oligonucleotides, which is TATAGGG
in recent (2017) nanoCAGE protocols. Since these artefacts
represent truncated cDNAs, they do not indicate promoter regions. It is
therefore advisable to remove these artefacts. Moreover, when a sample
barcode is near the linker sequence (which is not the case in recent
nanoCAGE protocols), the strand-invasion artefacts can produce
sample-specific biases, which can be confounded with biological effects
depending on how the barcode sequences were chosen. A barcode parameter
is provided to incorporate this information.
findStrandInvaders(object, distance = 1, barcode = NULL, linker = "TATAGGG")
removeStrandInvaders(object, distance = 1, barcode = NULL, linker = "TATAGGG")
## S4 method for signature 'CAGEexp'
findStrandInvaders(object, distance = 1, barcode = NULL, linker = "TATAGGG")
## S4 method for signature 'CAGEexp'
removeStrandInvaders(object, distance = 1, barcode = NULL, linker = "TATAGGG")
## S4 method for signature 'CTSS'
findStrandInvaders(object, distance = 1, barcode = NULL, linker = "TATAGGG")
## S4 method for signature 'CTSS'
removeStrandInvaders(object, distance = 1, barcode = NULL, linker = "TATAGGG")
object |
A |
distance |
The maximal edit distance between the genome and linker sequences. Regardless of this parameter, only a single mismatch is allowed in the last three bases of the linker. |
barcode |
A vector of sample barcode sequences, or the name of a column
metadata of the |
linker |
The sequence of the tail of the template-switching
oligonucleotide, which will be matched against the genome sequence
(defaults to "TATAGGG"). |
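The two matching criteria described for the distance argument can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name looks_like_strand_invader is hypothetical, and it assumes the caller has already extracted the genome sequence lying immediately upstream of the CTSS, with the same length as the linker.

```r
# Hypothetical helper, not part of CAGEr. It applies the two criteria stated
# for the `distance` argument to a single upstream genome sequence.
looks_like_strand_invader <- function(upstream, linker = "TATAGGG", distance = 1) {
  # Assumption: the upstream sequence has the same length as the linker.
  stopifnot(nchar(upstream) == nchar(linker))

  # Generalized Levenshtein (edit) distance between the two sequences.
  edit_dist <- as.integer(utils::adist(upstream, linker))

  # Number of mismatches within the last three bases of the linker.
  n <- nchar(linker)
  tail_upstream <- strsplit(substr(upstream, n - 2, n), "")[[1]]
  tail_linker   <- strsplit(substr(linker,   n - 2, n), "")[[1]]
  tail_mismatches <- sum(tail_upstream != tail_linker)

  # Both conditions must hold: overall edit distance within the limit,
  # and at most one mismatch in the last three bases.
  edit_dist <= distance && tail_mismatches <= 1
}

looks_like_strand_invader("TATAGGG")               # exact match
looks_like_strand_invader("AAAAGGG", distance = 2) # two mismatches, none in the tail
looks_like_strand_invader("TATACCG", distance = 2) # two mismatches in the tail
```

In this sketch, raising distance relaxes the match over the whole linker, while the check on the last three bases stays fixed, mirroring the wording of the argument description.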
findStrandInvaders returns a logical-Rle vector indicating the
position of the strand invaders in the input ranges.
With CTSS objects as input, removeStrandInvaders returns the
object after removing the CTSS positions identified as strand invaders.
In the case of CAGEexp objects, a modified object is returned. Its sample
metadata is also updated by creating a new strandInvaders column that
indicates the number of molecule counts removed. This value is subtracted
from the counts column so that the total number of tags is still equal to
librarySizes.
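The bookkeeping on the sample metadata can be sketched in base R with a toy data frame. This is only an illustration under stated assumptions: the function update_counts and all numbers are invented, the column names follow the text above, and the toy table assumes that counts is the only category contributing to librarySizes before removal.

```r
# Toy sample metadata; values are invented for illustration only.
sample_data <- data.frame(
  librarySizes = c(1000L, 2000L),
  counts       = c(1000L, 2000L),
  row.names    = c("sampleA", "sampleB")
)

# Hypothetical per-sample totals of molecule counts flagged as strand invaders.
removed <- c(sampleB = 10L, sampleA = 40L)

# Hypothetical helper, not part of CAGEr: records the removed counts in a new
# strandInvaders column and subtracts them from the counts column.
update_counts <- function(sample_data, removed) {
  removed <- removed[rownames(sample_data)]        # align by sample name
  sample_data$strandInvaders <- as.integer(removed)
  sample_data$counts <- sample_data$counts - sample_data$strandInvaders
  sample_data
}

updated <- update_counts(sample_data, removed)

# The categories still add up to the library size of each sample.
all(updated$counts + updated$strandInvaders == updated$librarySizes)
```

Keeping the removed counts in their own column, rather than discarding them, is what lets the per-sample totals remain consistent with librarySizes.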
Tang et al., “Suppression of artifacts and barcode bias in high-throughput transcriptome analyses utilizing template switching.” Nucleic Acids Res. 2013 Feb 1;41(3):e44. PubMed ID: 23180801, DOI: 10.1093/nar/gks112
# Note that these examples do not do much on the example data, since it was
# not produced with a protocol based on the template-switching method.
findStrandInvaders(exampleCAGEexp)
removeStrandInvaders(exampleCAGEexp)