Description Usage Arguments Details Value Author(s) References See Also Examples
Reads input CAGE datasets into CAGEset object, constructs CAGE transcriptions start sites (CTSSs) and counts number of CAGE tags supporting every CTSS in each input experiment. Preprocessing and quality filtering of input CAGE tags, as well as correction of CAGE-specific 'G' nucleotide addition bias can be also performed before constructing TSSs.
1 2 3 |
object |
A |
sequencingQualityThreshold, mappingQualityThreshold |
Only CAGE tags with average sequencing quality |
removeFirstG |
Logical, should the first nucleotide of the CAGE tag be removed in case it is a G and it does not map to the referent genome (i.e. it is a mismatch). Used only if |
correctSystematicG |
Logical, should the systematic correction of the first G nucleotide be performed for the positions where there is a G in the CAGE tag and G in the genome. This step is performed in addition to removing the first G of the CAGE tags when it is a mismatch, i.e. this option can only be used when |
In the CAGE experimental protocol an additional G nucleotide is often attached to the 5' end of the tag by the template-free activity of the reverse transcriptase used to prepare cDNA (Harbers and Carninci, Nature Methods 2005). In cases where there is a G at the 5' end of the CAGE tag that does not map to the corresponding genome sequence, it can confidently be considered spurious and should be removed from the tag to avoid misannotating actual TSS. Thus, setting removeFirstG = TRUE
is highly recommended.
However, when there is a G both at the beginning of the CAGE tag and in the genome, it is not clear whether the original CAGE tag really starts at this position or the G nucleotide was added later in the experimental protocol. To systematically correct CAGE tags mapping at such positions, a general frequency of adding a G to CAGE tags can be calculated from mismatch cases and applied to estimate the number of CAGE tags that have G added and should actually start at the next nucleotide/position. The option correctSystematicG
is an implementation of the correction algorithm described in Carninci et al., Nature Genetics 2006, Supplementary Information section 3-e.
The slots librarySizes
, CTSScoordinates
and tagCountMatrix
of the provided CAGEset
object will be occupied by the information on CTSSs created from input CAGE files.
Vanja Haberle
Harbers and Carninci (2005) Tag-based approaches for transcriptome research and genome annotation, Nature Methods 2(7):495-502.
Carninci et al. (2006) Genome-wide analysis of mammalian promoter architecture and evolution, Nature Genetics 38(7):626-635.
1 2 3 4 5 6 7 8 9 10 | library(BSgenome.Drerio.UCSC.danRer7)
pathsToInputFiles <- system.file("extdata", c("Zf.unfertilized.egg.chr17.ctss",
"Zf.30p.dome.chr17.ctss", "Zf.prim6.rep1.chr17.ctss"), package="CAGEr")
labels <- paste("sample", seq(1,3,1), sep = "")
myCAGEset <- new("CAGEset", genomeName = "BSgenome.Drerio.UCSC.danRer7",
inputFiles = pathsToInputFiles, inputFilesType = "ctss", sampleLabels = labels)
getCTSS(myCAGEset)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.