Description Usage Arguments Details Value Author(s) Examples
View source: R/polymeraseWave.R
The model is a three state hidden Markov model (HMM). States represent: (1) the 5' end of genes upstream of the transcription start site, (2) up-regulated sequence, and (3) the 3' end of the gene through the polyadenylation site.
1 2 3 4 | polymeraseWave(reads1, reads2, genes, approxDist, size = 50,
upstreamDist = 10000, TSmooth = NA, emissionDistAssumption = "gamma",
filterWindowSize = 10000, progress = TRUE, within = TRUE,
BPPARAM = BiocParallel::bpparam())
|
reads1 |
Mapped reads in time point 1. |
reads2 |
Mapped reads in time point 2. |
genes |
A set of genes in which to search for the wave. |
approxDist |
The approximate position of the wave. Suggest using 2000 [bp/ min] * time [min], for mammalian data. |
size |
The size of the moving window. Suggest using 50 for direct ligation data, and 200 for circular ligation data. Default: 50. |
upstreamDist |
The amount of upstream sequence to include Default: 10 kb. |
TSmooth |
Optionally, outlying windows are set a maximum value over the inter-quantile interval, specified by TSmooth. Reasonable value: 20; Default: NA (for no smoothing). Users are encouraged to use this parameter ONLY in combination with the normal distribution assumptions. |
emissionDistAssumption |
Takes values "norm", "normExp", and "gamma". Specifies the functional form of the 'emission' distribution for states I and II (i.e. 5' of the gene, and inside of the wave). In our experience, "gamma" works best for highly-variable 'spiky' data, and "norm" works for smooth data. As a general rule of thumb, "gamma" is used for libraries made using the direct ligation method, and "norm" for circular ligation data. Default: "gamma". |
filterWindowSize |
Method returns 'quality' information for each gene to which a wave was fit. Included in these metrics are several that define a moving window. The moving window size is specified by filterWindowSize. Default: 10 kb. |
progress |
Whether to show progress bar. Default: TRUE |
within |
Whether the wave must be within the gene. Default: TRUE |
BPPARAM |
Registered backend for BiocParallel. Default: BiocParallel::bpparam() |
The model computes differences in read counts between the two conditions. Differences are assumed fit a functional form which can be specified by the user (using the emissionDistAssumption argument). Currently supported functional forms include a normal distribution (good for GRO-seq data prepared using the circular ligation protocol), a gamma distribution (good for 'spiky' ligation based GRO-seq data), and a long-tailed normal+exponential distribution was implemented, but never deployed.
Initial parameter estimates are based on initial assumptions of transcription rates taken from the literature. Subsequently all parameters are fit using Baum-Welch expectation maximization.
Reference: Danko CG, Hah N, Luo X, Martins AL, Core L, Lis JT, Siepel A, Kraus WL. Signaling Pathways Differentially Affect RNA Polymerase II Initiation, Pausing, and Elongation Rate in Cells. Mol Cell. 2013 Mar 19. doi:pii: S1097-2765(13)00171-8. 10.1016/j.molcel.2013.02.015.
Returns list of GRanges with Pol II wave positions and any BiocParallel errors caught.
Charles G. Danko
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | library(GenomicAlignments)
library(BiocParallel)
genes <- GRanges("chr7", IRanges(2394474,2420377), strand="+",
SYMBOL="CYP2W1", ID="54905")
reads1 <- as(readGAlignments(system.file("extdata", "S0mR1.bam",
package="groHMM")), "GRanges")
reads2 <- as(readGAlignments(system.file("extdata", "S40mR1.bam",
package="groHMM")), "GRanges")
approxDist <- 2000*10
## Often, HMMs fails to converge or distributions fail to fit. Therefore
## don't stop on error. SerialParam is being used here for building, but
## MulticoreParam or SnowParam are of course more desirable in practice.
bpparam <- SerialParam(stop.on.error = FALSE)
bpresult <- polymeraseWave(reads1, reads2, genes, approxDist, BPPARAM=bpparam)
## Collect successful fits.
gr <- unlist(List(bpresult[bpok(bpresult)]))
gr
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.