A correct background estimation is crucial for calling enrichment and
differences in ChIP-seq data. normR provides robust normalization and
difference calling for ChIP-seq and similar data. In brief, a binomial
mixture model with a given number of components is fit to read count data
from a treatment and a control experiment. The model is fit in log space via
Expectation Maximization implemented in C++, which improves computational
performance. The fit is considered converged once the change in model
log-likelihood falls below a threshold. After the model fit has converged, a
robust background estimate is obtained. This estimate accounts for the effect
of enrichment in certain regions and, therefore, represents an appropriate
null hypothesis. This robust background is used to identify regions that are
significantly enriched or depleted with respect to the control. Moreover, a
standardized enrichment for each bin is calculated based on the fitted
background component. For convenience, read count vectors can be obtained
directly from BAM files when a compliant chromosome annotation is given.
Please refer to the documentation of the individual functions for enrichment
calling (enrichR), difference calling (diffR) and enrichment regime calling
(regimeR).
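The fitting idea described above can be illustrated with a minimal base R sketch. It is not the package's C++ implementation: the simulated counts, the helper names (row_logsumexp, fit_binom_mixture), the starting values, and the use of Benjamini-Hochberg correction are all assumptions made for illustration. The sketch fits a two-component binomial mixture by Expectation Maximization in log space, stops when the log-likelihood gain drops below a threshold, and then tests every bin against the component with the smallest success probability, which is taken here as the background.

```r
# Simulated toy data (an assumption of this sketch, not real ChIP-seq counts):
# n = total reads per bin (treatment + control), s = treatment reads per bin.
set.seed(1)
n_bins <- 2000
n <- rpois(n_bins, lambda = 40) + 1
is_enriched <- runif(n_bins) < 0.1
s <- rbinom(n_bins, size = n, prob = ifelse(is_enriched, 0.8, 0.5))

# Numerically stable log(sum(exp(x))) for each row of a matrix.
row_logsumexp <- function(m) {
  mx <- apply(m, 1, max)
  mx + log(rowSums(exp(m - mx)))
}

# Hypothetical helper: EM for a k-component binomial mixture, in log space.
fit_binom_mixture <- function(s, n, k = 2, eps = 1e-5, max_iter = 500) {
  theta <- seq(0.3, 0.7, length.out = k)  # initial success probabilities
  prior <- rep(1 / k, k)                  # initial mixture proportions
  loglik_old <- -Inf
  for (iter in seq_len(max_iter)) {
    # E-step: log joint density of every bin under every component
    log_joint <- sapply(seq_len(k), function(j) {
      log(prior[j]) + dbinom(s, n, theta[j], log = TRUE)
    })
    log_norm <- row_logsumexp(log_joint)  # per-bin log-likelihood
    post <- exp(log_joint - log_norm)     # posteriors, n_bins x k
    loglik <- sum(log_norm)               # log-likelihood before this update

    # M-step: update proportions and success probabilities
    prior <- colMeans(post)
    theta <- colSums(post * s) / colSums(post * n)

    # Stop when the gain in log-likelihood is below the threshold
    if (loglik - loglik_old < eps) break
    loglik_old <- loglik
  }
  list(theta = theta, prior = prior, loglik = loglik, iterations = iter)
}

fit <- fit_binom_mixture(s, n)

# Background = component with the smallest theta (assumption of this sketch).
theta_bg <- min(fit$theta)

# One-sided binomial test of each bin against the background: P(X >= s).
pvals <- pbinom(s - 1, size = n, prob = theta_bg, lower.tail = FALSE)
# Benjamini-Hochberg is used here for simplicity; the package may differ.
qvals <- p.adjust(pvals, method = "BH")
called <- qvals < 0.05
```

Because the background probability is estimated jointly with the enriched component, enriched bins do not inflate it, which is the reason the fitted background can serve as the null hypothesis for the per-bin test.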
Available functions are

enrichR: Enrichment calling between treatment (e.g. ChIP-seq) and control
(e.g. Input).

diffR: Difference calling between treatment (e.g. ChIP-seq condition 1) and
control (e.g. ChIP-seq condition 2).

regimeR: Enrichment regime calling between treatment (e.g. ChIP-seq) and
control (e.g. Input) with a given number of model components. For example,
3 regimes recover background, broad and peak enrichment.
Computational performance is improved by fitting a log-space model in C++. Parallelization is achieved in C++ via OpenMP (http://openmp.org).
Johannes Helmuth helmuth@molgen.mpg.de
NormRFit-class for functions on accessing and exporting the normR fit.
NormRCountConfig-class for configuration of the read counting procedure
(binsize, mapping quality, ...).
require(GenomicRanges)
### enrichR(): Calling Enrichment over Input
#load some example bamfiles
input <- system.file("extdata", "K562_Input.bam", package="normr")
chipK4 <- system.file("extdata", "K562_H3K4me3.bam", package="normr")
#region to count in (example files contain information only in this region)
gr <- GRanges("chr1", IRanges(seq(22500001, 25000000, 1000), width = 1000))
#configure your counting strategy (see NormRCountConfig-class)
countConfiguration <- countConfigSingleEnd(binsize = 1000,
                                           mapq = 30, shift = 100)
#invoke enrichR to call enrichment
enrich <- enrichR(treatment = chipK4, control = input,
                  genome = gr, countConfig = countConfiguration,
                  iterations = 10, procs = 1, verbose = TRUE)
#inspect the fit
enrich
summary(enrich)
## Not run:
#write significant regions to bed
#exportR(enrich, filename = "enrich.bed", fdr = 0.01)
#write normalized enrichment to bigWig
#exportR(enrich, filename = "enrich.bw")
## End(Not run)
### diffR(): Calling differences between two conditions
chipK36 <- system.file("extdata", "K562_H3K36me3.bam", package="normr")
diff <- diffR(treatment = chipK36, control = chipK4,
              genome = gr, countConfig = countConfiguration,
              iterations = 10, procs = 1, verbose = TRUE)
summary(diff)
### regimeR(): Identification of broad and peak enrichment
regime <- regimeR(treatment = chipK36, control = input, models = 3,
                  genome = gr, countConfig = countConfiguration,
                  iterations = 10, procs = 1, verbose = TRUE)
summary(regime)