normalizeChIPtoInput: Normalize ChIP-Seq Read Counts to Input and Test for...

Description Usage Arguments Details Value Author(s)


Normalize ChIP-Seq read counts to input control values, then test for significant enrichment relative to the control.


normalizeChIPtoInput(input, response, dispersion=0.01, niter=6, loss="p", plot=FALSE,
                     verbose=FALSE, ...)
calcNormOffsetsforChIP(input, response, dispersion=0.01, niter=6, loss="p", plot=FALSE,
                       verbose=FALSE, ...)



input

numeric vector of non-negative input values, not necessarily integer.


response

vector of non-negative integer counts of some ChIP-Seq mark for each gene or other genomic feature.


dispersion

negative binomial dispersion, must be positive.


niter

number of iterations.


loss

loss function to be used when fitting the response counts to the input: "p" for cumulative probabilities or "z" for z-value.


plot

if TRUE, a plot of the fit is produced.


verbose

if TRUE, working estimates from each iteration are output.


...

other arguments are passed to the plot function.


normalizeChIPtoInput identifies significant enrichment for a ChIP-Seq mark relative to input values. The ChIP-Seq mark might be, for example, transcription factor binding or an epigenetic mark. The function works on the data from one sample. Replicate libraries are not explicitly accounted for and would normally be pooled before using this function.

ChIP-Seq counts are assumed to be summarized by gene or similar genomic feature of interest.

The function assumes that a non-negligible proportion of the genes, say 25% or more, are not truly marked by the ChIP-Seq feature of interest. Unmarked genes are further assumed to have counts at a background level proportional to the input. The function aligns the counts to the input so that the counts for the unmarked genes behave like a random sample. It also estimates the proportion of marked genes and removes marked genes from the fitting process. For this purpose, marked genes are those with a Holm-adjusted mid-p-value less than 0.5.
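The iterative idea described above can be illustrated with a minimal base-R sketch. This is not edgeR's actual fitting procedure: the helper name fit_scale_sketch and the ratio-of-sums estimator for the scaling factor are assumptions made purely for illustration, standing in for the package's loss-based fit. Only the marking rule (Holm-adjusted mid-p-value less than 0.5) is taken from the text.

```r
# Hypothetical helper, NOT part of edgeR: a simplified illustration of
# "fit on unmarked genes, flag marked genes, refit".
fit_scale_sketch <- function(input, response, dispersion = 0.01, niter = 6) {
  size <- 1 / dispersion                    # R's negative binomial 'size'
  unmarked <- rep(TRUE, length(response))   # start with every gene unmarked
  for (iter in seq_len(niter)) {
    if (!any(unmarked)) stop("no unmarked genes left to fit")
    # Assumption: simple ratio-of-sums estimate of the scaling factor,
    # computed from the genes currently considered unmarked.
    scale <- sum(response[unmarked]) / sum(input[unmarked])
    mu <- scale * input
    # Mid-p-value for enrichment: P(Y > y) + 0.5 * P(Y = y)
    p <- pnbinom(response, mu = mu, size = size, lower.tail = FALSE) +
      0.5 * dnbinom(response, mu = mu, size = size)
    # Genes with Holm-adjusted mid-p-value below 0.5 are treated as marked
    unmarked <- p.adjust(p, method = "holm") >= 0.5
  }
  list(p = p, scale = scale, prop.marked = mean(!unmarked))
}

# Simulated data: 500 genes, the first 100 enriched 10-fold over background.
set.seed(1)
n <- 500
input <- runif(n, min = 10, max = 100)
fold <- ifelse(seq_len(n) <= 100, 10, 1)
response <- rnbinom(n, mu = 2 * input * fold, size = 100)
fit <- fit_scale_sketch(input, response)
fit$scale        # estimated background scaling factor
fit$prop.marked  # estimated proportion of marked genes
```

In this simulation the true background scaling factor is 2 and the true marked proportion is 0.2, so the two printed values give a rough check that removing marked genes from the fit pulls the estimate back towards the background level.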

The read counts are treated as negative binomial. The dispersion parameter is not estimated from the data; instead a reasonable value is assumed to be given.
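As a small illustration of what a fixed dispersion implies, the sketch below uses base R only. It assumes the usual parameterization in which a negative binomial count with mean mu and dispersion phi has variance mu + phi * mu^2, which corresponds to size = 1/phi in R's rnbinom. The mean of 50 is an arbitrary value chosen for the example.

```r
# Relationship between the dispersion and R's negative binomial 'size'.
dispersion <- 0.01        # default value in the usage above
mu <- 50                  # arbitrary mean chosen for illustration
size <- 1 / dispersion

set.seed(42)
y <- rnbinom(1e5, mu = mu, size = size)

# Compare the simulated variance with mu + dispersion * mu^2
variance_check <- c(empirical = var(y),
                    theoretical = mu + dispersion * mu^2)
variance_check
```

With a dispersion of 0.01 the extra-Poisson term is small for low counts and grows with the square of the mean, so the assumed value matters most for features with large counts.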

calcNormOffsetsforChIP returns a numeric matrix of offsets, ready for linear modelling.


normalizeChIPtoInput returns a list with components:


p.value

numeric vector of p-values for enrichment.


scaling.factor

factor by which input is scaled to align with response counts for unmarked genes.


prop.enriched

proportion of marked genes, as internally estimated.

calcNormOffsetsforChIP returns a numeric matrix of offsets.
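
A hedged usage sketch on simulated data follows. The simulated values (1000 genes, 200 of them enriched 8-fold, a background scaling of 3) are invented for illustration and do not come from the package documentation. The call uses only the arguments shown in the usage above, and the code is skipped if edgeR is not installed.

```r
# Usage sketch on simulated data; requires the edgeR package.
if (requireNamespace("edgeR", quietly = TRUE)) {
  set.seed(2024)
  ngenes <- 1000
  # Non-negative input values, not necessarily integer
  input <- runif(ngenes, min = 5, max = 50)
  # Assumption for the simulation: first 200 genes are truly marked
  enriched <- seq_len(ngenes) <= 200
  mu <- 3 * input * ifelse(enriched, 8, 1)
  # Integer ChIP-Seq counts with dispersion 0.01 (size = 1 / dispersion)
  response <- rnbinom(ngenes, mu = mu, size = 1 / 0.01)

  fit <- edgeR::normalizeChIPtoInput(input, response, dispersion = 0.01)
  str(fit)  # inspect the components of the returned list
}
```

Setting plot=TRUE or verbose=TRUE in the same call is a convenient way to check how well the unmarked genes align with the input.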


Gordon Smyth
